Dangers of the World Wide Web: Cross-Protocol Attacks
Copyright (C) 2002-2003 Jeff Connelly
2002-12-18 - 2003-01-05

Preliminary Outline
(note: the content does not match the outline just yet)

1. World Wide Web Browser: The Vehicle of Attack
2. Linking: links *will* be followed
   2.1. Types of Links
   2.2. Obfuscating Links
3. General "Web" Vulnerabilities
   3.1. DDoS - deny data
   3.2. AFSC - submit data
   3.3. Caches - store/retrieve data
   3.4. XSS - steal data
4. HTTP Request Structure
   4.1. Headers
   4.2. GET
   4.3. POST
   4.4. PUT
   4.5. HEAD
   4.6. TRACE
5. Gopher Request Structure (!!!)
6. FTP Login Sessions
7. Crossing The Protocols
   7.1. HTTP-IRC
   7.2. HTTP-Telnet
   7.3. Gopher-anything
A. The Politics of Linking

NOTE: Cross-Protocol Scripting exists: kb.cert.org/vulns/id/476267,
search for HTTP Form Protocol Attack. Already known.

Before HTTP, there was Usenet. When people talked about "The Internet"
in common speech, it was understood that they were referring to
Usenet [1]. Nowadays, "The Internet" has become the World Wide Web as
HTTP has become more prevalent. With HTTP came a novel idea:
server-to-server "hyperlinks" embedded within documents identified as
"hypertext". Thus the Hypertext Transfer Protocol [2] was born, and on
top of it, the World Wide Web.

Web Links
---------

Following hyperlinks (hereinafter links) is of course voluntary, but it
is required to experience the full essence of the World Wide Web. For
this reason, it can be assumed that once a link is planted and
advertised, it will be religiously followed by curious users, search
engine spiders, and (previously useless) spam harvesters as well.

Following a link causes the user agent to connect to, and send data to,
whatever host on the Internet the link names. This document details how
to exploit that behavior.

Spammers have caught on to a similar method, according to
http://www.kuro5hin.org/comments/2004/4/9/232340/1998/2#2 :

"OK, the Referer spam works like this: A [...]
site will embed a tiny, 1x1 pixel IMG tag with your front page listed
as the SRC attribute. So when people hit the [...] site, you magically
get hits from many different IP addresses with the [...] referer. A lot
of the sites rotate the image URL used on every click, so that their
address is spammed far and wide (you can check by visiting the links
from your log files if you like)."

Types of Links
--------------

1. Hyperlink (anchor element)
   Hit if:
   * manually followed by user
   * spidered by robot

2. Image
   Hit if:
   * user agent has images enabled (likely)
   * spidered by an image robot

3. Hidden Frame
   Hit if:
   * user agent is frames-capable (likely)
   Note: might not be spidered by unintelligent robots

Other URL referencing tags:

tag name   human user                            computer visitor
--------   -----------------------------------   -------------------
a          clicked                               spidered for text
applet     Java-enabled                          ---
img        image-enabled                         spidered for images
frame      frames-enabled                        spidered for text
embed      embed-enabled                         ---
form       submitted, or JS-enabled autosubmit
...will submit a POST similar to the following:

    POST /path HTTP/1.1
    Header-N: Value-N

    name1=value1    (url-encoded as necessary)

This isn't looking any more useful than GET. The real fun begins when
enctype="multipart/form-data" is used (see RFC 1867). Although intended
for form-based file upload, the multipart/form-data MIME type works
with all fields in all current browsers. To quote RFC 1867's example:

    Content-type: multipart/form-data, boundary=AaB03x

    --AaB03x    [this is an arbitrary delimiter picked by the browser]
    content-disposition: form-data; name="field1"

    Joe Blow    [this is user-defined data]
    --AaB03x
    content-disposition: form-data; name="pics"; filename="file1.txt"
    Content-Type: text/plain

    ... contents of file1.txt ...
    --AaB03x--

The real strength lies in the fact that file1.txt is sent in binary
mode, with no messy URL-encoding schemes or base64 encoding. Once the
HTTP headers are sent, it's fair game -- we can send *any* byte we
wish. Protocols which ignore or gracefully reject invalid commands are
particularly exploitable, specifically line-based protocols.

The HTTP "PUT" Method
---------------------

There exists a little-known "PUT" method in the HTTP specification,
which allows for uploading of data like this:

    PUT /path HTTP/1.1

PUT is much simpler than POST, but less accessible: it is harder to get
a browser to issue one. Peter A. Bromberg, Ph.D. [4] has written the
following Visual Basic script to send a PUT from Internet Explorer:

---begin code--
Type Binary (Uncheck for Type Text)
--end code--

The "TRACE" Request
-------------------

As discussed on BugTraq. Microsoft.XMLHTTP. WhiteHat Security.

    http://archives.neohapsis.com/archives/vulnwatch/2003-q1/0034.html
    http://www.betanews.com/whitehat/WH-WhitePaper_XST_ebook.pdf
    http://www.cgisecurity.com/whitehat-mirror/WhitePaper_screen.pdf

Cross-Site Tracing. Defeats Microsoft's ;httpOnly cookie XSS protection.

--start file--
--end file--

Servers receiving TRACE return what was sent by the client, including
cookie and authorization headers. Even over SSL. Works only with XSS.

Mozilla XMLDOM object scripting? Other ActiveX controls of HTTP?

Gopher Request Structure
------------------------

Gopher is trivially exploitable. A request for, say,

    gopher://example.com/0foobar

will connect to example.com, port 70, and send:

    foobar

The user agent will wait for a "." to be sent by the server, then close
the connection. The "0" specifies how the browser shall display the
received data; "9" selects binary mode, in which a "." on a line by
itself does not end the transfer.

TODO: Embedding HTML documents?

FTP Login Structure
-------------------

FTP is too complicated to use as an attack vector. Maybe not. The data
connection could be anything.

Passive FTP: client connects to open port on server, server chooses
             port (PASV)
Active FTP:  server connects to open port on client, client chooses
             port (PORT)

TODO: SMTP (find original "SMTP is like HTTP and can be used to send
emails")

HTTP-HTTP Attacks
-----------------

Before delving into cross-protocol attacks it may be instructive to
outline some (perhaps obvious, but real) vulnerabilities with the
protocol itself.

1) DDoS - deny access to data
2) AFSC - submit data
3) Caches - store/retrieve data
4) XSS - steal data
5) AFSC+XSS - web proxy (flow data)

1) Stupid DDoS Attack

In late October, Jim Hunt reported the following to the SecurityFocus
Incidents mailing list [5]:

    I have a friend that has a DOS Attack going on against their website.
    It is being done by someone with a very popular website trying to
    squash a little guy. He is doing it be placing 1 pixel by 1 pixel
    inline frames in his webpages and having them load my friends
    webpage. It is killing his server and bandwidth. What can we do to
    block? The Server is W2K with IIS. Thanks!

The attack is obvious: on a high-traffic site, place many inline frames
that point at the victim's pages. An Apache module, mod_dosevasive, has
been written to limit repeated requests for the same file from the same
host in a certain amount of time [6]. Slashdot user "krappie" #172561
has posted some other solutions:

    I work as tech support for a webhosting company. I see things like
    this all the time. People tend to think its impossible to block
    because its not from any one specific ip address, but the requests
    are coming from all over. People need to learn the awesome power of
    mod_rewrite.

        RewriteEngine on
        RewriteCond %{HTTP_REFERER} ^http://(.+\.)*bigguysite.com/ [NC]
        RewriteRule /* - [F]

    I've also seen people who had bad domain names pointed at their
    ips, where you can check the HTTP_HOST. I've seen recursive
    download programs totally crush webservers, mod_rewrite can check
    the HTTP_USER_AGENT for that. Of course, download programs could
    always change the specified user agent, which is I guess where this
    apache module could come in handy. Good idea..

Since our attack vehicle is an ordinary web browser, filtering on the
User-Agent header does not apply. Referer [sic] is relevant, but
blocking one value only blocks one instance of the redirecting page.
Given the prevalence of free web hosts [freewebspace.net], referrer
blocking has limited effectiveness.

Defeating The Referer: use immediate data.

    In Mozilla:        data:text/html,
    In MSIE pre-SP1:   about:

Beginning with IE6SP1, the 'about' URLs no longer work, so don't even
try it. Note that the "data" scheme is an IETF standard, and more
flexible. Either way, the referrer value is meaningless in the above
situations.
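
The per-host limiting attributed to mod_dosevasive above ("repeated requests for the same file from the same host in a certain amount of time") can be illustrated with a small sliding-window counter. This is only a sketch of the general idea, not the module's actual algorithm; the class name, thresholds and method names are all made up for illustration.

```python
import time
from collections import defaultdict, deque


class RequestLimiter:
    """Hypothetical sliding-window limiter keyed on (host, path)."""

    def __init__(self, max_hits=5, window=1.0):
        self.max_hits = max_hits          # assumed threshold, not from the module
        self.window = window              # window length in seconds
        self.hits = defaultdict(deque)    # (host, path) -> recent timestamps

    def allow(self, host, path, now=None):
        """Return True if this request should be served, False if refused."""
        now = time.monotonic() if now is None else now
        recent = self.hits[(host, path)]
        # Forget hits that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_hits:
            return False
        recent.append(now)
        return True


limiter = RequestLimiter(max_hits=2, window=1.0)
print(limiter.allow("10.0.0.2", "/", now=0.0))   # True
print(limiter.allow("10.0.0.2", "/", now=0.1))   # True
print(limiter.allow("10.0.0.2", "/", now=0.2))   # False
```

As the quoted Slashdot comment points out, the inline-frame attack arrives from many different addresses, so a per-host counter like this only catches visitors who reload the attacking page repeatedly.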

2) Inconspicuous Form Submission - AFSC

Web forms have plenty of real-world applications: posting messages on a
forum, signing guestbooks, and so on. Form submissions, whether GET or
POST, can be triggered automatically by the click of a link. Forms may
submit to message boards or guestbooks quite anonymously from the
perspective of whoever set up the redirection page.

The Anonymous Form Submission Center (AFSC) is one such realization of
the power of redirecting HTTP to arbitrary GET requests or
JavaScript-invoked POST submissions; we believe AFSC to be the first
technology to intelligently utilize the bandwidth of unwanted spam
harvesters.

DEFENSE 1: Use POST instead of GET, so the attacker cannot simply use a
link. However, a web form with automatic submission will still work if
JavaScript is enabled.

DEFENSE 2 (best): This is what Slashdot does. The form carries a
FORMKEY field, randomly generated by the server; the server checks its
validity upon submission. Submission is still possible if XSS
vulnerabilities exist in the browser or site, however.

3) Web Caches (Squid, etc.)

Store your data by sending documents with very distant cache expiry
dates, then retrieve them later. Not covered in this document, but
interested readers may also like to know about using DNS to distribute
data; Google for "distributing DeCSS via DNS" [decss.zoy.org].

Here's how you can distribute your data via caching proxies
[tools.rosinstrument.com/proxy/]: send an Expires header dated one year
ahead, so the cache will hold the document as long as possible (the
HTTP/1.1 RFC discourages dates further out than that). Then, pass out
the following information:

a) Proxy host
b) Proxy port
c) Temporary host(+port)
d) Temporary pathname

The temporary host only needs to be up for one transfer of the pathname
through the proxy, after which all requests through the proxy will be
answered from the cache and not forwarded. The temporary host should
never be contacted except through the proxy.
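
For illustration, a one-year Expires header of the kind described above could be produced as follows. This is a minimal sketch using only the Python standard library; the function name and the 365-day default are assumptions for the example, not part of any cache's or server's API.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expires_header(now=None, days=365):
    """Build an Expires header line dated `days` days after `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # usegmt=True emits the "GMT" suffix that HTTP dates use; it requires
    # an aware datetime with a zero UTC offset.
    return "Expires: " + format_datetime(now + timedelta(days=days), usegmt=True)


start = datetime(2003, 1, 5, tzinfo=timezone.utc)
print(expires_header(start))   # Expires: Mon, 05 Jan 2004 00:00:00 GMT
```

Whether a given proxy actually keeps the object that long depends on its own configuration and available storage, so the header is best read as a request rather than a guarantee.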

Perhaps a scheme for encoding proxy/temporary hosts could be as
follows:

    chttp://cachehost:cacheport/temp_path/temp_host

Since the temporary host is not important, it may be obfuscated (ROT13,
reversed, or some substitution cipher) to prevent accidental
connection. Many cache hosts/ports could be packed in binary and
encoded with base64 for added redundancy; additionally, the temporary
host (dynamic DNS) could be deleted or rerouted to an invalid or
alternative address before publishing the access information.
Anonymity. (Warning: DNS caches.)

4) Cross-Site Scripting (XSS)

XSS is "unforeseeable interaction" between components of a web browser;
more specifically, scripting is used to steal cookies, user data, or
other content.

Unpatched IE vulnerabilities: http://www.pivx.com/larholm/unpatched/
Very nice. TODO: make a list of vulnerabilities and the patches that
fix them, to help identify them.

5) XSS+AFSC = Web Proxy

Normally AFSC is only one way; when "data submission" is combined with
"data stealing", the results of the submission can often be obtained,
thus allowing for a full-on web proxy.

Example HTTP Requests
---------------------

Lines sent by each browser (before user-defined submission):

    Browser          header lines   + request line   + blank line
    Galeon 1.2.6          8               9               10
    Mozilla 5.0           8               9               10
    MSIE 6.0 on XP        6               7                8
    Links 2.1             6               7                8
    Lynx 2.8.4.1          5               6                7

    * add 1 to count the HTTP request line
    * add 1 to count the blank line after the request

When cross-HTTP attacking some services, the number of lines that come
before the POST or PUT data submission is important. JavaScript can be
used to detect the user-agent and forward the zombie to the appropriate
page.
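
The line counts above can be checked mechanically. A minimal sketch, assuming the request has been captured as raw bytes with CRLF line endings; the function name is made up, and the sample is the MSIE 6.0 request from this document:

```python
def lines_before_body(raw_request: bytes) -> int:
    """Count the request line, the header lines and the terminating blank line."""
    head, separator, _body = raw_request.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no blank line terminating the headers")
    # head holds the request line plus headers; add 1 for the blank line.
    return len(head.split(b"\r\n")) + 1


sample = (
    b"GET / HTTP/1.1\r\n"
    b"Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, "
    b"application/vnd.ms-excel, application/vnd.ms-powerpoint, "
    b"application/msword, */*\r\n"
    b"Accept-Language: en-us\r\n"
    b"Accept-Encoding: gzip, deflate\r\n"
    b"User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)\r\n"
    b"Host: 10.0.0.2:10000\r\n"
    b"Connection: Keep-Alive\r\n"
    b"\r\n"
)
print(lines_before_body(sample))   # 8
```

The result of 8 corresponds to the last column of the MSIE row: six header lines, plus the request line, plus the blank line.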

[Mozilla 5.0 under Galeon 1.2.6 on FreeBSD]
=1= GET / HTTP/1.1
=2= Host: localhost:7777
=3= User-Agent: Mozilla/5.0 Galeon/1.2.6 (X11; FreeBSD i386; U;) Gecko/0
=4= Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
=5= Accept-Language: en
=6= Accept-Encoding: gzip, deflate, compress;q=0.9
=7= Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
=8= Keep-Alive: 300
=9= Connection: keep-alive

[MSIE 6.0 on Windows XP]
=1= GET / HTTP/1.1
=2= Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
=3= Accept-Language: en-us
=4= Accept-Encoding: gzip, deflate
=5= User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
=6= Host: 10.0.0.2:10000
=7= Connection: Keep-Alive

[Links 2.1pre2 under FreeBSD]
=1= GET / HTTP/1.1
=2= Host: localhost:7777
=3= User-Agent: Links (2.1pre2; FreeBSD 4.7-STABLE i386; 80x24)
=4= Accept: */*
=6= Accept-Charset: us-ascii, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16, windows-1250, windows-1251, windows-1252, windows-1256, windows-1257, cp437, cp737, cp850, cp852, cp866, x-cp866-u, x-mac, x-mac-ce, x-kam-cs, koi8-r, koi8-u, TCVN-5712, VISCII, utf-8
=7= Connection: Keep-Alive

[Lynx 2.8.4rel.1 under FreeBSD]
=1= GET / HTTP/1.0
=2= Host: localhost:7777
=3= Accept: text/html, text/plain, text/sgml, video/mpeg, image/jpeg, image/tiff, image/x-rgb, image/png, image/x-xbitmap, image/x-xbm, image/gif, application/postscript, */*;q=0.01
=4= Accept-Encoding: gzip, compress
=5= Accept-Language: en
=6= User-Agent: Lynx/2.8.4rel.1 libwww-FM/2.14

[Mozilla 5.0 under FreeBSD]
=1= GET / HTTP/1.1
=2= Host: localhost:7777
=3= User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.1) Gecko/20021205
=4= Accept:
text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
=5= Accept-Language: en-us, en;q=0.50
=6= Accept-Encoding: gzip, deflate, compress;q=0.9
=7= Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
=8= Keep-Alive: 300
=9= Connection: keep-alive

Cross-Protocol Vulnerabilities: An Introduction
-----------------------------------------------

Now that HTTP has been thoroughly explained, we shall see what we can
do with it.

The essence of cross-protocol attacking is to instruct the client (by
means of a link) to connect to a different service on a host and send
it an HTTP request, which the other service will interpret in its own
way. In other words, the idea is to construct a polyglot: one
communication which makes sense in two languages (usually HTTP and
another protocol).

URLs allow for port number specification. For example, the URL
http://localhost:23/ would connect to the (obsolete) Telnet remote
login service on the local machine. Under the control of a black hat,
the client would most likely be told to brute-force a login password.

Connecting to port 23 in a web browser is so dangerous that Mozilla
restricts certain port numbers:

    Mozilla, gopher: 1 7 9 11 13 15 17 19 20 21 22 23 25

Internet Explorer, in my testing, has not restricted anything.

However, even with Mozilla's blocks there are interesting services
running on higher port numbers. These cannot be justifiably blocked, as
HTTP proxies and even some web servers run on high ports such as 8080.
Blocking all ports but 80 in web browsers would severely limit the
flexibility of user agents.

HTTP-IRC Attacks, or: Internet Relay Chat, a Line-Based Protocol
----------------------------------------------------------------

IRC [7] is a possible avenue of attack. Like HTTP, it is line-based.
IRC servers tend to run on ports above 1024, commonly in the vicinity
of 6667 although some run as high as 8080 or 31337, so a browser will
have no qualms about connecting to them. If an IRC server receives
nonsense, it will ignore it and take no action.

The first few lines of an HTTP POST are gibberish when interpreted as
IRC, but IRC servers blissfully ignore them and continue until the body
of the form is encountered. Yup, any IRC commands can be placed in the
form body. The mandatory USER and NICK are entered, followed by JOIN or
PRIVMSG commands, perhaps advertising the URL of the page which carries
out this attack to create a viral method of distribution.

When an IRC server is sent data, the server's response will often be
displayed in the web browser as plain text, which can't contain
commands. However, the HTML specification requires the first tag in a
hypertext document be <html>; non-HTML content may precede and follow
it. This means all we need is an <html> somewhere in the data stream
received by the server; the server needs to send back <html>. The NICK
command is appropriate for this purpose:

    Client: NICK <html>
    Server: XXX <html> Invalid nickname

Hence, <html> has been received from the server, and what follows may
contain HTML tags which let us control the browser at our whim. Text
received from PRIVMSGs sent in channels and privately will be
interpreted as HTML, allowing for [...]

    [...] Ok! RETR /pub/unix/MD5.tar.Z.asc QUIT

Here, an FTP server with certain qualities had to be discovered. See
ftp-js-thread.txt for more details until I put it here.

Gopher
------

Gopher runs on port 70 and has the protocol scheme gopher:. Despite not
being commonly used for years, it is supported by most popular
browsers. IE, however, has had a bug discovered in its Gopher support,
and rather than fixing it Microsoft encourages users to disable Gopher
support. So exploitable hosts are limited, but they are there.

    gopher://host:port/name

This is the easiest scheme to use for sending arbitrary data.
Given the URL above, with the item type (a one-character numeral or
letter placed directly before "name", like the "0" in
gopher://example.com/0foobar), "name" will be sent to "host".
URL-decoded. So arbitrary bytes can be sent to any host. Shocking.

Item type characters are defined in RFC 1436:

    0   Item is a file
    1   Item is a directory
    2   Item is a CSO phone-book server
    3   Error
    4   Item is a BinHexed Macintosh file.
    5   Item is DOS binary archive of some sort.
        Client must read until the TCP connection closes. Beware.
    6   Item is a UNIX uuencoded file.
    7   Item is an Index-Search server.
    8   Item points to a text-based telnet session.
    9   Item is a binary file!
        Client must read until the TCP connection closes. Beware.
    +   Item is a redundant server
    T   Item points to a text-based tn3270 session.
    g   Item is a GIF format graphics file.
    I   Item is some kind of image file. Client decides how to display.

Normally, the server ends its response with a "." on a line by itself
at the end of the file, but not with the binary item types 5 and 9.
Type 0 is just a regular text file; the client infers the MIME type
from the file extension--a .html extension will allow for