Dangers of the World Wide Web: Cross-Protocol Attacks
Copyright (C) 2002-2003 Jeff Connelly
2002-12-18 - 2003-01-05
Preliminary Outline
(note: the content does not match the outline just yet)
1. World Wide Web Browser: The Vehicle of Attack
2. Linking: links *will* be followed
2.1. Types of Links
2.2. Obfuscating Links
3. General "Web" Vulnerabilities
3.1. DDoS - deny data
3.2. AFSC - submit data
3.3. Caches - store/retrieve data
3.4. XSS - steal data
4. HTTP Request Structure
4.1. Headers
4.2. GET
4.3. POST
4.4. PUT
4.5. HEAD
4.6. TRACE
5. Gopher Request Structure (!!!)
6. FTP Login Sessions
7. Crossing The Protocols
7.1. HTTP-IRC
7.2. HTTP-Telnet
7.3. Gopher-anything
A. The Politics of Linking
NOTE: Cross-protocol scripting is already known: see
kb.cert.org/vulns/id/476267, or search for "HTTP Form Protocol Attack".
Before HTTP, there was Usenet. When people talked about "The Internet" in
common speech, they were understood to be referring to Usenet [1]. Nowadays,
"The Internet" has become the World Wide Web as HTTP has become more
prevalent.
With HTTP came a novel idea: server-to-server "hyperlinks" embedded within
documents identified as "hypertext". Thus the Hypertext Transfer Protocol [2]
was born, and on top of it, the World Wide Web.
Web Links
---------
Following a hyperlink (hereinafter link) is of course voluntary, but
definitely required to experience the full essence of the World Wide Web. For
this reason, it can be assumed that once a link is planted and advertised, it
will be religiously followed by curious users, search engine spiders, and
(previously useless) spam harvesters as well.
When a user agent follows a link, it connects to and sends data to a host
chosen by the link's author, which may be any host on the Internet. This
document details how to exploit that behavior.
Spammers have caught on to a similar method, according to
http://www.kuro5hin.org/comments/2004/4/9/232340/1998/2#2 :
"OK, the Referer spam works like this: A [...] site will embed a tiny,
1x1 pixel IMG tag with your front page listed as the SRC attribute. So when
people hit the [...] site, you magically get hits from many different IP
addresses with the [...] referer.
A lot of the sites rotate the image URL used on every click, so that their
address is spammed far and wide (you can check by visiting the links from your
log files if you like)."
Types of Links
--------------
1. Hyperlink: A (anchor) element
   Hit if:
   * manually followed by user
   * spidered by robot
2. Image: IMG element
   Hit if:
   * user agent has images enabled (likely)
   * spidered by an image robot
3. Hidden Frame: FRAME or IFRAME element
   Hit if:
   * user agent is frames-capable (likely)
   Note: might not be spidered by unintelligent robots
Other URL referencing tags:
tag name   human user        computer visitor
A          clicked           spidered for text
APPLET     Java-enabled      ---
IMG        image-enabled     spidered for images
FRAME      frames-enabled    spidered for text
EMBED      embed-enabled     ---
...will submit a POST similar to the following:
POST /path HTTP/1.1
Header-N: Value-N

name1=value1 (url-encoded as necessary)
This isn't looking any more useful than GET. The real fun begins when
an enctype="multipart/form-data" is used (see RFC 1867). Although
intended for form-based file-upload, the multipart/form-data MIME type
works with all fields in all current browsers. Quoteth RFC 1867's example:
Content-type: multipart/form-data, boundary=AaB03x
--AaB03x [this is an arbitrary delimiter picked by the browser]
content-disposition: form-data; name="field1"
Joe Blow [this is user-defined data]
--AaB03x
content-disposition: form-data; name="pics"; filename="file1.txt"
Content-Type: text/plain
... contents of file1.txt ...
--AaB03x--
The real strength lies in the fact that file1.txt is sent in binary mode,
with no messy URL-encoding schemes or base64 encoding. Once the HTTP headers
are sent, it's fair game -- we can send *any* byte we wish. Protocols which
ignore or gracefully reject invalid commands are particularly exploitable,
specifically line-based protocols.
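To make the point about raw bytes concrete, here is a minimal Python sketch
of how such a body is laid out. It only builds the bytes and sends nothing.
The function name build_multipart is hypothetical, and the header uses the
modern "; boundary=" form rather than the comma shown in the RFC 1867 example.

```python
def build_multipart(fields, files, boundary="AaB03x"):
    """Build a multipart/form-data body (illustrative sketch only).

    fields: dict mapping field name -> text value
    files:  dict mapping field name -> (filename, content_type, raw_bytes)
    Returns (content_type_header_value, body_bytes).
    """
    delim = b"--" + boundary.encode("ascii")
    lines = []
    for name, value in fields.items():
        lines += [
            delim,
            b'Content-Disposition: form-data; name="' + name.encode("ascii") + b'"',
            b"",
            value.encode("utf-8"),
        ]
    for name, (filename, content_type, raw) in files.items():
        lines += [
            delim,
            b'Content-Disposition: form-data; name="' + name.encode("ascii")
            + b'"; filename="' + filename.encode("ascii") + b'"',
            b"Content-Type: " + content_type.encode("ascii"),
            b"",
            raw,  # inserted verbatim: no URL-encoding, no base64
        ]
    lines += [delim + b"--", b""]  # closing delimiter, then final CRLF
    return "multipart/form-data; boundary=" + boundary, b"\r\n".join(lines)


content_type, body = build_multipart(
    {"field1": "Joe Blow"},
    {"pics": ("file1.txt", "text/plain", b"\x00\x01\xff any byte at all")},
)
```

Note that the file part lands in the body exactly as given, which is why the
receiving end sees whatever bytes the form author chose.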
The HTTP "PUT" Method
---------------------
There exists a little-known "PUT" method in the HTTP specification, which
allows for uploading of data like this:
PUT /path HTTP/1.1
PUT is much simpler than POST, but less accessible: it is more difficult to
make a browser issue the request. Peter A. Bromberg, Ph.D. [4] has written the
following Visual Basic script to send a PUT from Internet Explorer:
---begin code--
--end code--
The "TRACE" Request
-------------------
As discussed on BugTraq. Microsoft.XMLHTTP. WhiteHat Security.
http://archives.neohapsis.com/archives/vulnwatch/2003-q1/0034.html
http://www.betanews.com/whitehat/WH-WhitePaper_XST_ebook.pdf
http://www.cgisecurity.com/whitehat-mirror/WhitePaper_screen.pdf
Cross-Site Tracing. Defeats Microsoft's ;httpOnly cookie XSS protection.
--start file--
--end file--
Servers receiving TRACE return what was sent by the client, including cookie
and authorization headers, even over SSL. The attack works only in
combination with XSS.
Mozilla XMLDOM object scripting? Other ActiveX controls of HTTP?
Gopher Request Structure
------------------------
Gopher is trivially exploitable. A request, for example
gopher://example.com/0foobar will connect to example.com, port 70 and send:
foobar
The user agent will wait for the server to send a "." on a line by itself,
then close the connection. The "0" is the item type, which specifies how the
browser shall display the received data; "9" is for binary mode, where a "."
on a line by itself does not end the transfer and the client reads until the
server closes the connection.
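As a rough illustration of the mapping from URL to bytes on the wire, the
following Python sketch splits a gopher URL into host, port, item type and the
line a client would send. It parses only and opens no connection. The function
name gopher_request is hypothetical; the trailing CRLF and the percent-decoding
of the selector follow the usual Gopher and URL rules rather than anything
stated above.

```python
from urllib.parse import urlsplit, unquote_to_bytes


def gopher_request(url):
    """Return (host, port, item_type, bytes_sent) for a gopher URL."""
    parts = urlsplit(url)
    if parts.scheme != "gopher":
        raise ValueError("not a gopher URL: %r" % url)
    port = parts.port or 70          # 70 is the default Gopher port
    path = parts.path[1:]            # drop the leading "/"
    item_type = path[:1] or "1"      # assumption: empty path means a directory
    selector = unquote_to_bytes(path[1:])  # %XX escapes become raw bytes
    return parts.hostname, port, item_type, selector + b"\r\n"


print(gopher_request("gopher://example.com/0foobar"))
```

For the example URL this yields host example.com, port 70, item type "0" and
the line "foobar", matching the description above.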
TODO: Embedding HTML documents?
FTP Login Structure
-------------------
FTP is too complicated to use as an attack vector. Maybe not. The data
connection could be anything.
Passive FTP: client connects to open port on server, server chooses port (PASV)
Active FTP: server connects to open port on client, client chooses port (PORT)
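A small Python sketch of how the data-connection port is communicated in each
mode. The six-number format (four address bytes, then the port as high and low
byte) comes from the FTP specification, not from the notes above; the function
names are hypothetical.

```python
import re


def parse_pasv(reply):
    """Extract (host, port) from a 227 reply to PASV (server chooses the port)."""
    m = re.search(r"\((\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\)", reply)
    if m is None:
        raise ValueError("no address found in reply: %r" % reply)
    h1, h2, h3, h4, p1, p2 = (int(g) for g in m.groups())
    return "%d.%d.%d.%d" % (h1, h2, h3, h4), p1 * 256 + p2


def port_argument(host, port):
    """Build the argument of a PORT command (client chooses the port)."""
    return ",".join(host.split(".") + [str(port // 256), str(port % 256)])


print(parse_pasv("227 Entering Passive Mode (192,168,1,2,19,137)"))
print("PORT " + port_argument("10.0.0.2", 5001))
```

Either way, the address and port of the data connection are just text inside
the control connection, which is what makes "the data connection could be
anything" plausible.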
TODO: SMTP (find original "SMTP is like HTTP and can be used to send emails")
HTTP-HTTP Attacks
-----------------
Before delving into cross-protocol attacks it may be instructive to outline
some (perhaps obvious, but real) vulnerabilities with the protocol itself.
1) DDoS - deny access to data
2) AFSC - submit data
3) Caches - store/retrieve data
4) XSS - steal data
5) AFSC+XSS - web proxy (flow data)
1) Stupid DDoS Attack
In late October, Jim Hunt reported the following to the SecurityFocus Incidents
mailing list [5]:
I have a friend that has a DOS Attack going on against their website. It is
being done by someone with a very popular website trying to squash a
little guy. He is doing it be placing 1 pixel by 1 pixel inline frames in
his webpages and having them load my friends webpage. It is killing his
server and bandwidth.
What can we do to block? The Server is W2K with IIS.
Thanks!
The attack is obvious: on a high-traffic site, place many inline frames that
load the victim's page. An Apache module, mod_dosevasive, has been written to
limit repeated requests for the same file from the same host within a certain
amount of time [6]. Slashdot user "krappie" #172561 has posted some other
solutions:
I work as tech support for a webhosting company. I see things like this
all the time. People tend to think its impossible to block because its
not from any one specific ip address, but the requests are coming from
all over. People need to learn the awesome power of mod_rewrite.
RewriteEngine on
RewriteCond %{HTTP_REFERER} ^http://(.+\.)*bigguysite.com/ [NC]
RewriteRule /* - [F]
I've also seen people who had bad domain names pointed at their ips, where
you can check the HTTP_HOST. I've seen recursive download programs totally
crush webservers, mod_rewrite can check the HTTP_USER_AGENT for that.
Of course, download programs could always change the specified user agent,
which is I guess where this apache module could come in handy. Good idea..
Since our attack vehicle is an ordinary web browser, filtering on the
User-Agent header does not help. Filtering on Referer [sic] is relevant, but
it blocks only one instance of the redirecting page. Given the prevalence of
free web hosts [freewebspace.net], the attacker can simply move the page
elsewhere, so referrer blocking has limited effectiveness.
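For readers who do not run Apache, the quoted mod_rewrite rule amounts to
roughly the following check, sketched here in Python. The name is_blocked is
hypothetical, and the pattern escapes the dot in "bigguysite.com", which the
quoted rule does not.

```python
import re

# Same idea as the quoted RewriteCond: match the offending site and any
# of its subdomains, case-insensitively ([NC] in mod_rewrite terms).
BLOCKED_REFERER = re.compile(r"^http://(.+\.)*bigguysite\.com/", re.IGNORECASE)


def is_blocked(referer):
    """Return True if the request should be refused (the [F] flag)."""
    return bool(referer) and BLOCKED_REFERER.match(referer) is not None


print(is_blocked("http://www.bigguysite.com/page.html"))  # refused
print(is_blocked("http://some-free-host.example/page"))   # let through
print(is_blocked(None))                                   # no Referer: let through
```

The last two calls show the limitation: a page moved to another host, or a
request that carries no Referer at all, passes straight through the filter.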
Defeating The Referer: use immediate data, in Mozilla:
data:text/html,
In MSIE pre-SP1:
about:
Beginning with IE6SP1, the 'about' URLs no longer work, so don't even try it.
Note that the "data" scheme is IETF standard, and more flexible. Either way,
the referrer value is meaningless in the above situations.
2) Inconspicuous Form Submission - AFSC
Web forms have plenty of real-world applications: posting messages on a
forum, signing guestbooks, and so on.
Form submissions, whether GET or POST, can be triggered automatically by the
click of a link. Forms may submit to message boards or guestbooks quite
anonymously from the perspective of whoever set up the redirection page.
The Anonymous Form Submission Center (AFSC) is one such realization of the
power of redirecting HTTP to arbitrary GET requests or JavaScript-invoked POST
submissions; we believe AFSC to be the first technology to intelligently
utilize the bandwidth of unwanted spam harvesters.
DEFENSE 1: Use POST instead of GET, so that an attacker cannot simply use a
link. However, a web form with automatic submission will still work if
JavaScript is enabled.
DEFENSE 2 (best): This is what Slashdot does. The form carries a FORMKEY
field, randomly generated by the server, and the server checks its validity
upon submission. Submission is still possible if XSS vulnerabilities exist in
the browser or site, however.
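A minimal Python sketch of the form key idea follows. It is not Slashdot's
implementation: instead of storing issued keys on the server, this variant
signs each key with an HMAC so it can be checked statelessly. SECRET,
make_formkey, check_formkey and the one-hour lifetime are all assumptions made
for illustration.

```python
import hashlib
import hmac
import secrets
import time

SECRET = secrets.token_bytes(32)  # hypothetical per-server secret


def make_formkey(session_id, now=None):
    """Issue a key to embed as a hidden FORMKEY field in the form."""
    ts = str(int(time.time() if now is None else now))
    nonce = secrets.token_hex(8)  # random part, so keys cannot be guessed
    msg = ("%s:%s:%s" % (session_id, ts, nonce)).encode("utf-8")
    mac = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return "%s:%s:%s" % (ts, nonce, mac)


def check_formkey(session_id, formkey, max_age=3600, now=None):
    """Return True only for an unexpired key issued to this session."""
    try:
        ts, nonce, mac = formkey.split(":")
        issued = int(ts)
    except ValueError:
        return False
    msg = ("%s:%s:%s" % (session_id, ts, nonce)).encode("utf-8")
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    current = time.time() if now is None else now
    fresh = 0 <= current - issued <= max_age
    # Constant-time comparison of the submitted and expected signatures.
    same = hmac.compare_digest(mac.encode("utf-8", "replace"),
                               expected.encode("ascii"))
    return same and fresh
```

A redirection page on another site cannot know a valid key for the visitor's
session, so its blind submission fails the check.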
3) Web Caches (Squid, etc.)
Store your data by sending documents with very distant cache expiry dates,
then retrieve them later. Not covered in this document, but interested readers
may also like to know about using DNS to distribute data; Google for
"distributing DeCSS via DNS" [decss.zoy.org].
Here's how you can distribute your data via caching proxies
[tools.rosinstrument.com/proxy/]: send an Expires header with a date one year
in the future, so the cache will keep the document as long as possible (this
is in the HTTP/1.1 RFC).
Then, pass out the following information:
a) Proxy host
b) Proxy port
c) Temporary host(+port)
d) Temporary pathname
The temporary host need only be up for one transfer of the pathname through
the proxy, after which requests through the proxy will be answered from the
cache and not forwarded. The temporary host must not be contacted directly,
only through the proxy.
Perhaps a coded scheme for encoding proxy/temporary hosts can be as follows:
chttp://cachehost:cacheport/temp_path/temp_host
Since the temporary host is not important, it may perhaps be obfuscated with
ROT13, reversal, or some substitution cipher to prevent accidental connection.
Many cache hosts/ports could be packed in binary and encoded with base64 for
added redundancy; additionally, the temporary host (dynamic DNS) could be
deleted or rerouted to an invalid or alternative address before publishing the
access information. Anonymity. (Warning: DNS caches).
4) Cross-Site Scripting (XSS)
XSS is an "unforeseeable interaction" between components of a web browser;
more specifically, scripting is used to steal cookies, user data, or other
content.
Unpatched IE vulnerabilities:
http://www.pivx.com/larholm/unpatched/
Very nice. Make a list of vulns and what they have been patched by to identify.
5) XSS+AFSC = Web Proxy
Normally AFSC is only one way; when "data submission" is combined with
"data stealing", the results of the submission can often be obtained thus
allowing for a full-on web proxy.
Example HTTP Requests
---------------------
Number of lines sent before user-defined submission:
Browser           headers   + request line   + blank line
Galeon 1.2.6      8         9                10
Mozilla 5.0       8         9                10
MSIE 6.0 on XP    6         7                8
Links 2.1         6         7                8
Lynx 2.8.4.1      5         6                7
* second column: add 1 to count the HTTP request line
* third column: add 1 more to count the blank line after the request
When cross-HTTP attacking some services, the number of lines that come
before the POST or PUT data submission is important. JavaScript can be used
to detect the user-agent and forward the zombie to the appropriate page.
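The counting itself is mechanical. Here is a short Python sketch that derives
the three figures (headers only, plus the request line, plus the blank line)
from a captured request; count_lines is a hypothetical name, and the sample
request is an abbreviated Lynx capture with a shortened Accept header.

```python
def count_lines(raw_request):
    """Count the lines a browser sends before any submitted data."""
    head = raw_request.split(b"\r\n\r\n", 1)[0]  # everything before the body
    lines = head.split(b"\r\n")
    headers = len(lines) - 1                     # first line is the request line
    return {
        "headers": headers,
        "with_request_line": headers + 1,
        "with_blank_line": headers + 2,
    }


lynx = (
    b"GET / HTTP/1.0\r\n"
    b"Host: localhost:7777\r\n"
    b"Accept: text/html, text/plain, */*;q=0.01\r\n"
    b"Accept-Encoding: gzip, compress\r\n"
    b"Accept-Language: en\r\n"
    b"User-Agent: Lynx/2.8.4rel.1 libwww-FM/2.14\r\n"
    b"\r\n"
)
print(count_lines(lynx))
```

For this capture the result is 5, 6 and 7, in line with the Lynx figures.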
[Mozilla 5.0 under Galeon 1.2.6 on FreeBSD]
=1= GET / HTTP/1.1
=2= Host: localhost:7777
=3= User-Agent: Mozilla/5.0 Galeon/1.2.6 (X11; FreeBSD i386; U;) Gecko/0
=4= Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
=5= Accept-Language: en
=6= Accept-Encoding: gzip, deflate, compress;q=0.9
=7= Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
=8= Keep-Alive: 300
=9= Connection: keep-alive
[MSIE 6.0 on Windows XP]
=1= GET / HTTP/1.1
=2= Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/vnd.ms-powerpoint, application/msword, */*
=3= Accept-Language: en-us
=4= Accept-Encoding: gzip, deflate
=5= User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
=6= Host: 10.0.0.2:10000
=7= Connection: Keep-Alive
[Links 2.1pre2 under FreeBSD]
=1= GET / HTTP/1.1
=2= Host: localhost:7777
=3= User-Agent: Links (2.1pre2; FreeBSD 4.7-STABLE i386; 80x24)
=4= Accept: */*
=6= Accept-Charset: us-ascii, ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5, ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16, windows-1250, windows-1251, windows-1252, windows-1256, windows-1257, cp437, cp737, cp850, cp852, cp866, x-cp866-u, x-mac, x-mac-ce, x-kam-cs, koi8-r, koi8-u, TCVN-5712, VISCII, utf-8
=7= Connection: Keep-Alive
[Lynx 2.8.4rel.1 under FreeBSD]
=1= GET / HTTP/1.0
=2= Host: localhost:7777
=3= Accept: text/html, text/plain, text/sgml, video/mpeg, image/jpeg, image/tiff, image/x-rgb, image/png, image/x-xbitmap, image/x-xbm, image/gif, application/postscript, */*;q=0.01
=4= Accept-Encoding: gzip, compress
=5= Accept-Language: en
=6= User-Agent: Lynx/2.8.4rel.1 libwww-FM/2.14
[Mozilla 5.0 under FreeBSD]
=1= GET / HTTP/1.1
=2= Host: localhost:7777
=3= User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.1) Gecko/20021205
=4= Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,video/x-mng,image/png,image/jpeg,image/gif;q=0.2,text/css,*/*;q=0.1
=5= Accept-Language: en-us, en;q=0.50
=6= Accept-Encoding: gzip, deflate, compress;q=0.9
=7= Accept-Charset: ISO-8859-1, utf-8;q=0.66, *;q=0.66
=8= Keep-Alive: 300
=9= Connection: keep-alive
Cross-Protocol Vulnerabilities: An Introduction
-----------------------------------------------
Now that HTTP has been thoroughly explained, we shall see what we can do with
it. The essence of cross-protocol attacking is to instruct the client (by
means of a link) to connect to a different service on a host and send it an
HTTP request, which the other service will interpret in its own way.
In other words, the idea is to construct a polyglot: one communication which
makes sense in two languages (usually HTTP and another protocol).
URLs allow a port number to be specified; for example, the URL
http://localhost:23/ would connect to the (obsolete) Telnet remote login
service on the local machine. If under the control of a black hat, the client
would most likely be told to brute-force a login password.
Connecting to port 23 in a web browser is so dangerous Mozilla restricts
certain port numbers:
Mozilla, gopher: 1 7 9 11 13 15 17 19 20 21 22 23 25
Internet Explorer, in my testing, has not restricted anything.
However, even with Mozilla's blocks there are interesting services running
on higher port numbers. These cannot justifiably be blocked, as HTTP proxies
and even some web servers run on high ports such as 8080. Blocking all ports
but 80 in web browsers would severely limit the flexibility of user agents.
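The kind of check a browser performs can be sketched in a few lines of Python.
The blocklist below contains only the ports quoted above, not Mozilla's
complete list, and port_allowed and DEFAULT_PORTS are hypothetical names.

```python
from urllib.parse import urlsplit

# Only the ports quoted in the text; a real browser's list is longer.
BLOCKED_PORTS = frozenset({1, 7, 9, 11, 13, 15, 17, 19, 20, 21, 22, 23, 25})
DEFAULT_PORTS = {"http": 80, "gopher": 70}  # assumption: schemes of interest


def port_allowed(url):
    """Return True if the URL's port is not on the blocklist."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return port is not None and port not in BLOCKED_PORTS


print(port_allowed("http://localhost:23/"))      # Telnet port: refused
print(port_allowed("http://localhost:8080/"))    # proxy port: allowed
print(port_allowed("http://example.com:6667/"))  # high port: allowed
```

A blocklist of well-known low ports says nothing about services on high ports,
which is exactly the gap described above.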
HTTP-IRC Attacks, or: Internet Relay Chat, a Line-Based Protocol
----------------------------------------------------------------
IRC [7] is a possible avenue of attack. Like HTTP, it is line-based.
IRC servers tend to run on ports above 1024, commonly in the vicinity of
6667, although some run as high as 8080 or 31337, so a browser will have no
qualms about connecting to them.
If an IRC server receives nonsense, it will ignore it and take no action.
The first few lines of an HTTP POST are gibberish when interpreted as IRC,
but IRC servers blissfully ignore them and continue until the body of the
form is encountered.
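The tolerance described here can be simulated offline. The Python sketch below
feeds the lines of a simplified POST to a toy line-based parser and reports
which ones it would recognize. It is not an IRC server: KNOWN_COMMANDS is a
hypothetical subset, classify_lines is a made-up name, and the form body is
reduced to a single line.

```python
KNOWN_COMMANDS = {"NICK", "USER", "JOIN", "PRIVMSG"}  # illustrative subset


def classify_lines(data):
    """Return (line, recognized) pairs as a tolerant line parser sees them."""
    result = []
    for raw in data.split(b"\r\n"):
        if not raw:
            continue  # blank lines are skipped
        word = raw.split(b" ", 1)[0].decode("ascii", "replace").upper()
        result.append((raw, word in KNOWN_COMMANDS))
    return result


request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: irc.example.net:6667\r\n"
    b"Content-Type: multipart/form-data; boundary=x\r\n"
    b"\r\n"
    b"NICK guest\r\n"
)
for line, recognized in classify_lines(request):
    print("command" if recognized else "ignored", line)
```

The request line and headers fall through as unknown input, and only the line
from the form body is treated as a command.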
Yup, any IRC commands can be placed in the form body. A mandatory
USER and NICK is entered followed by JOIN or PRIVMSG commands, perhaps
advertising the URL to the page which carries out this attack to create
a viral method of distribution.
When an IRC server is sent data, the server's response will often be
displayed in the web browser as plain text, which can't contain commands.
However, the HTML specification requires the first tag in a hypertext
document to be the opening HTML tag; non-HTML content may precede and follow
it. This means all we need is an opening HTML tag somewhere in the data stream
received by the server, which the server then needs to send back. The NICK
command is appropriate for this purpose:
Client: NICK
Server: XXX Invalid nickname
Hence, the opening HTML tag has been received from the server, and what
follows may contain
HTML tags which let us control the browser at our whim. Text received from
PRIVMSG's sent in channels and privately will be interpreted as HTML, allowing
for Ok!
RETR /pub/unix/MD5.tar.Z.asc
QUIT
Here, an FTP server with certain qualities had to be discovered. See
ftp-js-thread.txt for more details until I put it here.
Gopher
------
Gopher runs on port 70 and uses the protocol scheme gopher:. Despite not
having been commonly used for years, it is still supported by most popular
browsers. IE, however, has had a bug discovered in its Gopher support, and
rather than fixing it, Microsoft encourages users to disable Gopher. So
exploitable hosts are limited, but they are there.
gopher://host:port/Xname
This is the easiest scheme to use to send arbitrary data. Given the URL
above, X being a one-character numeral or letter (the item type), "name" will
be sent to "host", URL-decoded. So arbitrary bytes can be sent to any host.
Shocking.
Item type characters are defined in RFC 1436:
0   Item is a file
1   Item is a directory
2   Item is a CSO phone-book server
3   Error
4   Item is a BinHexed Macintosh file.
5   Item is DOS binary archive of some sort.
    Client must read until the TCP connection closes. Beware.
6   Item is a UNIX uuencoded file.
7   Item is an Index-Search server.
8   Item points to a text-based telnet session.
9   Item is a binary file!
    Client must read until the TCP connection closes. Beware.
+   Item is a redundant server
T   Item points to a text-based tn3270 session.
g   Item is a GIF format graphics file.
I   Item is some kind of image file. Client decides how to display.
Normally, the server responds with a "." on a line at the end of file, but
not with the binary item types 5 and 9. Type 0 is just a regular text file,
the client knows which MIME type by the file extension--a .html extension
will allow for