Easy DOM-based XSS detection via Regexes

If you are interested in finding DOM-based XSS, you should already be familiar with http://code.google.com/p/domxsswiki/wiki/Introduction. This is the best online resource about DOM-based XSS, maintained by my friends Stefano di Paola and Mario Heiderich.

The wiki contains a deep explanation of:

  • all the potential sinks, like location, cookie, CSS, eval-like calls, etc.
  • the quirks between browsers regarding which characters are escaped in various locations, such as after the hash fragment, in the location, in the path, etc.
  • a particular focus on jQuery sinks like .wrap, .prepend, .globalEval and so on
  • last but not least, two regexes that help identify sources and sinks.

You know, I like to automate as much as possible during my day-to-day pentesting hacktivities, so I wrote a very basic Ruby script that applies the regexes created by Mario for you. If you were expecting a Python or Perl script, don't bother reading the rest of the article :-D

Expect a second blog post in a few months about the integration of these regexes in Burp Pro. Dafydd has now added support for instrumenting IBurpExtender directly from Ruby using JRuby, so this code can easily be added inside Burp and applied inline to every HTTP response.

require "net/http"
require "net/https"
require "uri"

MAIN_URL = 'https://target.com'
MAIN_DOMAIN = 'target.com'
PORT = 443

# from Burp, use "copy URLs in this branch", and replace the following array content accordingly.
# The two entries below are placeholders taken from the sample output.
PATHS_TO_TEST = %w(
  https://target.com/xxx/script/PaymentAttemptList.js
  https://target.com/xxx/script/sort/TableSorter.class.js
)

puts "[+] starting requests to #{MAIN_URL}"
PATHS_TO_TEST.each do |entry|
  # Burp exports full URLs: strip scheme, host and optional port, keep the path.
  path = entry.sub(%r{\Ahttps?://#{Regexp.escape(MAIN_DOMAIN)}(:\d+)?}, '')
  p "==================================== Analyzing [#{path}] ===================================="
  url = URI.parse(MAIN_URL + ':' + PORT.to_s + path)
  http = Net::HTTP.new(url.host, url.port)
  if url.scheme == "https"
    http.use_ssl = true
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE # don't bother checking for valid SSL certificates.
  end
  req = Net::HTTP::Get.new(url.request_uri) # request_uri keeps the query string, if any
  http.request(req) do |res|
    line = 1
    response = res.body.to_s.split("\n")
    # apply DOM-based xss regex to each HTTP response line, printing out lineNumber and lineContent
    # that would potentially be vulnerable to DOM-based XSS
    # Potential sinks and sources are enumerated here.
    response.each do |i|
      if(i.scan(/((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()/).size > 0 ||
         i.scan(/(location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|referrer|name|opener|parent|top|content|self|frames)\W)|(localStorage|sessionStorage|Database)/).size > 0)
        p "[#{path}]-#{line}: #{i}"
      end
      line += 1
    end
  end
end
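
If you want to try the two regexes without sending any request, a minimal sketch like the following applies them to a string held in memory. The regexes are copied verbatim from the script; the names DOMXSS_REGEXES and scan_js, as well as the sample JavaScript, are only illustrative assumptions and are not part of the original script.

```ruby
# The two regexes from the script, unchanged. The constant name is an
# illustrative label, not something defined by the wiki.
DOMXSS_REGEXES = [
  /((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()/,
  /(location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|referrer|name|opener|parent|top|content|self|frames)\W)|(localStorage|sessionStorage|Database)/
]

# Returns [line_number, line] pairs for every line matched by at least one regex.
def scan_js(source)
  hits = []
  source.split("\n").each_with_index do |text, idx|
    hits << [idx + 1, text] if DOMXSS_REGEXES.any? { |re| re.match?(text) }
  end
  hits
end

# Made-up sample input, loosely based on the kind of lines the script reports.
sample = <<~JS
  var a = 1;
  if ( oResponse ) { eval(oResponse); }
  var t = s.replace(/x/g, '/');
  oField.src = "/xxx/pictures/expand.gif";
JS

scan_js(sample).each { |number, text| puts "#{number}: #{text}" }
```

This is handy for checking quickly whether a snippet you already have on disk would be flagged before pointing the script at a real target.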

To use the Ruby script, right-click on a resource branch in Burp's SiteMap (the one that contains the JavaScript files), choose "copy URLs in this branch" and paste them into a file. Remove every line that does not end with .js, because Burp also exports URLs that are just folders.
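
The clean-up of the exported list can be automated too. The following is a rough sketch, assuming the exported URLs sit one per line; the helper name js_urls and the sample entries are hypothetical. Note that it checks the path part of each URL, so an entry like script.js?v=1 would be kept as well, which is slightly more permissive than removing every line not ending with .js.

```ruby
require "uri"

# Keeps only the entries whose URL path ends with .js and drops folders,
# blank lines and anything that is not a parsable URI.
def js_urls(lines)
  lines.map(&:strip).reject(&:empty?).select do |entry|
    begin
      URI.parse(entry).path.to_s.downcase.end_with?(".js")
    rescue URI::InvalidURIError
      false
    end
  end
end

# Made-up example of what a Burp export could look like.
exported = [
  "https://target.com/xxx/script/",
  "https://target.com/xxx/script/PaymentAttemptList.js",
  "https://target.com/xxx/script/sort/",
  "https://target.com/xxx/script/sort/TableSorter.class.js"
]

puts js_urls(exported)
```

With the URLs saved in a file, something like js_urls(File.readlines("urls.txt")) would do the same job (urls.txt being whatever name you saved the export under).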


Now simply fill the PATHS_TO_TEST array with these URLs, adjust MAIN_URL, MAIN_DOMAIN and PORT according to your target, and run the script. All the resources you want to analyse are downloaded, and the regexes are run on the HTTP response contents.

The output will be something like:

"==================================== Analyzing [/xxx/script/PaymentAttemptList.js] ===================================="
"[/xxx/script/PaymentAttemptList.js]-107: oField.src = \"/xxx/pictures/expand.gif\";"
"[/xxx/script/PaymentAttemptList.js]-110: oField.src = \"/xxx/pictures/collapse.gif\";"
"[/xxx/script/PaymentAttemptList.js]-124: if ( oResponse ) { eval(oResponse); }"
"==================================== Analyzing [/xxx/script/sort/TableSorter.class.js] ===================================="
"[/xxx/script/sort/TableSorter.class.js]-126: window.setTimeout(function ()"
"[/xxx/script/sort/TableSorter.class.js]-438: var t = s.replace(/\\-/g, '/');"
"[/xxx/script/sort/TableSorter.class.js]-460: var t = s.replace(/[^0-9\\.-]/g, \"\");"

As you can see, some of the matches are false positives, for example the first two lines of PaymentAttemptList.js: the values assigned there are static strings.
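
This kind of false positive can be pushed to the bottom of the pile with a simple heuristic. The sketch below is only an assumption about how you might do it, not part of the script: it treats a line as "likely static" when the whole line is a single assignment of one quoted string literal. The names STATIC_ASSIGNMENT and likely_static? are made up for the example, and the check is deliberately conservative, so anything with concatenation or a function call still goes to manual review.

```ruby
# Whole line must be: <identifier or property> = "literal" (or 'literal'), optional ;
STATIC_ASSIGNMENT = /\A\s*[\w$.\[\]]+\s*=\s*(?:"[^"]*"|'[^']*')\s*;?\s*\z/

def likely_static?(line)
  STATIC_ASSIGNMENT.match?(line)
end

# Lines taken from the sample output above.
hits = [
  'oField.src = "/xxx/pictures/expand.gif";',
  'oField.src = "/xxx/pictures/collapse.gif";',
  'if ( oResponse ) { eval(oResponse); }'
]

static, to_review = hits.partition { |line| likely_static?(line) }
puts "likely static: #{static.size}, to review: #{to_review.size}"
```

It is still worth skimming the "likely static" group once, because a regex has no idea what the surrounding code does.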

Other matches look interesting and deserve additional manual analysis, like the places where eval, setTimeout, or replace are used. The next step is opening all the JavaScript code in a proper IDE (if it's really complex), going to the matched line and starting the manual analysis, tracing back all the function calls and variable assignments that lead to the sink.

Inline scripts are out of scope in this particular example. If you want to test a particular HTML page which contains a lot of inline JavaScript code, just add its URI to the PATHS_TO_TEST array.

Bear in mind the script will not be very effective when processing minified JavaScript, because I'm not beautifying it before running the regexes. You're free to customise it with this additional feature.
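
As a starting point for that customisation, here is a very rough sketch of a pre-processing step. It is not a real beautifier and not a JavaScript parser: it just breaks the code after every ;, { and }, so it will also split inside string literals, regex literals and for(;;) headers. The name naive_split is hypothetical.

```ruby
# Inserts a line break after every ; { and }, then drops empty lines.
# Good enough to get one statement per line on simple minified code.
def naive_split(js)
  js.gsub(/([;{}])/, "\\1\n").split("\n").map(&:strip).reject(&:empty?)
end

# Made-up minified input.
minified = "a=1;if(x){eval(y);}"
naive_split(minified).each_with_index do |text, idx|
  puts "#{idx + 1}: #{text}"
end
```

Keep in mind that after this step the reported line numbers refer to the split output, not to the original minified file. For anything serious, running the file through a proper beautifier first is the better option.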

I will probably play with the new Burp extender Ruby API soon, and eventually port this code as a Burp extension. In the meantime, if you want a tool that does proper dynamic data tainting, meaning it can determine the exact sink of a DOM-based XSS vulnerability, you should use Stefano's DOMinator Pro.

Stay tuned.
