SpiderLabs Blog

Wait a minute... that’s not a real JPG!

When attackers compromise a website and want to harvest credit cards, they need to either find where the data is stored or capture the data in transit. This blog post shows how identifying files with false file signatures can uncover malicious activity on a server. I recently discovered credit card data hidden behind a .jpg extension, which led me to the work of an attacker capturing credit cards from customers using an online checkout page.

Below I detail the attacker's methods and how I discovered them.

***Please note that the code and examples in this article have been recreated in a test environment. Any cardholder data, including names and credit card numbers, is fake.***

File Signatures

A file signature is a small amount of data, usually at the start of a file, which identifies files of a particular type. For example, all JPG files should start with the following hexadecimal bytes: 0xFF 0xD8 0xFF. When investigating a compromise, checking the file signatures of all files on a system is a simple, and often quick, way of identifying malicious files. If a file's signature does not match its extension, it could be the work of an attacker hiding information.
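
A check of this kind can be sketched in a few lines of Python. This is a minimal illustration rather than the tooling used in the investigation: the function names `has_jpeg_signature` and `find_fake_jpegs` are hypothetical, and the sketch only compares the first three bytes against the JPG signature quoted above.

```python
from pathlib import Path

# The three leading bytes every JPG file should start with.
JPEG_MAGIC = bytes([0xFF, 0xD8, 0xFF])


def has_jpeg_signature(path):
    """Return True if the file starts with the JPG signature bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(JPEG_MAGIC)) == JPEG_MAGIC


def find_fake_jpegs(root):
    """Yield files under root that are named .jpg/.jpeg but lack the signature."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in {".jpg", ".jpeg"}:
            if not has_jpeg_signature(path):
                yield path
```

Running `list(find_fake_jpegs("/var/www"))` against a web root (the path here is only an example) would list every file whose extension claims JPG but whose content says otherwise. Each hit still needs manual review, since a truncated or corrupt image would be flagged as well.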

The Investigation

During a recent investigation I found a file named 1.jpg that was not a JPG file at all. On closer inspection, the file contained what appeared to be a mass of base64-encoded data.


I then noticed that each encoded string began with the same padding string, "hea". Removing this prefix and decoding the data revealed customer credit card details.
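
The decoding step itself can be sketched in Python as shown here. The layout of one encoded record per line is an assumption made for illustration, and the helper names `decode_record` and `decode_file` are hypothetical; only the "hea" prefix and the base64 encoding come from the case itself.

```python
import base64

# Prefix observed in this case; other samples may use a different one.
PADDING_PREFIX = "hea"


def decode_record(line, prefix=PADDING_PREFIX):
    """Strip the padding prefix from one encoded line and base64-decode the rest."""
    text = line.strip()
    if text.startswith(prefix):
        text = text[len(prefix):]
    return base64.b64decode(text).decode("utf-8", errors="replace")


def decode_file(path):
    """Decode every non-empty line of the file (assumes one record per line)."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        return [decode_record(line) for line in handle if line.strip()]
```

Note that `base64.b64decode` raises an error on malformed input, which is useful here: a line that fails to decode suggests the prefix or record layout differs from what was assumed.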


This was definitely not a JPG file! It was a file used by attackers to hide customer credit card details. Web server log files showed the attackers retrieving the cardholder data by requesting the 1.jpg file via a browser. But how did the card data end up saved to a fake JPG file in the first place?
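
For readers who want to look for similar retrievals in their own logs, here is a rough Python sketch. It assumes the web server writes the widely used Common or Combined Log Format, which may not match every server, and the names `LOG_LINE` and `requests_for` are hypothetical.

```python
import re

# Matches the start of a Common/Combined Log Format line (an assumption
# about the log layout; adjust the pattern for other formats).
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def requests_for(lines, filename="1.jpg"):
    """Return (ip, timestamp, status) for each request whose path ends in filename."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue
        # Ignore any query string, then compare the final path component.
        path = match.group("path").split("?", 1)[0]
        if path.rsplit("/", 1)[-1] == filename:
            hits.append((match.group("ip"), match.group("time"), match.group("status")))
    return hits
```

Passing an open log file to `requests_for` gives a list of client addresses and timestamps to correlate with other evidence. Successful responses for a file that should never be requested are a strong lead.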

Searching the victim system for 1.jpg led me to the following PHP file, called xml.php:


Since then, I have seen this file in multiple investigations. The file processes captured credit card data and writes it to a fake JPG file (1.jpg in this case). It also has a delete function, which allows an attacker to erase all the records in 1.jpg, both to prevent the file from growing too large and to reduce the chance of detection.

After discovering where the stolen data was stored, I next had to identify how xml.php was gathering the credit card data. Further searching of the victim system for the path of xml.php led me to a malicious JavaScript modification.
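
This kind of search for references to a known malicious file can be sketched in Python. The function name `find_references` and the list of file extensions are assumptions for illustration. In practice, a recursive `grep` over the web root does the same job.

```python
from pathlib import Path


def find_references(root, needle="xml.php", suffixes=(".js", ".php", ".html")):
    """Return (file path, line number) for each line under root containing needle."""
    needle_bytes = needle.encode()
    hits = []
    for path in Path(root).rglob("*"):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # Unreadable file: skip it rather than abort the search.
        for number, line in enumerate(data.splitlines(), start=1):
            if needle_bytes in line:
                hits.append((str(path), number))
    return hits
```

Reading the files as bytes avoids decoding errors on files with mixed or unknown encodings, which are common on compromised servers.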

The attacker had planted the following code inside one of the JavaScript files that was executed on every page of the checkout process.


This is an interesting piece of JavaScript. The first part of the function checks the form elements on a page to see if there is a field called "cvv" – the security code found on the back of a credit card. If this field is found, the JavaScript collects all of the data entered into the payment page form and sends it to the xml.php file for encoding and storing. The diagram below shows the whole transaction process:


From an investigative point of view, it was possible to identify this attack quickly by checking file signatures. Once I had located the credit card collection file, it was possible to work backwards and find the siphon in action.

As attackers become more and more creative with the methods that they use to hide their malicious activity, it is critical that the owners and administrators of online shops are aware of what exactly is occurring on their servers. The need for file integrity monitoring (FIM) is greater than ever. If an attacker modifies a website's source code, a FIM solution could alert administrators to a compromise in progress and help to limit the amount of data that could be compromised.
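
To illustrate the idea behind FIM, here is a minimal Python sketch that records SHA-256 hashes of every file under a directory and later reports what changed. It is a toy under stated assumptions, not a substitute for a real FIM product: the function names `hash_file`, `snapshot` and `compare` are hypothetical, and a real deployment would also need to protect the baseline itself from tampering.

```python
import hashlib
from pathlib import Path


def hash_file(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def snapshot(root):
    """Map each file's path (relative to root) to its current hash."""
    root = Path(root)
    return {
        str(path.relative_to(root)): hash_file(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def compare(baseline, current):
    """Report files added, removed or modified since the baseline snapshot."""
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "modified": sorted(
            name for name in baseline.keys() & current.keys()
            if baseline[name] != current[name]
        ),
    }
```

In the attack described in this post, such a comparison would be expected to flag the altered JavaScript file as modified and both xml.php and 1.jpg as added.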