During a recent incident response case, we were tasked with discovering the point of entry for an attacker who had compromised the entire Windows network. Among other things, we uncovered evidence of web application attacks targeting the company's public-facing web portal, followed by a web shell uploaded onto the portal.
The web shell was subsequently identified as a variant of the PAS PHP web shell. This is the same type of web shell that was also found to have played a key role in the recent APT campaign dubbed Grizzly Steppe, allegedly involving Russian cyber threat actors (see Page 5 of the Grizzly Steppe APT report).
One piece of evidence pointing to this being a variant of the PAS web shell was found in a packet trace. The trace captured the communication between the shell and the attacker's external IP address after the attacker invoked the TCP reverse_connect shell feature of the PAS web shell.
While PHP web shell toolkits are a fairly common means of getting persistent access to a compromised web portal, we believe this particular variant to be significant. Unlike typical web shells that use encoding and/or obfuscation techniques to evade detection and make the code difficult to analyze, this web shell uses an uncommon form of encryption of its PHP code to thwart attempts to gain access to and/or analyze the web shell's capabilities.
The PAS web shell
The PAS web shell is in the category of full-featured PHP web shells that are used by attackers after initial exploitation in order to maintain persistent access to a compromised web portal.
There are many variants of the PAS web shell in the wild and its features and capabilities are similar to other more commonly found web shells like WSO/FilesMan.
While the different variants are similar in their features and capabilities they often differ in the type and level of code obfuscation techniques they employ to evade AV detection and impede malware analysis by either automated tools or human analysts.
- Most variants use either easily reversible encoding (e.g. base64) or a combination of compression (gzdeflate) and encoding.
- Others use custom obfuscation techniques, and the routines to deobfuscate and run the PHP web shell code are included in the web shell PHP file itself.
- Although it is rare, some variants require a passphrase to gain entry to the shell (often supplied via a cookie or an HTTP POST request). In such cases the MD5 hash of the passphrase is compared against a stored value to confirm correct entry. MD5 is fast to compute, so the clear-text passphrase can often be recovered by brute force, which makes this method a weak barrier against manual analysis. It also has no bearing on the encoding or obfuscation used; it only serves to authenticate access to the web shell.
- Even more rare are variants that use more advanced methods for authentication. Such is the case with the variant that is the target of our analysis today.
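To make the contrast between these cases concrete, here is a minimal Python sketch. It is not taken from any real sample: it assumes a shell body wrapped as base64 over gzdeflate output, plus a hypothetical MD5-gated login. The names `wrap`, `unwrap`, `gate_allows` and the demo passphrase are illustrative only. PHP's gzdeflate produces a raw DEFLATE stream, which Python's zlib reads when given a negative `wbits` value.

```python
import base64
import hashlib
import zlib


def wrap(php_source: bytes) -> bytes:
    """Mimic base64_encode(gzdeflate($code)): raw DEFLATE, then base64."""
    deflater = zlib.compressobj(wbits=-15)  # negative wbits = raw DEFLATE, no header
    return base64.b64encode(deflater.compress(php_source) + deflater.flush())


def unwrap(encoded: bytes) -> bytes:
    """Reverse the wrapping. No secret of any kind is required."""
    return zlib.decompress(base64.b64decode(encoded), wbits=-15)


def gate_allows(passphrase: str, stored_md5: str) -> bool:
    """Hypothetical MD5 login check: it guards access, not the code itself."""
    return hashlib.md5(passphrase.encode()).hexdigest() == stored_md5


# Placeholder data for illustration only.
demo_code = b"<?php echo 'placeholder body'; ?>"
encoded = wrap(demo_code)
stored = hashlib.md5(b"demo-passphrase").hexdigest()

# An analyst recovers the body without ever knowing the passphrase.
print(unwrap(encoded).decode())
print(gate_allows("guess", stored))
```

In this model the passphrase never touches the payload, so an analyst can skip the login check entirely and read the decoded source.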
Below are some of the screenshots of a PAS PHP web shell:
How did we discover it?
The initial hint that the web portal was probably compromised arrived in the form of a packet trace that was shared by the affected party. The packet trace shows what appears to be an interaction between the web portal and an external IP. The figure below shows the reconstituted TCP stream of this interaction.
Immediately noteworthy was the string "Hello from P.A.S. BackConnect". Our preliminary analysis of the trace also suggested that the web portal was making a TCP reverse shell connection to an external IP address. The PAS PHP web shell is known to establish this type of backdoor. PAS is a known, although not exceedingly common, PHP web shell used as a post-exploitation tool on compromised websites.
Now that we knew we were likely dealing with PAS, our next task was to locate the web shell. We used a combination of automated web shell detection programs and manual examination methods, but initial efforts were unsuccessful.
Typically, PHP web shells contain strings that are easy to key off of when searching across all files inside the webroot for the presence of a web shell. In this case that was not possible, because the code contained a significant amount of binary data with very few ASCII (human-readable) strings.
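When string signatures fail for this reason, the binary content itself can serve as a signal. The following Python sketch shows one possible heuristic, which is not the method we used in this case: it flags PHP files whose bytes are largely non-printable. The function names and the 0.3 threshold are assumptions chosen for illustration and would need tuning against a real webroot.

```python
from pathlib import Path

# Printable ASCII plus tab, newline and carriage return.
PRINTABLE = set(range(32, 127)) | {9, 10, 13}


def non_printable_ratio(data: bytes) -> float:
    """Fraction of bytes that fall outside the printable ASCII set."""
    if not data:
        return 0.0
    return sum(b not in PRINTABLE for b in data) / len(data)


def find_binary_heavy_php(webroot: str, threshold: float = 0.3):
    """Return (path, ratio) pairs for .php files at or above the threshold."""
    hits = []
    for path in Path(webroot).rglob("*.php"):
        if not path.is_file():
            continue
        ratio = non_printable_ratio(path.read_bytes())
        if ratio >= threshold:
            hits.append((path, ratio))
    # Most binary-heavy files first.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

Ordinary PHP source is almost entirely printable text, so a file that embeds a large encrypted blob should stand out. Legitimate files with embedded binary resources may also be flagged and need manual review.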
However, persistence paid off and we were finally able to locate the web shell which, in the end, was named quite simply pas.php.
A special shoutout to my manager (and contributor to this blog), Thanassis Diogos, who was able to detect the shell on the portal by searching for the following text strings that were in human readable form:
A key source of different versions of the PAS web shell is hxxps://github.com/wordfence/grizzly, which archives samples of PAS web shell versions and variants.
By comparing the web shell code found on the compromised portal with the variants available on GitHub, we realized that our version of the PAS shell was unique. It differed from the known variants in the way it handled authentication and in the manner in which it "encrypted" rather than encoded (or obfuscated) its PHP code to evade detection and make analysis challenging.
Analysis of the PAS web shell code
Let's look at the PAS web shell we found:
The code shown in the Figures above is not the entire code, merely snippets from the beginning and end of the code for brevity.
It is clear that the code is a PHP script in which the author defines a variable j__WP (highlighted in green, with two underscores between the j and WP characters) to store some binary data (highlighted in yellow).
This binary data is in fact the encrypted web shell code, while the rest of the PHP script is designed to acquire an external "secret" via an HTTP POST request from the attacker. The script then uses this secret to decrypt the binary data and run the extracted PHP web shell code on the compromised portal.
So let's take a look at this PHP code and try to simplify and clean it up to make it easier to understand.
First, let's substitute the three confusing variable names, j_WP (with a single underscore), j__WP (with two underscores), and j___WP (with three underscores), with the names SECRET, BIN, and LOOPCTR respectively. These new names are not random: each conveys the type of data the variable contains or the role it plays in the code's execution.
Next, let's replace the binary content stored in the variable BIN with the string "ENCRYPTED_BIN_CODE" as a placeholder, which makes the code cleaner to read in an editor. Finally, we "beautify" the code to make it easier to read and analyze, as shown below.
Now we are in a position to better understand how this variant of PAS web shell works.
The code above stores the binary content in a variable named BIN. It then stores the data received via the HTTP POST request in a variable named SECRET. If no data arrives via POST, it checks whether the cookie is set and initializes SECRET from it. If neither the POST data nor the cookie is present, it sets SECRET to NULL.
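The fallback order described here can be modelled in a few lines of Python. This is only a sketch of the logic, not the shell's PHP: the field name `secret` is hypothetical (the real parameter name is not given in the text), and the sketch assumes the same name is used for both the POST field and the cookie.

```python
from typing import Mapping, Optional

# Hypothetical field/cookie name, used here for illustration only.
FIELD = "secret"


def read_secret(post: Mapping[str, str], cookies: Mapping[str, str]) -> Optional[str]:
    """POST data wins, the cookie is the fallback, otherwise None (PHP NULL)."""
    if FIELD in post:
        return post[FIELD]
    if FIELD in cookies:
        return cookies[FIELD]
    return None
```

A request that carries neither value leaves the secret unset, and in that case the decryption logic is never entered.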
If the SECRET variable is populated (i.e. it is not NULL) then the script will decrypt the binary code using the following logic:
The above logic may be explained as follows:
- Calculate the length of the SECRET variable (the number of characters in the passphrase provided via the HTTP POST request or set in the cookie). Let's call this l.
- Reverse the passphrase string and calculate its MD5 hash.
- Take the substring of this MD5 hash that starts at the beginning of the hash value and has length l + 1.
- Concatenate the MD5 hash of the passphrase to the substring generated in the previous step.
- Replace the original value of the SECRET variable (the passphrase obtained via the POST request or cookie) with the string computed in the previous step.
- Decrypt the binary content using the formula highlighted in yellow in the code snippet above, inside a loop that executes from LOOPCTR=0 to LOOPCTR=15587 (i.e. 15588 times).
- On each iteration, update SECRET by appending the value of $BIN[$LOOPCTR] that was just computed by the highlighted formula. The first time the loop executes (LOOPCTR = 0), SECRET becomes its previous value concatenated with the computed $BIN[0]. More generally, the value of SECRET after the ith iteration is its value after the (i-1)th iteration concatenated with the computed value of $BIN[i], so SECRET grows by one character per iteration.
- If the correct passphrase was supplied by the attacker, the new value of BIN will be a compressed version of the PHP web shell code. In this case the gzinflate operation yields "interpretable" PHP code that the PHP interpreter on the compromised portal can execute, giving the attacker access to the web shell.
- If the wrong passphrase was supplied, the new value of BIN after the loop will not be valid compressed PHP code, and the gzinflate operation will fail.
- In this way, the malware author has ensured that the PHP code will execute only if the correct secret is known, thus defeating both AV detection and any attempts at analysis.
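The steps above can be sketched in Python as follows. This is a model under stated assumptions, not the shell's actual code. The key derivation follows the description in the text. The per-byte formula is only visible in the figure, so the sketch assumes a byte-wise subtraction of the key byte modulo 256, and that assumption may not match the sample. The `encrypt` and `raw_deflate` helpers are our own additions so that the round trip can be demonstrated on placeholder data.

```python
import hashlib
import zlib


def derive_key(passphrase: bytes) -> bytes:
    """First l+1 chars of md5(reversed passphrase), followed by md5(passphrase)."""
    length = len(passphrase)
    reversed_md5 = hashlib.md5(passphrase[::-1]).hexdigest().encode()
    return reversed_md5[: length + 1] + hashlib.md5(passphrase).hexdigest().encode()


def decrypt(blob: bytes, passphrase: bytes) -> bytes:
    key = bytearray(derive_key(passphrase))
    out = bytearray()
    for i, cipher_byte in enumerate(blob):
        plain_byte = (cipher_byte - key[i]) % 256  # ASSUMED formula, see lead-in
        out.append(plain_byte)
        key.append(plain_byte)  # each decrypted byte is appended to the secret
    return bytes(out)


def encrypt(data: bytes, passphrase: bytes) -> bytes:
    """Inverse of decrypt under the same assumed formula (demo helper only)."""
    key = bytearray(derive_key(passphrase))
    out = bytearray()
    for i, plain_byte in enumerate(data):
        out.append((plain_byte + key[i]) % 256)
        key.append(plain_byte)
    return bytes(out)


def raw_deflate(data: bytes) -> bytes:
    """Raw DEFLATE stream, the format that PHP's gzinflate expects."""
    deflater = zlib.compressobj(wbits=-15)
    return deflater.compress(data) + deflater.flush()


def try_unlock(blob: bytes, passphrase: bytes):
    """Return the inflated code, or None when inflation fails."""
    try:
        return zlib.decompress(decrypt(blob, passphrase), wbits=-15)
    except zlib.error:
        return None


# Placeholder payload and passphrase, for illustration only.
payload = b"<?php echo 'demo'; ?>" * 3
blob = encrypt(raw_deflate(payload), b"correct horse")
print(try_unlock(blob, b"correct horse") == payload)
```

Because every decrypted byte is fed back into the key, an error in the first key bytes propagates through the rest of the stream. A wrong passphrase therefore does not yield the original code, and the inflate step will normally reject the result.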
The PHP code can also be visually represented as follows:
Summary of Results
The following are the results of our findings:
- The secret or passphrase supplied by the attacker, either via an HTTP POST request or via a cookie, plays a pivotal role in decrypting the encrypted binary code stored in the variable j__WP in this variant of the PAS web shell.
- There are no restrictions on the size of the passphrase or on its character space. We would have to resort to a dictionary, brute-force, or hybrid attack to discover the SECRET that successfully decrypts the code. Even with a brute-force approach, we would have to make assumptions about both the maximum length and the character space. If the attackers use a long passphrase and/or a complex character space (e.g. maximum length = 16 and charspace = A..Z, a..z, 0..9 and special characters), brute force becomes very difficult.
- It becomes virtually impossible to decrypt the web shell code and analyze its features and capabilities without the passphrase.
- This approach to authentication and code encryption is more likely to be used in the case of exploit kits and other advanced malware but is very rare in the case of web shells. Most web shells use either simple base64 encoding (which is easily reversible) or custom obfuscation (in which the functions/routines to deobfuscate and run the PHP code are available in the PHP web shell code itself). They also generally employ MD5 hash comparison to validate if the passphrase entered is correct and the passphrase does not play any role in deobfuscating the PHP code.
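To give a rough sense of the brute-force problem mentioned above, the following Python sketch counts the candidate passphrases for the example parameters. The figure of 32 special characters (the ASCII punctuation set) and the guess rate are assumptions made purely for illustration.

```python
def keyspace(charset_size: int, max_length: int) -> int:
    """Number of candidate passphrases of length 1..max_length."""
    return sum(charset_size ** n for n in range(1, max_length + 1))


# A..Z, a..z, 0..9 plus an assumed 32 ASCII punctuation characters.
CHARSET = 26 + 26 + 10 + 32
MAX_LEN = 16
GUESSES_PER_SECOND = 10**9  # hypothetical rate, not a measured value

total = keyspace(CHARSET, MAX_LEN)
years = total / GUESSES_PER_SECOND / (365 * 24 * 3600)
print(f"{total:.2e} candidates, about {years:.1e} years at the assumed rate")
```

Under these assumptions the candidate count exceeds 10^31. Each guess also requires running the full decryption loop and an inflate attempt, which makes exhaustive search more costly than testing a plain MD5 hash.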
Thanks to Thanassis Diogos for his contribution to the research described in this post.