Throughout Trustwave SpiderLabs' many forensic investigations, we often stumble upon malicious samples that have been 'packed'. This technique can be unfamiliar to the aspiring malware reverser or digital forensic investigator, so I thought it would be fun to use this opportunity to talk about portable executable (PE) packers at a high level. If you already know what PE packers are and how they work, you're more than welcome to continue reading, though you may not learn anything new. Think of this as a 101 blog post.
So what are PE packers? How do they work? How can you defeat them? I'm going to do my best to answer these questions.
In essence, packers are tools that are used to compress a PE file. This primarily allows the person running the tool to reduce the size of the file. As an added benefit, since the file is compressed, it will also typically thwart many reverse engineers from analyzing the code statically (without running it). Packers were originally created for a legitimate purpose: decreasing the overall file size. Later on, malware authors realized the benefits previously mentioned and began to utilize them as well. Of course, just compressing a PE file won't really work on its own. If you try to run it, the file will fail to load, and you'll end up having a horrible day.
In order to address this, packers wrap the compressed data inside a working PE file structure. When run, the packed file decompresses the original PE file data into memory and executes it.
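To make the idea concrete, here is a toy sketch in Python. It is not how any real packer works internally: the `TOYPACK` marker, the length field, and the function names are made up for illustration, and zlib simply stands in for whatever compression a real packer uses. A real packed file also carries a small native stub that performs the unpacking in memory and then jumps to the original code.

```python
import struct
import zlib

MAGIC = b"TOYPACK"  # hypothetical marker, not a real packer signature


def pack(original: bytes) -> bytes:
    """Wrap compressed data in a tiny header that the 'stub' knows how to read."""
    compressed = zlib.compress(original, 9)
    return MAGIC + struct.pack("<I", len(original)) + compressed


def unpack(packed: bytes) -> bytes:
    """Conceptually what an unpacking stub does: decompress back into memory."""
    if not packed.startswith(MAGIC):
        raise ValueError("not a toy-packed blob")
    (original_len,) = struct.unpack_from("<I", packed, len(MAGIC))
    original = zlib.decompress(packed[len(MAGIC) + 4:])
    if len(original) != original_len:
        raise ValueError("length mismatch after decompression")
    return original
```

Running `pack()` over a buffer full of readable strings generally produces a smaller blob in which those strings are no longer visible, which is the same effect that frustrates static analysis of a packed sample.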
Before we get into the specifics of how this takes place at a technical level, it's important to understand some of the fundamentals of how a PE file is structured.
PE File Structure
Whether you know it or not, it's a near certainty that you've encountered PE files before. If you've used Microsoft Windows, you've used PE files. It's a file format developed by Microsoft that is used for a large number of file types. The most common ones that you're likely to recognize are .exe and .dll files (executables and dynamic-link libraries, respectively). PE files are made of two parts: the header and the sections. The header contains details about the PE file itself, while the sections contain the contents of the executable.
The header is broken up into a number of different parts: the DOS header, the PE header, the optional header, and the sections table. There can be any number of sections within a PE file, and they can be named anything the author would like. However, there must almost always be sections containing the actual code of the file, the import table, and the data used by the code. It ends up looking something like this:
The optional header in particular is of special importance when we're talking about (un)packing. Not to diminish the importance of the DOS or PE headers, but they simply don't offer anything we're interested in for the scope of this blog post. All you really need to know about those headers is that they contain information about the PE file structure itself.
The optional header can contain a wealth of information, such as the size of code on disk, the checksum of the file, the size of the stack and heap, etc. It also contains the address of the entry point, or where code execution will begin. This becomes important later, as the packer modifies this value, and it will fall on us to find the location of the original entry point, or OEP, of the packed file. Once this is identified, we can dump the memory of the executable, reconstruct the imports (more on those in a second), and voilà, we're good to go.
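As a rough illustration of where the entry point lives, the sketch below walks from the DOS header to the optional header and reads `AddressOfEntryPoint`. The offsets follow the published PE layout; the function name and the returned dictionary keys are my own choices, and error handling is kept to a minimum.

```python
import struct


def read_entry_point(data: bytes) -> dict:
    """Read the entry point fields from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature (not a DOS/PE file)")
    # e_lfanew at offset 0x3C points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # Optional header follows the 4-byte signature and 20-byte file header.
    opt = e_lfanew + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    (size_of_code,) = struct.unpack_from("<I", data, opt + 4)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)
    if magic == 0x10B:      # PE32: 32-bit ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:    # PE32+: 64-bit ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    return {
        "size_of_code": size_of_code,
        "entry_point_rva": entry_rva,
        "image_base": image_base,
        # Address where execution starts if the image loads at its preferred base.
        "entry_point_va": image_base + entry_rva,
    }
```

Calling this on the bytes of a file before and after packing should show a different `entry_point_rva`, since the packed file starts in the unpacking stub rather than in the original code.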
Data directories are contained within the optional header. Data directories point to tables that contain information about resources, imports, exports, debug information, and thread local storage, to name a few. If you're looking for more in-depth information on the data directories, I highly recommend you take a look at this great MSDN article on the topic. The import information in particular is of special interest to us. Before we jump into it, let's talk about imports for a second.
So to explain imports, let's think of a hypothetical situation. We have five executables that all share some piece of code. Now, it doesn't really make a ton of sense for all of these executables to store this code by themselves. Instead, it makes a lot more sense to break out this code into a separate library and simply have each executable load this library at runtime. This is essentially what the import table is: a list of libraries and their associated functions that an executable wishes to load at runtime. In a packed file, this table of functions and libraries is replaced, and the original is regenerated when the file is executed. This means that if we wish to unpack a binary that has been packed, we'll have to reconstruct this information in order for the unpacked PE file to be valid and to work as expected. After the optional header comes the sections table.
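The import directory discussed above is one entry in the optional header's data directory array. The sketch below lists those entries as (RVA, size) pairs. The offsets and the order of the entries follow the published PE layout; the short names in `DIRECTORY_NAMES` and the function name are labels I picked for readability.

```python
import struct

# Index order is fixed by the PE format; the short names are illustrative labels.
DIRECTORY_NAMES = [
    "export", "import", "resource", "exception", "security", "basereloc",
    "debug", "architecture", "globalptr", "tls", "load_config",
    "bound_import", "iat", "delay_import", "clr_header", "reserved",
]


def read_data_directories(data: bytes) -> dict:
    """Return {name: (rva, size)} for each data directory entry."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    opt = e_lfanew + 24
    (magic,) = struct.unpack_from("<H", data, opt)
    if magic == 0x10B:       # PE32
        count_offset = 92
    elif magic == 0x20B:     # PE32+
        count_offset = 108
    else:
        raise ValueError(f"unexpected optional header magic {magic:#x}")
    (count,) = struct.unpack_from("<I", data, opt + count_offset)
    first_entry = opt + count_offset + 4
    directories = {}
    for index in range(min(count, len(DIRECTORY_NAMES))):
        rva, size = struct.unpack_from("<II", data, first_entry + 8 * index)
        directories[DIRECTORY_NAMES[index]] = (rva, size)
    return directories
```

Comparing the `import` entry of an original file with that of its packed version is a quick way to see that the packer has swapped the import information for its own.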
The sections table outlines all sections that are present within the PE file. This includes the name of the section, the location of the section, the size of the section both in the file and in virtual memory, as well as any flags associated with that section. Sections make up the bulk of the PE file itself, so it is important to have this table of information on hand.
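Here is a minimal sketch of reading that table, assuming the standard 40-byte section header layout. The `Section` type, its field names, and the compact `flags()` string are my own conventions, not part of the format.

```python
import struct
from typing import List, NamedTuple

SECTION_HEADER = "<8sIIII12xI"  # 40 bytes; relocation/line-number fields skipped


class Section(NamedTuple):
    name: str
    virtual_size: int      # size once loaded in memory
    virtual_address: int   # RVA where the section is mapped
    raw_size: int          # size of the data in the file
    raw_offset: int        # file offset of the data
    characteristics: int   # flag bits (readable, writable, executable, ...)

    def flags(self) -> str:
        out = ""
        out += "R" if self.characteristics & 0x40000000 else "-"
        out += "W" if self.characteristics & 0x80000000 else "-"
        out += "X" if self.characteristics & 0x20000000 else "-"
        return out


def read_sections(data: bytes) -> List[Section]:
    """Parse the sections table that follows the optional header."""
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (section_count,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (optional_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    table = e_lfanew + 24 + optional_size
    sections = []
    for index in range(section_count):
        name, vsize, vaddr, rsize, roff, flags = struct.unpack_from(
            SECTION_HEADER, data, table + 40 * index
        )
        # Names are NUL-padded to 8 bytes and may use all 8 with no terminator.
        text = name.rstrip(b"\0").decode("ascii", errors="replace")
        sections.append(Section(text, vsize, vaddr, rsize, roff, flags))
    return sections
```

A large gap between `virtual_size` and `raw_size` is worth noticing when looking at a packed sample, because the section has to grow in memory to hold the decompressed data.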
As mentioned earlier, PE files typically contain, at the very least, a section for code, a section for data, and a section for imports. The import section contains the actual addresses for all functions needed by the PE file. These addresses are populated at runtime, as every Windows system may be different. As such, it's possible that a function may not be located at the same memory location between Windows versions. Populating the import table with these addresses at runtime allows us to use the same PE file on multiple Microsoft Windows machines.
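The runtime step described above can be modelled with a toy simulation. Nothing here touches a real loader: the function name, the dictionary shapes, and every address are invented purely to show why the same import list ends up with different addresses on different systems.

```python
def resolve_imports(import_table: dict, loaded_exports: dict) -> dict:
    """Toy model of the loader filling in import addresses.

    import_table:   {dll name: [function names]} as stored in the file
    loaded_exports: {dll name: {function name: address}} for one system
    """
    resolved = {}
    for dll, functions in import_table.items():
        exports = loaded_exports[dll.lower()]
        for name in functions:
            if name not in exports:
                raise KeyError(f"{dll} does not export {name}")
            resolved[f"{dll}!{name}"] = exports[name]
    return resolved


# The file only stores names; addresses below are hypothetical.
imports = {"kernel32.dll": ["CreateFileW", "ReadFile"]}
system_a = {"kernel32.dll": {"CreateFileW": 0x7C810800, "ReadFile": 0x7C801812}}
system_b = {"kernel32.dll": {"CreateFileW": 0x76F4C3A0, "ReadFile": 0x76F4D1B0}}
```

Resolving `imports` against `system_a` and `system_b` gives the same keys with different addresses, which is also why an unpacked dump needs its import information rebuilt as names rather than as the raw addresses found in memory.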
Digging Into An Example
Now that we've got a decent grasp of the PE file structure, let's use what we've learned to manually unpack an actual file. For this example I'm going to use calc.exe, packed with MPRESS. MPRESS is a popular packer that has been around for a while now. It supports a large number of file types and works on all current versions of Microsoft Windows. So before we pack our file, let's take a quick look at what calc.exe looks like in its original state.
Using one of my favorite tools, CFF Explorer, we can easily view various pieces of information about the PE file structure, including, but not limited to, various headers, section information, import information, and any embedded resource entries. I've specifically shown the Optional Header and the various sections contained in calc.exe in the screenshots below.
Now let's pack the sample with MPRESS and see how the file has changed. As you can see in the following screenshots, MPRESS has modified the sections present in the PE file. The ".text" and ".data" sections have been replaced with ".MPRESS1" and ".MPRESS2". The ".MPRESS1" section contains the compressed data of the original calc.exe file, and the ".MPRESS2" section contains a number of functions used to decompress this data and reconstruct the import table.
You might also notice in the above screenshot that the entry point of code execution (AddressOfEntryPoint) has changed. The first step in unpacking this sample manually is to identify the original entry point (OEP). Let's start unpacking this sample dynamically.
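One quick way to see the effect of the changed entry point is to check which section it falls in. The sketch below does that for two section layouts. All addresses and sizes are made-up values for illustration, not the real numbers from calc.exe, and the function name is my own.

```python
from typing import List, Optional, Tuple

# (name, virtual address, virtual size)
SectionInfo = Tuple[str, int, int]


def section_containing(rva: int, sections: List[SectionInfo]) -> Optional[str]:
    """Return the name of the section whose memory range holds the given RVA."""
    for name, start, size in sections:
        if start <= rva < start + size:
            return name
    return None


# Hypothetical layouts before and after packing.
original_layout = [(".text", 0x1000, 0x52000), (".data", 0x54000, 0x5000)]
packed_layout = [(".MPRESS1", 0x1000, 0x9E000), (".MPRESS2", 0x9F000, 0x1000)]

original_entry = 0x12D6C   # hypothetical entry point inside the code section
packed_entry = 0x9F28E     # hypothetical entry point inside the unpacking stub
```

With these example values, the original entry point resolves to ".text" while the packed one resolves to ".MPRESS2", the section that holds the decompression routines.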
For this example, I'm going to switch between IDA Pro and OllyDbg. You're welcome to use whatever tools you wish, but my personal style just leads me to switch between these tools often, so I'm going to use them both here. I find that IDA does a better job of visualizing what is happening, but OllyDbg tends to have more features and better results in a dynamic analysis environment.
One of the first things we see while debugging the sample is a call to this rather complicated function. In actuality, this function is decompressing the compressed data found in '.MPRESS1'. It uses an interesting technique called 'in-place decompression' to accomplish this. That is to say, MPRESS is able to decompress the data without creating a new section of memory and dumping the decompressed data to it. Instead, it simply overwrites the compressed data with the decompressed data.
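To show the general idea of in-place decompression, here is a toy sketch that uses simple run-length encoding. This is not MPRESS's algorithm, and the function names are made up. The compressed bytes sit at the end of a buffer that is big enough for the decompressed output, and the decoder writes from the front while reading from the back, overwriting input it has already consumed.

```python
def rle_compress(data: bytes) -> bytes:
    """Encode data as (count, value) byte pairs with count in 1..255."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes([run, data[i]])
        i += run
    return bytes(out)


def decompress_in_place(buf: bytearray, compressed_len: int) -> int:
    """Expand the RLE data stored in the last compressed_len bytes of buf.

    Output is written starting at index 0 of the same buffer.
    Returns the number of decompressed bytes.
    """
    read = len(buf) - compressed_len
    write = 0
    while read < len(buf):
        count, value = buf[read], buf[read + 1]
        read += 2
        # The write cursor must never pass input that has not been read yet.
        if write + count > read:
            raise ValueError("output would overwrite unread input; enlarge the buffer")
        buf[write:write + count] = bytes([value]) * count
        write += count
    return write
```

The overlap check is the interesting part: whether a given buffer size is safe depends on the data, so in-place schemes generally need some slack space or a layout chosen at pack time. In this toy the decoder just raises an error if the buffer is too tight.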
Once decompression completes, we then see code execution pass to a series of loops, which reconstruct the import table. This can be seen below where I demonstrate a before and after of the unpacking process:
Eventually, we hit the original entry point of the (at this point) unpacked calc.exe. It is here we want to dump the process's memory to disk. For this task, I prefer a plugin for OllyDbg called OllyDump. Not only will this allow you to dump the process, but it also has the ability to, among other things, rebuild the import table (remember, the packing process destroys the original).
Like many things in this industry, there are many ways to approach this. If you're not a big fan of OllyDbg, a nice alternative to the plugin I just mentioned is a tool called ChimpRec. The tool's main page has a full list of features, but essentially it allows you to dump a currently running process and rebuild its imports. I'm sure there are many others out there that will accomplish the same thing. These two utilities are simply my personal favorites.
And at this point, that's really all there is to it for unpacking a basic packed binary. I haven't touched on a number of more advanced features that are present in other packers, namely the ability to perform anti-reversing techniques. It is my hope to discuss some of these techniques in future blog posts.