• Stars
    star
    216
  • Rank 183,179 (Top 4 %)
  • Language
    Ruby
  • License
    GNU General Publi...
  • Created almost 7 years ago
  • Updated almost 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

An analysis of the Warbird virtual-machine protection for the CI!g_pStore

Introduction

Our team recently had the opportunity to meet and exchange with two of the authors of "Rootkits and Bootkits, Reversing Modern Malware and Next Generation Threats" [1], namely Alex Matrosov and Eugene Rodionov, who precisely are in the process of releasing the final edition of their textbook (alongside with Sergey Bratus). Amongst all the knowledge they have packed into their chapters, an old acquaintance reminds me of some researches I did a few years ago on Windows 7 and Windows 8.1: ci.dll.

ci.dll is a cornerstone of Windows security and is involved very early in its boot process. This module, and more specifically driver signature enforcement code, got all the attention for obvious reasons (see [2]). Still, an unsung great piece of software engineering stayed hidden for too long within its meanders: an encrypted memory store (CI!g_pStore), wrapped by heavily obfuscated code using virtual-machine software protection and related to a larger code protection framework: Microsoft Warbird.

The Warbird framework has gone relatively unnoticed for a long time, until Alex Ionescu released "The "Bird" That Killed Arbitrary Code Guard" [3], presented on September 27th 2017 at ekoparty conference in Buenos Aires. At this occasion, it is deeply enlightening to discover how the Warbird framework has evolved into a scary beast protecting from arbitrary code execution.

Many years have passed since our research and now seems a good time to propose the following contributions:

  • An analysis of the virtual-machine protection implemented in ci.dll on Windows 7/8.1
  • A Windbg plugin performing on-the-fly decryption/encryption of the CI!g_pStore

When the kernel whistles

As a reminder, all the technical elements presented below have been studied on Windows 7/8.1. From ci.dll, one can easily locate the specific code by looking for peauthvbn prefixed function names:

  • peauthvbn_InitStore(x,x)
  • peauthvbn_GetDebugCredentialsData(x,x)
  • peauthvbn_GetBootDriversVerificationData(x,x)
  • peauthvbn_StoreParameter(x,x,x,x)
  • peauthvbn_GetParameters(x,x,x)
  • peauthvbn_SetDebugCredentialsData(x,x,x)

These functions are all wrappers around the Warbird virtual machine. They are used to initialize, query and update the CI!g_Store. This store is allocated in PEAuthInitStore and then initialized in peauthvbn_InitStore. Please note that a random value issued from KETICKCOUNT is passed to the peauthvbn_InitStore function.

int __stdcall PEAuthInitStore()
{
  int seed;
  int res;
  PVOID allocated_store;
  MACRO_STATUS errno;

  seed = MEMORY[KETICKCOUNT];
  if ( g_Store )
    return STATUS_INVALID_PARAMETER;
  res = ExInitializeResourceLite(&g_StoreLock);
  if ( res >= 0 )
  {
    g_bStoreLockInitialized = 1;
    KeEnterCriticalRegion();
    ExAcquireResourceExclusiveLite(&g_StoreLock, 1u);
    allocated_store = ExAllocatePoolWithTag(PagedPool, sizeof(store), 'PE');
    g_Store = allocated_store;
    if ( allocated_store )
      errno = peauthvbn_InitStore(allocated_store, seed);
    else
      errno = STATUS_NO_MEMORY;
    ExReleaseResourceLite(&g_StoreLock);
    KeLeaveCriticalRegion();
    res = errno;
  }
  return res;
}
int __cdecl peauthvbn_InitStore(PVOID store_ptr, int seed)
{
  return warbird_vm_internal(
           0xFAF1D7C599A70ADDi64,
           store_ptr,
           seed,
           0xFF9CAA0499D6ACBEi64,
           0xC31C4F59A8645793i64,
           0i64,
           0);
}

peauthvbn_InitStore invokes the virtual machine. The 64-bits values are used to parametrize the execution of the virtual machine, like a keyed execution. This key is stored in the rax native register and updated by each of the 0x800 different handlers. When the updated value of the key is zero, the execution of the virtual machine stops.

The execution loop of the virtual machine look like this:

for ( ctx.field_90 = value; keyed_exec; keyed_exec = WARBIRD_HANDLERS[keyed_exec & 0x7FF](&ctx, keyed_exec) )

Warbird virtual machine

Not much needs to be said about the analysis of the virtual machine. By a happy coincidence, the fifth chapter of the Practical Reverse Engineering textbook contains a methodology that works like a charm to analyze the Warbird virtual machine.

The core idea is simple: we unroll the execution of the virtual machine, using static analysis and symbolic execution. In practice, we first lift the function transfer of the current handler from its assembly code, then we inject some symbolism into it (we know the virtual registers used by the virtual machine are indexed based on rcx) and finally we perform a symbolic execution of the result to step to the next handler and repeat these steps until the execution ends.

The provided scripts are based on the Metasm framework [4], developed by Yoann Guillot. They allow to statically unroll the execution of the virtual machine and log each step.

  • go.rb the launcher, it only passes a context to the symbolic execution engine.
  • msvm64.rb: the core component, implements the handlers analysis and symbolic execution engine.
  • aes.rb: (spoiler alert) re-implementation of the peauthvbn_InitStore sub-program for debug purpose.

Let's illustrate the analysis methodology with the first handler executed when peauthvbn_InitStore is called. To begin with, at assembly level:

PAGE:000000008005DAB8 sub_8005DAB8    proc near
PAGE:000000008005DAB8 arg_0           = qword ptr  8
PAGE:000000008005DAB8
PAGE:000000008005DAB8    mov     [rsp+arg_0], rbx
PAGE:000000008005DABD    mov     rbx, rcx
PAGE:000000008005DAC0    mov     r11, rdx
PAGE:000000008005DAC3    shr     r11, 20h
PAGE:000000008005DAC7    mov     r10d, r11d
PAGE:000000008005DACA    mov     r9d, r11d
PAGE:000000008005DACD    sub     r10d, edx
PAGE:000000008005DAD0    shl     edx, 4
PAGE:000000008005DAD3    shr     r9d, 4
PAGE:000000008005DAD7    add     r9d, edx
PAGE:000000008005DADA    movzx   eax, r10b
PAGE:000000008005DADE    xor     rax, 0A8h
PAGE:000000008005DAE4    lea     edx, [r9+1788E3FCh]
PAGE:000000008005DAEB    mov     r8, [rax+rcx]
PAGE:000000008005DAEF    mov     rax, [rcx+78h]
PAGE:000000008005DAF3    shl     rdx, 20h
PAGE:000000008005DAF7    sub     rax, 28h
PAGE:000000008005DAFB    mov     [rcx+1B8h], rax
PAGE:000000008005DB02    lea     eax, [r11+6Dh]
PAGE:000000008005DB06    movzx   ecx, al
PAGE:000000008005DB09    movzx   eax, r9b
PAGE:000000008005DB0D    xor     ecx, eax
PAGE:000000008005DB0F    movzx   eax, r10b
PAGE:000000008005DB13    xor     ecx, eax
PAGE:000000008005DB15    lea     eax, [r11-5E99BF03h]
PAGE:000000008005DB1C    sub     ecx, 96h
PAGE:000000008005DB22    mov     [rcx+rbx], r8
PAGE:000000008005DB26    mov     rbx, [rsp+arg_0]
PAGE:000000008005DB2B    mov     ecx, r10d
PAGE:000000008005DB2E    xor     rax, rcx
PAGE:000000008005DB31    mov     ecx, 0D18DF8C6h
PAGE:000000008005DB36    xor     rax, rcx
PAGE:000000008005DB39    or      rax, rdx
PAGE:000000008005DB3C    retn
PAGE:000000008005DB3C sub_8005DAB8    endp

The code of the handlers is slightly obfuscated, some transformations are applied on the data flow, however the control flow is clean.

Then we use Metasm to lift the function transfer of this piece of code (thanks to the code_binding method from the disassembler object) and then replace state pointers with symbolic variables:

---------------------------------------
[+] (0) step with vmkey 0xfaf1d7c599a70add
[+] analyzing handler 0x2dd at 0x8005dab8
[+] new handler
[+] eval_binding 8005dab8, finalize:true, complex:false
[+] raw binding

qword ptr [rsp+8] => rbx
qword ptr [rcx+1b8h] => qword ptr [rcx+78h]-28h
qword ptr [(((((((rdx>>20h)&0ffffffffh)+6dh)^((((rdx>>24h)&0fffffffh)+((rdx<<4)&0fffffff0h))^(((rdx>>20h)&0ffffffffh)-(rdx&0ffffffffh))))&0ffh)-96h)&0ffffffffh)+rcx] => qword ptr [(((((rdx>>20h)&0ffffffffh)-(rdx&0ffffffffh))^0a8h)&0ffh)+rcx]
rax => (((((rdx>>20h)&0ffffffffh)+0ffffffffa16640fdh)^((((rdx>>20h)&0ffffffffh)-(rdx&0ffffffffh))^0d18df8c6h))&0ffffffffh)|(((((((rdx>>24h)&0fffffffh)+((rdx<<4)&0fffffff0h))&0ffffffffh)+1788e3fch)<<20h)&0ffffffff00000000h)

[+] raw binding - rdx injected

qword ptr [rcx+1b8h] => qword ptr [rcx+78h]-28h
qword ptr [rcx] => qword ptr [rcx+40h]
rax => 0c1a8af482c9f2cech

[+] symbolic handler binding

ctx_37 => ctx_f-28h
ctx_0 => ctx_8
rax => 0c1a8af482c9f2cech

This handler is fairly simple but others are more complex, with many registers and/or memory assignments. In this example, two virtual registers are assigned (ctx_0 and ctx37) and of course the execution key of the virtual machine is updated (rax).

As we said previously there are 0x800 of them, which most certainly indicates they are generated programmatically. Most of them have a complex semantics, by opposition to virtual machine designed to simply emulate a virtual processor where each handler implements a simple instruction. In the Warbird virtual machine, handlers are like chunks of the virtualized algorithm. Microsoft developing the Warbird framework means they have full control over the source code and the compilation tool-chain, thus we can expect this transformation phase to be applied at compilation time.

Flipping tables

We are now able to progress in the virtual machine code, trace a bunch of handlers. Then, suddenly, everything goes mad. SNAFU: your statically computed virtual context is trash compared to values you can observe using dynamic analysis (debugging ci.dll). This is due to a special attention left for us by the developers of this protected code. Indeed, a few of the handlers (less than 0x20/0x800) call a function that scrambles the context by swapping some of its registers: ctx_scrambler.

The code_binding method used to to lift the function transfer of the handlers doesn't support sub-function calls. That why our analysis goes wrong. We have to split the analysis and implement some extra magic.

Based on the value of the execution key, and some static data (scrambling_info), ctx_scrambler applies permutations on the virtual registers of the virtual machine.

void __cdecl ctx_scrambler(VM_CTX *ctx, unsigned __int8 nb_round, __int64 scrambling_info)
{
  _DWORD *scrambling_info_;
  unsigned int key;
  __int64 round;
  int scrambling_word;

  scrambling_info_ = (_DWORD *)scrambling_info;
  key = nb_round;
  if ( nb_round )
  {
    round = nb_round;
    do
    {
      scrambling_word = key ^ *scrambling_info_;
      ++scrambling_info_;
      key = scrambling_word + (8 * key ^ (key >> 3));
      *(&ctx->field_0 + BYTE1(scrambling_word)) = *(&ctx->field_0 + (unsigned __int8)scrambling_word);
      *(&ctx->field_0 + HIBYTE(scrambling_word)) = *(&ctx->field_0 + BYTE2(scrambling_word));
      --round;
    }
    while ( round );
  }
}

Nothing we can't catch with our scripts, below is an example of scrambling emulation, as it is emulated by the vm_ctx_scramble function:

[+] key 0x37ed4b04
[+] scramble word 0x2f86315
[+] permutations word 0x35152811, ["11", "28", "15", "35"]
[+]  ctx_28 = ctx_11
[+]  ctx_35 = ctx_15
[+] key 0xeead1951
[+] scramble word 0xce8e0478
[+] permutations word 0x20231d29, ["29", "1d", "23", "20"]
[+]  ctx_1d = ctx_29
[+]  ctx_20 = ctx_23
[+] key 0x88e086cb
[+] scramble word 0x8fd6aff9
[+] permutations word 0x7362932, ["32", "29", "36", "7"]
[+]  ctx_29 = ctx_32
[+]  ctx_7 = ctx_36
[+] key 0x5d4e4fb3
[+] scramble word 0x4b457fbb
[+] permutations word 0x160b3008, ["8", "30", "b", "16"]
[+]  ctx_30 = ctx_8
[+]  ctx_16 = ctx_b

Lifting the veil

Our analysis is now rock solid. We've been able to trace and log the execution of all handlers involved in the peauthvbn_InitStore sub-program, to see how the context of the virtual machine evolve, etc. The objective is now to get a higher level understanding of what's going on.

There are a few techniques that can usually be used to break down complex problems into smaller ones: trying to find loops, cycles, patterns, etc. This can be applied successfully here as it looks like the complete trace can be sub-divided into three similar looking chunks of 10 sub-chunks each.

Then, one can note that many handlers share a common pattern: they access a table located in the data of the binary, LITTLE_BIRDS_TABLE, to pick a dword:

---------------------------------------
[+] (7e) step with vmkey 0xa201ea87cb35977c
[+] analyzing handler 0x77c at 0x8003f644
[+] new handler
[+] eval_binding 8003f644, finalize:true, complex:false

[+] enable symbolic semantic
[+] step semantic

ctx_2d => ctx_27
ctx_27 => (dword ptr [LITTLE_BIRDS_TABLE+4*((ctx_3+5eah)&0ffffffffh)]^0b0c9711bh)&0ffffffffh
rax => 0cc006005e3c49cd2h

[+] enable symbolic execution
[+] final binding

ctx_2d => 9fff38h
ctx_27 => 15152a3fh
rax => 0cc006005e3c49cd2h

When finding seemingly looking random values it is often a good idea to try to match them with popular algorithms. Without spoiling all the fun, 0x15152a3f is used in T-Table based implementation of the AES algorithm. With that clue in mind we are close to solve the Warbird riddle.

We said that it is possible to split the complete trace into three similar chunks. Each of these chunks independently updates 0x10 bytes of data (we got this information from the log of the virtual machine). Besides, the 10 (0xA) sub-chunks are actually the rounds of the AES algorithm , this leads us to an AES-128 algorithm.

Then if we perform some additional dynamic tests, it appears that the chunks of 0x10 bytes are encrypted independently. At the end, an educated guess would be an AES-128-ECB algorithm (Electronic Codebook (ECB) encryption mode).

Indeed, one can use its favorite cryptographic library to validate that the CI!g_pStore store (0x30 bytes) is actually encrypted with the AES-128 algorithm used in ECB mode.

Remember the random value passed to peauthvbn_InitStore? It is stored in the CI!g_pStore, our guess is that its purpose is to diversify the ciphertext (due to the use of the ECB mode).

We have the algorithm, now how do we get its key? The AES implementation is actually a white-box implementation, meaning the key is hidden in the implementation itself. For this iteration of Warbird, it is actually a very basic white-boxing: the AES key stream is precomputed and hard-coded within the handlers. One can extract the key from the key stream of the first AES round. See "Practical cracking of white-box implementations" from SysK in Phrack issue 0x44 [5] for a good introduction.

Windbg plugin

We have been able to recover the encryption algorithm and its key, at this point we have all the details we need to inspect the CI!g_pStore on our own. Let's pack all this in a Windbg script and start to dynamically observe the encrypted store of a debugged machine.

Please use the provided source code to build the store.dll. It should then be placed inside the winext directory of you Windbg package.

Then from Windbg, you'll be able to type:

************* Symbol Path validation summary **************
Response                         Time (ms)     Location
Deferred                                       srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Symbol search path is: srv*c:\symbols*http://msdl.microsoft.com/download/symbols
Executable search path is:
Windows 8.1 Kernel Version 9600 MP (1 procs) Free x64
Built by: 9600.17936.amd64fre.winblue_ltsb.150715-0840
Machine Name:
Kernel base = 0xfffff800`ce871000 PsLoadedModuleList = 0xfffff800`ceb467b0
System Uptime: 0 days 0:00:00.964

nt!DbgBreakPointWithStatus:
fffff800`ce9c7590 cc              int     3

0: kd> .load store
[store] DebugExtensionInitialize, ExtensionApis loaded

0: kd> !store_dump
[store] CI!g_pStore pointer address 0xfffff8018c158058
[store] CI!g_pStore  0xffffc00102c09640
[store] local Store  0x00007ff981115660
[store] DisplayStore :
  > 00007ff981115660 - 0xf70853f7 0x97bbb404 0xd184fa8d 0xabf7dc7b
  > 00007ff981115670 - 0x2fd8b115 0xb76d6193 0x97a903ee 0xa6f229c0
  > 00007ff981115680 - 0xb95087b5 0xa50f6868 0xe4d47778 0xfbb05b87
[store] DecryptStore
[store] DisplayStore :
  > 00007ff981115660 - 0x000003aa 0x00000000 0x00000001 0x00000000
  > 00007ff981115670 - 0x000003a9 0x00000000 0x00000003 0x00000000
  > 00007ff981115680 - 0x0000005b 0x00000000 0x00000000 0x00000000

Now let's say we want to modify some of the values stored in the store:

0: kd> !store_setdw
[store] !store_setdw <Index> <Value>

0: kd> !store_setdw 1 8
[store] CI!g_pStore pointer address 0xfffff8018c158058
[store] CI!g_pStore  0xffffc00102c09640
[store] local Store  0x00007ff981115660
[store] DecryptStore
[store] current store:
[store] DisplayStore :
  > 00007ff981115660 - 0x000003aa 0x00000000 0x00000001 0x00000000
  > 00007ff981115670 - 0x000003a9 0x00000000 0x00000003 0x00000000
  > 00007ff981115680 - 0x0000005b 0x00000000 0x00000000 0x00000000
[store] new store:
[store] DisplayStore :
  > 00007ff981115660 - 0x000003aa 0x00000008 0x00000001 0x00000000
  > 00007ff981115670 - 0x000003a9 0x00000000 0x00000003 0x00000000
  > 00007ff981115680 - 0x0000005b 0x00000000 0x00000000 0x00000000
[store] EncryptStrore
[store] CI!g_pStore pointer address 0xfffff8018c158058
[store] CI!g_pStore  0xffffc00102c09640
[store] local Store  0x00007ff981115660

Conclusion

The tool-set proposed here is provided "as is", to possibly serve as a background for future researches. The Warbird virtual machine analysis script was developed a long time ago; it could be really interesting to see how more recent frameworks like Miasm [6] or Triton [7] could help, especially as they have both implemented a Dynamic Symbolic Execution (DSE) engine.

Few years ago, the simple idea of having an obfuscated, virtual machine based, white-box AES in the Windows kernel was somehow unexpected to me. Analyzing this beautifully crafted piece of software engineering has been deeply inspiring. Besides it is just the tip of the iceberg; for example, more could be said about its links with protected processes.

Since then, Alex Ionescu has demonstrated that Microsoft engineers have pushed the Warbird framework to a totally new level in the most recent version of Windows; broader in terms of scope, features and more complex than ever. Kudos to them for that; for the rest of us it means that new challenges arise.

To conclude, thank you Alex & Eugene for the time we shared, we're looking forward to the final edition!

A big thank you to the Airbus Digital Security team for their insightful reviews and comments.

License

The msvm tool and Store Windbg plugin are released under the [GPLv2].

[1]https://www.nostarch.com/rootkits
[2]http://j00ru.vexillium.org/?p=377
[3]https://www.ekoparty.org/charla.php?id=722
[4]https://github.com/jjyg/metasm
[5]http://phrack.org/issues/68/8.html#article
[6]https://github.com/cea-sec/miasm
[7]https://github.com/JonathanSalwan/Triton
[GPLv2]https://github.com/airbus-seclab/warbirdvm/blob/master/COPYING

More Repositories

1

bincat

Binary code static analyser, with IDA integration. Performs value and taint analysis, type reconstruction, use-after-free and double-free detection
OCaml
1,662
star
2

qemu_blog

A series of posts about QEMU internals:
1,345
star
3

cpu_rec

Recognize cpu instructions in an arbitrary binary file
Python
640
star
4

ilo4_toolbox

Toolbox for HPE iLO4 & iLO5 analysis
Python
412
star
5

diffware

An extensively configurable tool providing a summary of the changes between two files or directories, ignoring all the fluff you don't care about.
Python
196
star
6

gustave

GUSTAVE is a fuzzing platform for embedded OS kernels. It is based on QEMU and AFL (and all of its forkserver siblings). It allows to fuzz OS kernels like simple applications.
Python
194
star
7

powersap

Powershell SAP assessment tool
PowerShell
187
star
8

crashos

A tool dedicated to the research of vulnerabilities in hypervisors by creating unusual system configurations.
C
182
star
9

c-compiler-security

Security-related flags and options for C compilers
Python
179
star
10

ramooflax

a bare metal (type 1) VMM (hypervisor) with a python remote control API
C
178
star
11

bta

Open source Active Directory security audit framework.
Python
131
star
12

android_emuroot

Android_Emuroot is a Python script that allows granting root privileges on the fly to shells running on Android virtual machines that use google-provided emulator images called Google API Playstore, to help reverse engineers to go deeper into their investigations.
Python
121
star
13

AutoResolv

Python
71
star
14

elfesteem

ELF/PE/Mach-O parsing library
Python
50
star
15

GEA1_break

Implementation of the key recovery attack against GEA-1 keys (Eurocrypt 2021)
C
47
star
16

airbus-seclab.github.io

Conferences, tools, papers, etc.
43
star
17

AFLplusplus-blogpost

Blogpost about optimizing binary-only fuzzing with AFL++
Shell
34
star
18

nbutools

Tools for offensive security of NetBackup infrastructures
Python
30
star
19

rebus

REbus facilitates the coupling of existing tools that perform specific tasks, where one's output will be used as the input of others.
Python
25
star
20

usbq_core

USB man in the middle linux kernel driver
C
19
star
21

AppVsWild

application process protection hypervisor virtualization encryption
9
star
22

gunpack

Generic unpacker (dynamic)
C
8
star
23

usbq_userland

User land program to be used with usbq_core
Python
8
star
24

ramooflax_scripts

ramooflax python scripts
Python
6
star
25

cpu_doc

Curated set of documents about CPU
3
star
26

c2newspeak

C
3
star
27

rebus_demo

REbus demo agents
Python
2
star
28

security-advisories

2
star
29

pwnvasive

semi-automatic discovery and lateralization
Python
1
star
30

pok

forked from pok-kernel/pok
C
1
star
31

afl

Airbus seclab fork of AFL
C
1
star