• Stars
    star
    735
  • Rank 61,652 (Top 2 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 6 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

flare-emu

flare-emu marries a supported binary analysis framework, such as IDA Pro or Radare2, with Unicornโ€™s emulation framework to provide the user with an easy to use and flexible interface for scripting emulation tasks. It is designed to handle all the housekeeping of setting up a flexible and robust emulator for its supported architectures so that you can focus on solving your code analysis problems. Currently, flare-emu supports the x86, x86_64, ARM, and ARM64 architectures.

It currently provides five different interfaces to serve your emulation needs, along with a slew of related helper and utility functions.

  1. emulateRange โ€“ This API is used to emulate a range of instructions, or a function, within a user-specified context. It provides options for user-defined hooks for both individual instructions and for when โ€œcallโ€ instructions are encountered. The user can decide whether the emulator will skip over, or call into function calls. This interface provides an easy way for the user to specify values for given registers and stack arguments. If a bytestring is specified, it is written to the emulatorโ€™s memory and the pointer is written to the register or stack variable. After emulation, the user can make use of flare-emuโ€™s utility functions to read data from the emulated memory or registers, or use the Unicorn emulation object that is returned for direct probing. A small wrapper function for emulateRange, named emulateSelection, can be used to emulate the range of instructions currently highlighted in IDA Pro.

  2. iterate - This API is used to force emulation down specific branches within a function in order to reach a given target. The user can specify a list of target addresses, or the address of a function from which a list of cross-references to the function is used as the targets, along with a callback for when a target is reached. The targets will be reached, regardless of conditions during emulation that may have caused different branches to be taken. Like the emulateRange API, options for user-defined hooks for both individual instructions and for when โ€œcallโ€ instructions are encountered are provided. An example use of the iterate API is to achieve something similar to what our argtracker tool does.

  3. iterateAllPaths - This API is much like iterate, except that instead of providing a target address or addresses, you provide a target function that it will attempt to find all paths through and emulate. This is useful when you are performing code analysis that wants to reach every basic block of a function.

  4. emulateBytes โ€“ This API provides a way to simply emulate a blob of extraneous shellcode. The provided bytes are not added to the IDB and are simply emulated as is. This can be useful for preparing the emulation environment. For example, flare-emu itself uses this API to manipulate a Model Specific Register (MSR) for the ARM64 CPU that is not exposed by Unicorn in order to enable Vector Floating Point (VFP) instructions and register access. The Unicorn emulation object is returned for further probing by the user.

  5. emulateFrom - This API is useful in cases where function boundaries are not clearly defined as is often the case with obfuscated binaries or shellcode. You provide a starting address, and it will emulate until there is nothing left to emulate or you stop emulation in one of your hooks. With IDA Pro, this can be called with the strict parameter set to False to enable dynamic code discovery; flare-emu will have IDA Pro make instructions as they are encountered during emulation.

Installation

To install flare-emu for IDA Pro, simply drop flare_emu.py, flare_emu_ida.py, and flare_emu_hooks.py into your IDA Pro's python directory and import it as a module in your IDAPython scripts.

To install flare-emu for Radare2, simply ensure that flare_emu.py, flare_emu_radare.py, and flare_emu_hooks.py are in Python's search path for importing modules. When using Radare2 as the binary analysis component for flare-emu, r2pipe is required.

In any case, flare-emu depends on Unicorn and its Python bindings.

IMPORTANT NOTE
flare-emu was written using the new IDA Pro 7x API, it is not backwards compatible with previous versions of IDA Pro.

Usage

While flare-emu can be used to solve many different code analysis problems, one of its more common uses is to aid in decrypting strings in malware binaries. FLOSS is a great tool than can often do this automatically for you by attempting to identify the string decrypting function(s) and using emulation to decrypt the strings passed in at every cross-reference to it. However, it is not possible for FLOSS to always be able to identify these functions and emulate them properly using its generic approaches. Sometimes you have to do a little more work, and this is where flare-emu can save you a lot of time once you are comfortable with it. Let's walk through a common scenario a malware analyst encounters when dealing with encrypted strings.

Easy String Decryption Scenario with IDA Pro

You've identified the function to decrypt all the strings in an x86_64 binary. This function is called all over the place and decrypts many different strings. In IDA Pro, you name this function decryptString. Here is your flare-emu script to decrypt all these strings and place comments with the decrypted strings at each function call as well as logging each decrypted string and the address it is decrypted at.

from __future__ import print_function
import flare_emu

def decrypt(argv):
    myEH = flare_emu.EmuHelper()
    myEH.emulateRange(myEH.analysisHelper.getNameAddr("decryptString"), registers = {"arg1":argv[0], "arg2":argv[1], 
                           "arg3":argv[2], "arg4":argv[3]})
    return myEH.getEmuString(argv[0])
    
def iterateCallback(eh, address, argv, userData):
    s = decrypt(argv)
    print("%s: %s" % (eh.hexString(address), s))
    eh.analysisHelper.setComment(address, s, False)
    
if __name__ == '__main__':   
    eh = flare_emu.EmuHelper()
    eh.iterate(eh.analysisHelper.getNameAddr("decryptString"), iterateCallback)

In __main__, we begin by creating an instance of the EmuHelper class from flare-emu. This is the class we use to do everything with flare-emu. Next, we use the iterate API, giving it the address of our decryptString function and the name of our callback function that EmuHelper will call for each cross-reference emulated up to.

The iterateCallback function receives the EmuHelper instance, named eh here, along with the address of the cross-reference, the arguments passed to this particular call, and a special dictionary named userData here. userData is not used in this simple example, but think of it as a persistent context to your emulator where you can store your own custom data. Be careful though, because flare-emu itself also uses this dictionary to store critical information it needs to perform its tasks. One such piece of data is the EmuHelper instance itself, stored in the "EmuHelper" key. If you are interested, search the source code to learn more about this dictionary. This callback function simply calls the decrypt function, prints the decrypted string and creates a comment for it at the address of that call to decryptString.

decrypt creates a second instance of EmuHelper that is used to emulate the decryptString function itself, which will decrypt the string for us. The prototype of this decryptString function is as follows: char * decryptString(char *text, int textLength, char *key, int keyLength). It simply decrypts the string in place. Our decrypt function passes in the arguments as received by the iterateCallback function to our call to EmuHelper's emulateRange API. Since this is an x86_64 binary, the calling convention uses registers to pass arguments and not the stack. flare-emu automatically determines which registers represent which arguments based on the architecture and file format of the binary as determined by IDA Pro, allowing you to write at least somewhat architecture agnostic code. If this were 32-bit x86, you would use the stack argument to pass the arguments instead, like so: myEH.emulateRange(myEH.analysisHelper.getNameAddr("decryptString"), stack = [0, argv[0], argv[1], argv[2], argv[3]]). The first stack value is the return address in x86, so we just use 0 as a placeholder value here. Once emulation is complete, we call the getEmuString API to retrieve the null-terminated string stored in the memory location pointed to by the first argument passed to the function.

Easy String Decryption Scenario with Radare2

Using the same example above, not much changes when working with Radare2 rather than IDA Pro. One difference is that flare-emu is currently designed to be run as a command-line script or within a Python shell when working with Radare2. The Python shell is great for ad-hoc problem solving while the command-line script is great for batch processing. The Radare2 version of the script above looks like this:

from __future__ import print_function
import flare_emu

def decrypt(argv, eh):
    myEH = flare_emu.EmuHelper(samplePath=sys.argv[1], emuHelper=eh)
    myEH.emulateRange(myEH.analysisHelper.getNameAddr("decryptString"), registers = {"arg1":argv[0], "arg2":argv[1], 
                           "arg3":argv[2], "arg4":argv[3]})
    return myEH.getEmuString(argv[0])
    
def iterateCallback(eh, address, argv, userData):
    s = decrypt(argv, eh)
    print("%s: %s" % (eh.hexString(address), s))
    eh.analysisHelper.setComment(address, s, False)
    
if __name__ == '__main__':   
    eh = flare_emu.EmuHelper(samplePath=sys.argv[1])
    eh.analysisHelper.setName(<some address>, "decryptString")
    eh.iterate(eh.analysisHelper.getNameAddr("decryptString"), iterateCallback)

There are two differences with this script. First, the EmuHelper constructor takes a parameter here: samplePath=sys.argv[1]. When the samplePath parameter is provided, flare-emu will use Radare2 with r2pipe as its binary analysis engine. You can also see that a second parameter is passed to the second EmuHelper instance created in the decrypt function. The emuHelper parameter takes an existing EmuHelper object and clones its memory when creating the new object. Also, if you are using Radare2, the new instance re-uses the existing Radare2 session instead of creating a new one which would add more overhead. Second, flare-emu creates a new instance of Radare2 using r2pipe.open, so it will likely not have the name decryptString for the function we are interested in. You can either set the name yourself using EmuHelper's analysisHelper object like so: eh.analysisHelper.setName(<some address>, "decryptString"), or you can directly input the address for the calls to iterate and emulateRange.

Emulation Functions

emulateRange(startAddr, endAddr=None, registers=None, stack=None, instructionHook=None, callHook=None, memAccessHook=None, hookData=None, skipCalls=True, hookApis=True, strict=True, count=0) - Emulates the range of instructions starting at startAddress and ending at endAddress, not including the instruction at endAddress. If endAddress is None, emulation stops when a "return" type instruction is encountered within the same function that emulation began.

  • registers is a dictionary with keys being register names and values being register values. Some special register names are created by flare-emu and can be used here, such as arg1, arg2, etc., ret, and pc.

  • stack is an array of values to be pushed on the stack in reverse order, much like arguments to a function in x86 are. In x86, remember to account for the first value in this array being used as the return address for a function call and not the first argument of the function. flare-emu will initialize the emulated thread's context and memory according to the values specified in the registers and stack arguments. If a string is specified for any of these values, it will be written to a location in memory and a pointer to that memory will be written to the specified register or stack location instead.

  • instructionHook can be a function you define to be called before each instruction is emulated. It has the following prototype: instructionHook(unicornObject, address, instructionSize, userData).

  • callHook can be a function you define to be called whenever a "call" type instruction is encountered during emulation. It has the following prototype: callHook(address, arguments, functionName, userData).

  • hookData is a dictionary containing user-defined data to be made available to your hook functions. It is a means to persist data throughout the emulation. flare-emu also uses this dictionary for its own purposes, so care must be taken not to define a key already defined. This variable is often named userData in user-defined hook functions due to its naming in Unicorn.

  • skipCalls will cause the emulator to skip over "call" type instructions and adjust the stack accordingly, defaults to True.

  • hookApis causes flare-emu to perform a naive implementation of some of the more common runtime and OS library functions it encounters during emulation. This frees you from having to be concerned about calls to functions such as memcpy, strcat, malloc, etc., and defaults to True.

  • memAccessHook can be a function you define to be called whenever memory is accessed for reading or writing. It has the following prototype: memAccessHook(unicornObject, accessType, memAccessAddress, memAccessSize, memValue, userData).

  • strict, when set to True (default), checks branch destinations to ensure the disassembler expects instructions. It otherwise skips the branch instruction. If set to False when using IDA Pro, flare-emu will make instructions in IDA Pro as it emulates them (DISABLE WITH CAUTION).

  • count is the maximum number of instructions to emulate, defaults to 0 which means no limit.

iterate(target, targetCallback, preEmuCallback=None, callHook=None, instructionHook=None, hookData=None, resetEmuMem=False, hookApis=True, memAccessHook=None) - For each target specified by target, a separate emulation is performed from the beginning of the containing function up to the target address. Emulation will be forced down the branches necessary to reach each target. target can be the address of a function, in which case the target list is populated with all the cross-references to the specified function. Or, target can be an explicit list of targets.

  • targetCallback is a function you create that will be called by flare-emu for each target that is reached during emulation. It has the following prototype: targetHook(emuHelper, address, arguments, userData).

  • preEmuCallback is a function you create that will be called before emulation for each target begins. You can implement some setup code here if needed.

  • resetEmuMem will cause flare-emu to reset the emulation memory before emulation of each target begins, defaults to False.

iterateAllPaths(target, targetCallback, preEmuCallback=None, callHook=None, instructionHook=None, hookData=None, resetEmuMem=False, hookApis=True, memAccessHook=None, maxPaths=MAXCODEPATHS, maxNodes=MAXNODESEARCH) - For the function containing the address target, a separate emulation is performed for each discovered path through it, up to maxPaths.

  • maxPaths - the max number of paths through the function that will be searched for and emulated. Some of the more complex functions can cause the graph search function to take a very long time or never finish; tweak this parameter to meet your needs in a reasonable amount of time.

  • maxNodes - the max number of basic blocks that will be searched when finding paths through the target function. This is a safety measure to prevent unreasonable search times and hangs and likely does not need to be changed.

emulateBytes(bytes, registers=None, stack=None, baseAddress=0x400000, instructionHook=None, hookData=None) - Writes the code contained in bytes to emulation memory at baseAddress if possible and emulates the instructions from the beginning to the end of bytes.

emulateFrom(startAddr, registers=None, stack=None, instructionHook=None, callHook=None, memAccessHook=None, hookData=None, skipCalls=True, hookApis=True, strict=True, count=0) - This API is useful in cases where function boundaries are not clearly defined as is often the case with obfuscated binaries or shellcode. You provide a starting address as startAddr, and it will emulate until there is nothing left to emulate or you stop emulation in one of your hooks. This can be called with the strict parameter set to False to enable dynamic code discovery; flare-emu will have IDA Pro make instructions as they are encountered during emulation.

Utility Functions

The following is an incomplete list of some of the useful utility functions provided by the EmuHelper class.

  • hexString(value) - Returns a hexadecimal formatted string for the value. Useful for logging and print statements.

  • skipInstruction(userData, useAnalysisHelper=False) - Call this from an emulation hook to skip the current instruction, moving the program counter to the next instruction. useAnalysisHelper option was added to handle cases where the binary analysis framework folds multiple instructions into one pseudo instruction and you would like to skip all of them. This function cannot be called multiple times from a single instruction hook to skip multiple instructions. To skip multiple instructions, it is recommended not to write to the program counter directly if you are emulating ARM code as this might cause problems with thumb mode. Instead, try EmuHelper's changeProgramCounter API (described below).

  • changeProgramCounter(userData, newAddress) - Call this from an emulation hook to change the value of the program counter register. This API takes care of thumb mode tracking for the ARM architecture.

  • getRegVal(registerName) - Retrieves the value of the specified register, being sensitive to sub-register addressing. For example, "ax" will return the lower 16 bits of the EAX/RAX register in x86.

  • stopEmulation(userData) - Call this from an emulation hook to stop emulation. Use this instead of calling the emu_stop Unicorn API so that the EmuHelper object can handle bookkeeping related to the iterate feature.

  • getEmuString(address) - Returns the string of characters located at an address in the emulated memory, up to a null terminator. Characters are not necessarily printable.

  • getEmuWideString(address) - Returns the string of "wide characters" located at an address in the emulated memory, up to a null terminator. "Wide characters" is meant loosely here to refer to any series of bytes containing a null byte every other byte, as would be the case for an ASCII string encoded in UTF-16 LE. Characters are not necessarily printable.

  • getEmuBytes(address, length) - Returns a string of bytes located at an address in the emulated memory.

  • getEmuPtr(address) - Returns the pointer value located at the given address.

  • writeEmuPtr(address, value) - Writes the pointer value at the given address in the emulated memory.

  • loadBytes(bytes, address=None) - Allocates memory in the emulator and writes the bytes to it.

  • isValidEmuPtr(address) - Returns True if the provided address points to valid emulated memory.

  • getEmuMemRegion(address) - Returns a tuple containing the start and end address of memory region containing the provided address, or None if the address is not valid.

  • getArgv() - Call this from an emulation hook at a "call" type instruction to receive an array of the arguments to the function.

  • addApiHook(apiName, hook) - Adds a new API hook for this instance of EmuHelper. Whenever a call instruction to apiName is encountered during emulation, EmuHelper will call the function specified by hook. If hook is a string, it is expected to be the name of an API already hooked by EmuHelper, in which case it will call its existing hook function. If hook is a function, it will call that function.

  • allocEmuMem(size, addr=None) - Allocates enough emulator memory to contain size bytes. It attempts to honor the requested address, but if it overlaps with an existing memory region, it will allocate in an unused memory region and return the new address. If address is not page-aligned, it will return an address that keeps the same page-aligned offset within the new region. For example, requesting address 0x1234 when 0x1000 is already allocated, may have it allocate at 0x2000 and return 0x2234 instead.

Learn More

To learn more about flare-emu, please read our introductory blog at https://www.fireeye.com/blog/threat-research/2018/12/automating-objective-c-code-analysis-with-emulation.html.

More Repositories

1

commando-vm

Complete Mandiant Offensive VM (Commando VM), a fully customizable Windows-based pentesting virtual machine distribution. [email protected]
PowerShell
6,897
star
2

flare-vm

A collection of software installations scripts for Windows systems that allows you to easily setup and maintain a reverse engineering environment on a VM.
PowerShell
6,334
star
3

capa

The FLARE team's open-source tool to identify capabilities in executable files.
Python
4,775
star
4

flare-floss

FLARE Obfuscated String Solver - Automatically extract obfuscated strings from malware.
Python
3,155
star
5

red_team_tool_countermeasures

YARA
2,639
star
6

flare-ida

IDA Pro utilities from FLARE team
Python
2,031
star
7

flare-fakenet-ng

FakeNet-NG - Next Generation Dynamic Network Analysis Tool
Python
1,677
star
8

speakeasy

Windows kernel and user mode emulation.
Python
1,290
star
9

SharPersist

C#
1,213
star
10

ThreatPursuit-VM

Threat Pursuit Virtual Machine (VM): A fully customizable, open-sourced Windows-based distribution focused on threat intelligence analysis and hunting designed for intel and malware analysts as well as threat hunters to get up and running quickly.
PowerShell
1,204
star
11

gocrack

GoCrack is a management frontend for password cracking tools written in Go
Go
1,101
star
12

stringsifter

A machine learning tool that ranks strings based on their relevance for malware analysis.
Python
672
star
13

SilkETW

C#
641
star
14

Mandiant-Azure-AD-Investigator

PowerShell
614
star
15

Azure_Workshop

HCL
572
star
16

sunburst_countermeasures

YARA
561
star
17

Ghidrathon

The FLARE team's open-source extension to add Python 3 scripting to Ghidra.
Java
556
star
18

capa-rules

Standard collection of rules for capa: the tool for enumerating the capabilities of programs
528
star
19

ReelPhish

Python
493
star
20

iocs

FireEye Publicly Shared Indicators of Compromise (IOCs)
458
star
21

DueDLLigence

C#
450
star
22

FIDL

A sane API for IDA Pro's decompiler. Useful for malware RE and vulnerability research
Python
431
star
23

flare-wmi

C++
412
star
24

GoReSym

Go symbol recovery tool
Go
379
star
25

rvmi

rVMI - A New Paradigm For Full System Analysis
C
352
star
26

PwnAuth

Python
347
star
27

idawasm

IDA Pro loader and processor modules for WebAssembly
Python
332
star
28

ADFSpoof

Python
318
star
29

SimplifyGraph

IDA Pro plugin to assist with complex graphs
C++
303
star
30

STrace

A DTrace on Windows Reimplementation
C++
299
star
31

ShimCacheParser

Python
258
star
32

OfficePurge

C#
256
star
33

msi-search

C
215
star
34

macos-UnifiedLogs

Rust
200
star
35

ioc_writer

Python
195
star
36

GeoLogonalyzer

GeoLogonalyzer is a utility to analyze remote access logs for anomalies such as travel feasibility and data center sources.
Python
194
star
37

Vulnerability-Disclosures

C++
183
star
38

flare-kscldr

FLARE Kernel Shellcode Loader
C
175
star
39

flare-qdb

Command-line and Python debugger for instrumenting and modifying native software behavior on Windows and Linux.
Python
161
star
40

flare-dbg

flare-dbg is a project meant to aid malware reverse engineers in rapidly developing debugger scripts.
Python
149
star
41

thiri-notebook

The Threat Hunting In Rapid Iterations (THIRI) Jupyter notebook is designed as a research aide to let you rapidly prototype threat hunting rules.
Python
146
star
42

route-sixty-sink

Link sources to sinks in C# applications.
C#
137
star
43

VM-Packages

Chocolatey packages supporting the analysis environment projects FLARE-VM & Commando VM.
PowerShell
135
star
44

heyserial

Programmatically create hunting rules for deserialization exploitation with multiple keywords, gadget chains, object types, encodings, and rule types
YARA
130
star
45

dncil

The FLARE team's open-source library to disassemble Common Intermediate Language (CIL) instructions.
Python
124
star
46

flashmingo

Automatic analysis of SWF files based on some heuristics. Extensible via plugins.
Python
118
star
47

Reversing

111
star
48

ioc-scanner-CVE-2019-19781

Indicator of Compromise Scanner for CVE-2019-19781
Shell
91
star
49

flare-bytecode_graph

Python
82
star
50

gocrack-ui

The User Interface for GoCrack
Vue
81
star
51

Volatility-Plugins

Python
80
star
52

unicorn-libemu-shim

libemu shim layer and win32 environment for Unicorn Engine
C++
70
star
53

citrix-ioc-scanner-cve-2023-3519

Shell
61
star
54

AuditParser

AuditParser
Python
56
star
55

remote_lookup

Resolves DLL API entrypoints for a process w/ remote query capabilities.
Visual Basic
54
star
56

synfulknock

Lua
48
star
57

SSSDKCMExtractor

Python
46
star
58

jitm

JITM is an automated tool to bypass the JIT Hooking protection on a .NET sample.
C++
43
star
59

goauditparser

Go
39
star
60

capa-testfiles

Data to test capa's code and rules.
Max
39
star
61

tf_rl_tutorial

Tutorial: Statistical Relational Learning with Google TensorFlow
Jupyter Notebook
39
star
62

macOS-tools

Python
38
star
63

apooxml

Generate YARA rules for OOXML documents.
Python
38
star
64

gootloader

Collection of scripts used to deobfuscate GOOTLOADER malware samples.
Python
36
star
65

pycommands

PyCommand Scripts for Immunity Debugger
Python
35
star
66

vocab_scraper

Vocabulary Scraper script used in FLARE's analysis of Russian-language Carbanak source code
Python
35
star
67

ARDvark

ARDvark parses the Apple Remote Desktop (ARD) files to pull out application usage, user activity, and filesystem listings.
Python
34
star
68

rvmi-rekall

Rekall Forensics and Incident Response Framework with rVMI extensions
Python
32
star
69

gocat

Provides access to libhashcat
Go
29
star
70

ics_mem_collect

Python
26
star
71

rvmi-qemu

QEMU with rVMI extensions
C
26
star
72

IDA_Pro_VoiceAttack_profile

Python
25
star
73

win10_auto

Python
23
star
74

pulsesecure_exploitation_countermeasures

YARA
23
star
75

rvmi-kvm

Linux-KVM with rVMI extensions
C
23
star
76

pivy-report

Poison Ivy Appendix/Extras
17
star
77

siglib

Python
15
star
78

DFUR-Splunk-App

The "DFUR" Splunk application and data that was presented at the 2020 SANS DFIR Summit.
13
star
79

vbScript_deobfuscator

Help deobfuscate VBScript
VBA
13
star
80

flare-gsoc-2023

Supporting resources and documentation for FLARE @ Google Summer of Code 2023
13
star
81

rpdebug_qnx

Python
11
star
82

mandiant_managed_hunting

Azure Deployment Templates for Mandiant Managed Huning
9
star
83

flare-floss-testfiles

Resources for testing FLOSS by the FLARE team.
C
6
star
84

shelidate

Go
2
star