• Stars
    star
    136
  • Rank 267,670 (Top 6 %)
  • Language
    Python
  • License
    GNU Affero Genera...
  • Created over 3 years ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Greybox Synthesizer geared for deobfuscation of assembly instructions.

Qsynthesis

QSynthesis is a Python3 API to perform I/O based program synthesis of bitvector expressions. It aims at facilitating code deobfuscation. The algorithm is greybox approach combining both a blackbox I/O based synthesis and a whitebox AST search to synthesize sub-expressions (if the root node cannot be synthesized).

This algorithm as originaly been described at the BAR academic workshop:

The code has been release as part of the following Black Hat talk:

Disclaimer: This framework is experimental, and shall only be used for experimentation purposes. It mainly aims at stimulating research in this area.

Documentation

The installation, examples, and API documentation is available on the dedicated documentation: Documentation

Functionalities

The core synthesis is based on Triton symbolic engine on which is built the whole framework. It provides the following functionalities:

  • synthesis of bitvector expressions
  • ability to check through SMT the semantic equivalence of synthesized expressions
  • ability to synthesize constants (if the expression encode a constant)
  • ability to improve oracles (pre-computed tables) overtime through a learning mechanism
  • ability to reassemble synthesized expression back to assembly
  • ability to serve oracles through a REST API to facilitate the synthesis usage
  • an IDA plugin providing an integration of the synthesis

Quick start

Installation

In order to work Triton first has to be installed: install documentation. Triton does not automatically install itself in a virtualenv, copy it in your venv or use --system-site-packages when configuring your venv.

Then:

$ git clone https://github.com/quarkslab/qsynthesis.git
$ cd qsynthesis
$ pip3 install '.[all]'

The [all] will installed all dependencies (see the documentation for a light install).

Table generation

The synthesis algorithm requires generating oracle tables derived from a grammar (a set of variables and operators). Qsynthesis installation provides the utility qsynthesis-table-manager enabling manipulating tables. The following command generate a table with 3 variables of 64 bits, 5 operators using a vector of 16 inputs. We limit the generation to 5 million entries.

$ qsynthesis-table-manager generate -bs 64 --var-num 3 --input-num 16 --random-level 5 --ops AND,NEG,MUL,XOR,NOT --watchdog 80 --limit 5000000 my_oracle_table
Generate Table
Watchdog value: 80.0
Depth 2 (size:3) (Time:0m0.23120s)
Depth 3 (size:21) (Time:0m0.23198s)
Depth 4 (size:574) (Time:0m0.26068s)
Depth 5 (size:400858) (Time:0m21.23231s)
Threshold reached, generation interrupted
Stop required
Depth 5 (size:5000002) (Time:4m52.56009s) [RAM:9.52Gb]

Note: The generation process is RAM consuming the --watchdog enables setting a percentage of the RAM above which the generation is interrupted.

Synthesizing a bitvector expression

We then can try simplifying a seemingly obfuscated expression with:

from qsynthesis import SimpleSymExec, TopDownSynthesizer, InputOutputOracleLevelDB

blob = b'UH\x89\xe5H\x89}\xf8H\x89u\xf0H\x89U\xe8H\x89M\xe0L\x89E\xd8H\x8bE' \
       b'\xe0H\xf7\xd0H\x0bE\xf8H\x89\xc2H\x8bE\xe0H\x01\xd0H\x8dH\x01H\x8b' \
       b'E\xf8H+E\xe8H\x8bU\xe8H\xf7\xd2H\x0bU\xf8H\x01\xd2H)\xd0H\x83\xe8' \
       b'\x02H!\xc1H\x8bE\xe0H\xf7\xd0H\x0bE\xf8H\x89\xc2H\x8bE\xe0H\x01\xd0' \
       b'H\x8dp\x01H\x8bE\xf8H+E\xe8H\x8bU\xe8H\xf7\xd2H\x0bU\xf8H\x01\xd2' \
       b'H)\xd0H\x83\xe8\x02H\t\xf0H)\xc1H\x89\xc8H\x83\xe8\x01]\xc3'

# Perform symbolic execution of the instructions
symexec = SimpleSymExec("x86_64")
symexec.initialize_register('rip', 0x40B160)  # arbitrary address
symexec.initialize_register('rsp', 0x800000)  # arbitrary stack
symexec.execute_blob(blob, 0x40B160)
rax = symexec.get_register_ast("rax")  # retrieve rax register expressions

# Load lookup tables
ltm = InputOutputOracleLevelDB.load("my_oracle_table")

# Perform Synthesis of the expression
synthesizer = TopDownSynthesizer(ltm)
synt_rax, simp = synthesizer.synthesize(rax)

print(f"expression: {rax.pp_str}")
print(f"synthesized expression: {synt_rax.pp_str} [{simp}]")

Limitations

  • synthesis accuracy limited by pre-computed tables exhaustivness
  • table generation limited by RAM consumption
  • reassembly cannot involve memory variable, destination is necessarily a register and architecture depends on llvmlite (thus mostly x86_64)
  • the code references trace-based synthesis which is disabled (as the underlying framework is not yet open-source)

Authors

  • Robin David (@RobinDavid), Quarkslab

Contributors

Huge thanks to contributors to this research:

  • Luigi Coniglio
  • Jonathan Salwan

More Repositories

1

binbloom

Raw binary firmware analysis software
C
493
star
2

kdigger

Kubernetes focused container assessment and context discovery tool for penetration testing
Go
424
star
3

quarkspwdump

Dump various types of Windows credentials without injecting in any process.
C
418
star
4

rewind

Snapshot-based coverage-guided windows kernel fuzzer
Rust
307
star
5

arybo

Manipulation, canonicalization and identification of mixed boolean-arithmetic symbolic expressions
C++
293
star
6

irma

IRMA is an asynchronous & customizable analysis system for suspicious files.
JavaScript
268
star
7

conf-presentations

Quarkslab conference talks
263
star
8

dreamboot

UEFI bootkit
C
230
star
9

binmap

system scanner
C++
216
star
10

legu_unpacker_2019

Scripts to unpack APK protected by Legu
Python
211
star
11

AERoot

AERoot is a command line tool that allows you to give root privileges on-the-fly to any process running on the Android emulator with Google Play flavors AVDs.
Python
195
star
12

android-restriction-bypass

PoC to bypass Android restrictions
C++
194
star
13

peetch

An eBPF playground
Python
184
star
14

titanm

This repository contains the tools we used in our research on the Google Titan M chip
C
181
star
15

qbindiff

Quarkslab Bindiffer but not only !
Python
169
star
16

quokka

Quokka: A Fast and Accurate Binary Exporter
C++
165
star
17

NFLlib

NTT-based Fast Lattice library
C++
165
star
18

pastis

PASTIS: Collaborative Fuzzing Framework
Python
154
star
19

samsung-trustzone-research

Reverse-engineering tools and exploits for Samsung's implementation of TrustZone
Python
143
star
20

pyrrha

A tool for firmware cartography
Python
135
star
21

llvm-passes

Collection of various llvm passes
C++
115
star
22

qb-sync

qb-sync is an open source tool to add some helpful glue between IDA Pro and Windbg. Its core feature is to dynamically synchronize IDA's graph windows with Windbg's position.
C++
115
star
23

starlink-tools

A collection of tools for security research on Starlink's User Terminal
Python
112
star
24

LLDBagility

A tool for debugging macOS virtual machines
C
107
star
25

tritondse

Triton-based DSE library with loading and exploration capabilities (and more!)
Python
102
star
26

sspam

Symbolic Simplification with PAttern Matching
Python
100
star
27

android-fuzzing

C
100
star
28

CVE-2020-0069_poc

C
97
star
29

minik8s-ctf

A beginner-friendly CTF about Kubernetes security.
Shell
74
star
30

QBDL

QuarkslaB Dynamic Linker library
C++
71
star
31

iMITMProtect

Prevent Apple to mess with keys
C
70
star
32

whvp

PoC for a snapshot-based coverage-guided fuzzer targeting Windows kernel components
Rust
67
star
33

mattermost-plugin-e2ee

End-to-end encryption plugin for Mattermost
TypeScript
66
star
34

aosp_dataset

Large Commit Precise Vulnerability Dataset based on AOSP CVE
Python
57
star
35

llvm-dev-meeting-tutorial-2015

Material for an LLVM Tutorial presented at LLVM Dev Meeting 2015
TeX
47
star
36

dxfx

DxFx is a proof-of-concept DJI Pilot unpacker
Python
31
star
37

irma-probe

IRMA probe
25
star
38

irma-frontend

IRMA frontend
25
star
39

irma-ansible-old

IRMA ansible
24
star
40

libleeloo

Library to manage big sets of integers (and IPv4 ranges)
C++
23
star
41

sboot-binwalk

Python
21
star
42

irma-brain

IRMA brain
21
star
43

nodescan

Asynchronous scanning library
C++
19
star
44

pixiefail

PoC for PixieFail vulnerabilities
Python
18
star
45

python-binexport

Python interface for Binexport, the Bindiff export format
Python
14
star
46

numbat

Library to manipulate and create Sourcetrail databases
Python
14
star
47

bgraph

BGraph is a tool designed to generate dependencies graphs from Android.bp soong files.
Python
14
star
48

training_ecu

Hardware and software for the ECU we use during trainings
C++
14
star
49

dataset-call-graph-blogpost-material

12
star
50

idascript

Utilities scripts and Python module to facilitate executing idapython scripts in IDA.
Python
10
star
51

python-bindiff

Python module wrapping Bindiff usage into a Python API.
Python
10
star
52

BVWhiteBox

This PoC illustrates our work on asymmetric white-box cryptography, it can be used to generate a set of lookup tables used for lattice-based white-box scheme
Python
10
star
53

tpmee

Python
9
star
54

nvidia-ngx-wrapper

C
9
star
55

sstic-tame-the-qemu

C
9
star
56

ip_conv_sse

C++
9
star
57

crypto-condor

crypto-condor is a Python library for compliance testing of implementations of cryptographic primitives
C
8
star
58

qsig

QSig: Patch signature generation - detection tool
Python
8
star
59

linksys-wag200G

Some binaries and tools for the Linksys WAG200N router
C
7
star
60

windbg-vtl

JavaScript debugger extension for WinDbg that allows to dump the partitions running on Hyper-V
JavaScript
7
star
61

keyringer

Fork of keyringer from https://keyringer.pw (added some features like tree view, additional checks, ...)
Shell
7
star
62

irma-common

IRMA common
7
star
63

ansible-selenium-server

a Vagrant VM using Ansible to provide a Selenium Server
Shell
7
star
64

irmacl

irma api command line client
Python
6
star
65

land_of_cxx

C++
6
star
66

hooking-golang-playground

Various experiments with golang internals
C
4
star
67

erlang-prism

PRISM is a disassembler for Erlang BEAM virtual machine bytecode
Python
4
star
68

qb.backup

The server-side script of the qb.backup orchestration solution.
Python
4
star
69

wirego

C
4
star
70

wdnis_tool

CMake
3
star
71

diffing-portal

Static site for diffing portal
Jupyter Notebook
3
star
72

ziphyr

On-the-fly zip of streamed file with optional zipcrypto.
Python
2
star
73

python-zipstream

forked from allanlei/python-zipstream
Python
2
star
74

ansible-playbook-qb.backup

An example Ansible playbook deploying the roles qb.backup and qb.backup_server.
1
star
75

irma-web-ui

IRMA Web User Interface
JavaScript
1
star
76

irma-probe-tutorial

1
star
77

irmacl-async

Asynchronous client library for IRMA API
Python
1
star
78

can-workshop

Files for the Grehack 2021 workshop: Revers3 me if you CAN
Python
1
star