• This repository has been archived on 11/Apr/2024
  • Stars
    star
    136
  • Rank 267,670 (Top 6 %)
  • Language
    C++
  • License
    MIT License
  • Created over 3 years ago
  • Updated 8 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Tool aimed to provide a binary analysis of different file formats through the use of an Intermmediate Representation.

KUNAI-static-analyzer

Tool aimed to provide a binary analysis of different file formats through the use of an Intermmediate Representation.

Installation

Old way (not recommended)

Kunai has a couple of dependencies, one of them can be downloaded in the external folder, to do that just run the next command:

https://github.com/Fare9/KUNAI-static-analyzer.git
cd KUNAI-static-analyzer
git submodule update --init --rebase --recursive

With this, you will download the zip external dependency that can be compiled and installed next. The library libspdlog-dev will be also installed for logging in its version 1:1.5.0-1, for doing that we can use the script you'll find in the root of the folder with the next command, the script right now is only for debian based distributions, you can install the dependencies in the next way:

./debian_distros_installer.sh dependencies

This step is also run if you want to build Kunai using the same script with the next command:

./debian_distros_installer.sh build

Or if you want to build using a debug configuration:

./debian_distros_installer.sh debug

In case you want to install dependencies, compile Kunai and finally install it, do it with the next command:

./debian_distros_installer.sh install

CMake (recommended)

In order to improve the experience of compilation and installation, we have moved to CMake, so in order to build Kunai you can follow the next commands, the commands must be run from the root folder of the project:

cmake -B build -S .

Now we build the libraries and all the other binaries from Kunai, we can specify with -j the number of processes to run in parallel so the compilation goes faster:

cmake --build build -j<number of processes>

Finally for the installation we can run the next command, for installing the library in the system the command must run as super user:

sudo cmake --build build/ --target install

Project Structure

The next files/folders are included in the project:

  • doc/: doxygen documentation automatically generated when files are merged into main branch, you shouldn't touch nothing here (or at least is not expected).
  • example-scripts/: KUNAI is intended to be a library for analyst writing simple C++ programs for analyzing DEX (and in the future other file formats) files, here there will be pretty easy examples that allows you to quickly get in touch with KUNAI library.
  • external/: all the libraries that this project depends on. This could be included as a dependency for installing into system, or as an external project in here.
    • external/zip/: very simple library for unziping files, this is used for extracting all the files from an apk file, you need to retrieve it with make.sh or cloning repo with submodules.
  • projects/: stand-alone binaries that are compiled together with KUNAI that are intended for testing or as tools that comes with KUNAI, even when these can be checked for learning how to use KUNAI, some headers or dependencies can be included in other ways than those in example-scripts.
  • src/: source code of KUNAI, here are all the C++ classes structured in folders of KUNAI project.
    • src/APK/: code to manage APK files, an APK is a zip file which contains all the necessary files for running an application.
    • src/DEX/: code of KUNAI's DEX analysis.
    • src/DEX/parser/: code with DEX structure for file parsing, related structures are joined in the same file with different classes.
    • src/DEX/DVM/: utilities for the Dalvik machine, this contains opcodes, instructions, a linear sweep disassembler, and other constants from the DVM.
    • src/DEX/Analysis/: classes that contains utilities for the analyst, these includes managing all the DEX files as just one, it contains the disassembler, it allows managing classes, methods, fields and strings.
    • src/Utils/: utilities used by KUNAI.
    • src/mjolnIR/: code of KUNAI's Intermmediate Representation.
    • src/mjolnIR/Analysis/: classes written for applying different types of analysis to the Intermmediate Representation, some of them can be useful in deobfuscation.
    • src/mjolnIR/Lifters/: code to lift the disassembled instructions to KUNAI's Intermmediate Representation instructions.
    • src/includes/KUNAI/: all KUNAI's headers structured following previous convention.
    • src/includes/KUNAI/Exceptions/: all the custom exceptions used in KUNAI.

DEX

DEX Analysis Architecture

KUNAI offers several classes for the analysis of DEX files, this part of KUNAI's framework is based in Androguards code, but it follows another working flow, the DEX analysis is divided in three main parts:

  • DEX Parser
  • DEX Disassembler
  • DEX Analysis

Instead of retrieving all these fields with one call to a function, we divided them so you can get only with want you want to work, so each one gives different utilities:

DEX Parser

If you get an object from KUNAI::DEX::DEX class, you will get a parsed DEX object, so the only basic work that KUNAI does for you is to parse DEX headers, you have two ways for getting KUNAI::DEX::DEX object, this can be a std::unique_ptr or a std::shared_ptr depending on the function used to get one:

std::unique_ptr<DEX> get_unique_dex_object(std::ifstream &input_file, std::uint64_t file_size)
std::shared_ptr<DEX> get_shared_dex_object(std::ifstream &input_file, std::uint64_t file_size)

For the moment this calls from the API receives a std::ifstream from the file and a std::uint64_t with the file_size.

Once we have received the object, we can check if parsing was correct or not, in order to continue working with the object.

...
auto dex = KUNAI::DEX::get_unique_dex_object(dex_file, fsize);

if (!dex->get_parsing_correct())
{
    std::cerr << "[-] Parsing of DEX file was not correct." << std::endl;
    return 1;
}

Now if we know that parsing is correct, we can access to the parser object from the class KUNAI::DEX::DexParser, from this object, other different objects could be accessed:

  • DexHeader
  • DexStrings
  • DexTypes
  • DexProtos
  • DexFields
  • DexMethods
  • DexClasses
...
auto dex = KUNAI::DEX::get_unique_dex_object(dex_file, fsize);
...
auto dex_parser = dex->get_parser();
...
auto dex_header = dex_parser->get_header();
auto dex_strings = dex_parser->get_strings();
auto dex_types = dex_parser->get_types();
auto dex_protos = dex_parser->get_protos();
auto dex_fields = dex_parser->get_fields();
auto dex_methods = dex_parser->get_methods();
auto dex_classes = dex_parser->get_classes();

DEX Disassembler

A DEX Linear sweep disassembler is implemented following the code from Androguard, this can also be retrieved from KUNAI::DEX::DEX object, if you retrieve the disassembler you obtain an object of the class KUNAI::DEX::DexDisassembler, but the code is not disassembled yet, to do that we will have to call the next function:

...
auto dex = KUNAI::DEX::get_unique_dex_object(dex_file, fsize);
...
auto dex_disassembler = dex->get_dex_disassembler();
dex_disassembler->disassembly_analysis();

With this we would have disassembled the DEX instructions for each method, but we should apply check to know that disassembler was correct!

...
auto dex = KUNAI::DEX::get_unique_dex_object(dex_file, fsize);
...
auto dex_disassembler = dex->get_dex_disassembler();
dex_disassembler->disassembly_analysis();

if (!dex_disassembler->get_disassembly_correct())
{
    std::cerr << "[-] Disassembly was incorrect, cannot show instructions." << std::endl;
    return 1;
}

Now we can access a complex structure to access to the disassembled instructions with the next function:

std::map<std::tuple<std::shared_ptr<ClassDef>, std::shared_ptr<EncodedMethod>>,std::map<std::uint64_t, std::shared_ptr<Instruction>>> get_instructions();

MjolnIR

MjolnIR Architecture

IR written for generic analysis of different file formats and binary architectures, this IR allows us to represent binaries in a higher language, then create Control-Flow graphs, Data-Flow graphs, etc. MjolnIR is intended to support by default three-address code and Abstract-Syntax Tree (AST).

IRStmnt

At the top of the IR we have the statements, these will be every instruction from the IR that could be executed by the program, between these IRStmnt are the expressions (explained later), but more specific statements are those that change the Control-Flow Graph from the function/method, these are conditional and unconditional jumps, return statements, etc.

IRStmnt     -->     IRUJmp   |
                    IRCJmp   |
                    IRRet    |
                    IRBlock  |
                    IRNop    |
                    IRSwitch |
                    IRExpr 

IRUJmp      -->     jmp addr
IRCJmp      -->     if (IRStmnt) { jmp addr } NEXT fallthrough_addr
IRRet       -->     Ret IRStmnt
IRBlock     -->     IRStmnt1, IRStmnt2, ..., IRStmntN

IRExpr

The IR involves to support various instructions from the code, these are what we call IRExpr, these kind of instructions don't modify the control flow of the code but apply different kind of operations to the variables/registers/memory in the program.

IRExpr    -->   IRBinOp   |
                IRUnaryOp | 
                IRAssign  |
                IRPhi     |
                IRCall    |
                IRLoad    |
                IRStore   |
                IRZComp   |
                IRBComp   |
                IRNew     |
                IRAlloca  |
                IRType    
                
IRBinOp   -->   IRExpr <- IRExpr bin_op_t IRExpr
IRUnaryOp -->   IRExpr <- unary_op_t IRExpr
IRAssign  -->   IRExpr <- IRExpr
IRPhi     -->   IRExpr <- IRExpr, IRExpr, ..., IRExpr
IRCall    -->   IRExpr(IRExpr1, IRExpr2, ..., IRExprN)
IRLoad    -->   IRExpr <- *IRExpr
IRStore   -->   *IRExpr <- IRExpr
IRZComp   -->   IRExpr zero_comp_t
IRBComp   -->   IRExpr comp_t IRExpr
IRNew     -->   IRExpr <- new IRExpr
IRAlloca  -->   IRExpr <- new IRExpr[IRExpr]

# kind of operations
bin_op_t  -->   ADD_OP_T   |
                SUB_OP_T   |
                S_MUL_OP_T |
                U_MUL_OP_T |
                S_DIV_OP_T |
                U_DIV_OP_T |
                MOD_OP_T
unary_op_t   -->   INC_OP_T    |
                   DEC_OP_T    |
                   NOT_OP_T    |
                   NEG_OP_T    |
                   CAST_OP_T   |
                   Z_EXT_OP_T  |
                   S_EXT_OP_T
zero_comp_t  -->   EQUAL_ZERO_T |
                   NOT_EQUAL_ZERO_T
comp_t       -->   EQUAL_T              |
                   NOT_EQUAL_T          |
                   GREATER_T            |
                   GREATER_EQUAL_T      |
                   LOWER_T              |
                   ABOVE_T              |
                   ABOVE_EQUAL_T        |
                   BELOW_T                           

IRType

For supporting the types we find in the binary code, we have written a serie of classes which derive from a super class named IRType, as derived classes we have: registers, constant values, strings, memory, callee types.

IRType   -->   IRReg |
               IRTempReg |
               IRConstInt |
               IRMemory |
               IRString |
               IRCallee |
               IRClass  |
               IRFundamental |
               NONE

IRFundamental   --> F_BOOLEAN   |
                    F_BYTE      |
                    F_CHAR      |
                    F_DOUBLE    |
                    F_FLOAT     |
                    F_INT       |
                    F_LONG      |
                    F_SHORT     |
                    F_VOID

Implementations on the IR

Kunai's IR implements each function analyzed as an IRGraph, this graph contains all the blocks as nodes, and the connections between blocks as edges. Recently an implementation of an SSA form for the IRGraph has been implemented, but this is not as stable as the previous one. In the next images you can see the differences between both:

IRGraph no SSA

And the SSA version:

IRGraph SSA

Dalvik

Dalvik has been the first "assembly" language to be supported with MjolnIR inside of KUNAI, you can check all the opcodes and its translation in Dalvik Opcodes

References

Here there are some references used as ideas and base for the design and/or implementation of KUNAI and MjolnIR:

Compiled in

KUNAI has been compiled in next systems:

  • Arch-Linux x86-64 - g++ 11.1
  • elementary OS 6.1 Jólnir x86_64 - g++ 9.3.0
  • elementary OS 6.1 Jólnir x86_64 - clang version 10.0.0-4ubuntu1
  • elementary OS 6.1 Jólnir x86_64 - clang version 15.0.6

License

This project is licensed under a MIT License, please check the file LICENSE to find more information about it.

Contact

For any question, or issues about the tool you can always refer to the issues from the project, or write us to our gmail address: kunai(dot)static(dot)analysis(at)gmail(dot)com

Cite this project

@MISC {KUNAI,
  author       = "Eduardo Blazquez",
  title        = "Kunai-Static-Analyzer",
  howpublished = "https://github.com/Fare9/KUNAI-static-analyzer",
  month        = "may",
  year         = "2021"
}

======= Tool aimed to provide a binary analysis of different file formats through the use of an Intermmediate Representation.

More Repositories

1

Windows-Internals

My repository to upload drivers from different books and all the information related to windows internals.
Assembly
152
star
2

ANBU

ANBU (Automatic New Binary Unpacker) a tool for me to learn about PIN and about algorithms for generic unpacking.
C++
88
star
3

PBA_Analysis_Tools

My own versions from the programs of the book "Practical Binary Analysis"
C++
51
star
4

Shuriken-Analyzer

Repository for a library focused on binary analysis (mainly for Java related bytecodes)
C++
34
star
5

MR_Robot_mimikatz

Arduino's version of Rubber ducky project watched in mr robot (based on https://www.hak5.org/blog/15-second-password-hack-mr-robot-style)
C
33
star
6

AndroidSwissKnife

Framework for static and dynamic analysis in Android APKs
Python
32
star
7

Genaytyk-VM

My notes about Genyatyk VM crackme
C++
27
star
8

RicardoNarvajaCppExercise

Solution for Ricardo Narvaja's C++ Exploiting Exercise
Python
13
star
9

PyWerkzeug-Debug-Command-Execution

python exploit for werkzeug debug shell command execution
Python
10
star
10

OSINTretasNoHayPastebines

Trabajo de OSINT para búsqueda de patrones en pastebin.
Python
10
star
11

My-Symbolic-Execution

My repository to store all the examples I do about symbolic execution
Python
9
star
12

WhatsappCrasher

Python
8
star
13

Exploits

Set of exploiting exercises, exploits for some software, whatever I do related to that topic
Shell
6
star
14

ELF_Infection

Examples of ELF infection techniques both on disk and in memory
C
6
star
15

Obfuscated-Shell

Little reverse shell obfuscated for training (Used MASM32)
Assembly
6
star
16

Elf_Dumper

Experimental project to dump an ELF File from running one
C
6
star
17

my-frida-scripts

A collection of my stupid and some useful frida scripts
JavaScript
6
star
18

RicardoNarvajaCppExercise2

Second exercise of Ricardo Narvaja C++ exploit exercise serie
Python
5
star
19

SomeVirusesTechniques

Just a repository to upload some malware techniques programmed.
C++
5
star
20

JS_POWMET

JS_POWMET Malware Analysis, Javascript, Analysis descriptions and more... =)
JavaScript
5
star
21

CLS-Challenge-ekoparty-2018

Exploit for the 2018 ekoparty's challenge uploaded by Ricardo Narvaja to CLS groups
C++
3
star
22

Virus_Disinfector_KIT

Virus disinfector kit, it works with the library lief, so you will need it to use it
Python
3
star
23

3CTftpSvc-Exploit

My own version of 3CTftpSvc exploit for Ricardo Narvaja exercises
Python
2
star
24

Fare9

2
star
25

RicardoNarvaja-Ejercicios1

Ejercicios creados por Ricardo Narvaja de explotación local en x64
Python
2
star
26

PatternPetater

When I saw that I haven't got pattern_create nor pattern_offset in Kali 2.0 I decided to do my own version in one python script
Python
2
star
27

CFG-challenge

C++
1
star
28

ResumenMPI

Apuntes y Resumen de MPI para AREINCO
C
1
star
29

IdentificationAndAuthenticationSmartCardAssignment

Smart Card Assignment from the UC3M Cybersecurity Master Identification & Authentication Class
C++
1
star
30

Kunai-paper-project

Tools and data used for the paper about Kunai Static Analysis Tool
Python
1
star
31

YourMYOBSupply_Order

Analysis of new Ursnif variant which employs Malicious TLS Callback to achieve Process Injection
JavaScript
1
star