Pure Python parser and analyzer for IDA Pro database files (.idb).

python-idb

python-idb is a library for accessing the contents of IDA Pro databases (.idb files). It provides read-only access to internal structures such as the B-tree (ID0 section), name address index (NAM section), flags index (ID1 section), and types (TIL section). The library also analyzes B-tree entries to expose logical structures like functions, cross references, bytes, and disassembly (via Capstone). One example use for python-idb is running IDA scripts in a pure-Python environment.

Willem Hengeveld provided the initial research into the low-level structures in his projects pyidbutil and idbutil. Willem deserves substantial credit for reversing the .idb file format and publishing his results online. This project heavily borrows from his knowledge, though there is little code overlap.

example use:

example: list function names

In this example, we list the effective addresses and names of functions:

In [4]: import idb
   ...: with idb.from_file('./data/kernel32/kernel32.idb') as db:
   ...:     api = idb.IDAPython(db)
   ...:     for ea in api.idautils.Functions():
   ...:         print('%x: %s' % (ea, api.idc.GetFunctionName(ea)))

Out [4]: 68901010: GetStartupInfoA
   ....: 689011df: Sleep
   ....: 68901200: MulDiv
   ....: 68901320: SwitchToFiber
   ....: 6890142c: GetTickCount
   ....: 6890143a: ReleaseMutex
   ....: 68901445: WaitForSingleObject
   ....: 68901450: GetCurrentThreadId
        ...

Note that we create an emulated instance of the IDAPython scripting interface, and use this to invoke idc and idautils routines to fetch data.
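
The same lookup can be wrapped in small helpers for use outside an interactive session. This is only a minimal sketch, not part of python-idb: function_names and format_listing are hypothetical helper names, and the code relies solely on the idautils.Functions and idc.GetFunctionName calls shown above.

```python
def function_names(api):
    """Map each function's effective address to its name.

    `api` is expected to be an emulated IDAPython interface, such as the
    object returned by idb.IDAPython(db) in the example above.
    """
    return {ea: api.idc.GetFunctionName(ea) for ea in api.idautils.Functions()}


def format_listing(names):
    """Render the mapping as 'address: name' lines, sorted by address."""
    return ['%x: %s' % (ea, name) for ea, name in sorted(names.items())]


# Usage, assuming python-idb is installed and the database path exists:
#
#     import idb
#     with idb.from_file('./data/kernel32/kernel32.idb') as db:
#         for line in format_listing(function_names(idb.IDAPython(db))):
#             print(line)
```

Passing the api object in, rather than opening the database inside the helper, keeps the helpers easy to exercise with a stand-in object in tests.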

example: run an existing IDAPython script

In this example, we run the yara_fn.py IDAPython script to generate a YARA rule for the function at effective address 0x68901695 in kernel32.idb:

[asciicast: terminal recording of the script run]

The target script yara_fn.py has only been slightly modified:

  • to make it Python 3.x compatible, and
  • to use the modern IDAPython modules, such as ida_bytes.GetManyBytes rather than idc.GetManyBytes.

what works

  • ~250 unit tests that demonstrate functionality including file format, B-tree, analysis, and idaapi features.
  • read-only parsing of .idb and .i64 files from IDA Pro v5.0 to v7.5
    • extraction of file sections
    • B-tree lookups and queries (ID0 section)
    • flag enumeration (ID1 section)
    • named address listing (NAM section)
    • types parsing (TIL section)
  • analysis of artifacts that reconstructs logical elements, including:
    • root metadata
    • loader metadata
    • entry points
    • functions
    • structures
    • cross references
    • fixups
    • segments
  • partial implementation of the IDAPython API, including:
    • Names
    • Heads
    • Segs
    • GetMnem (via Capstone)
    • Functions
    • FlowChart (basic blocks)
    • lots and lots of flags
  • Python 2.7 & 3.x compatibility
  • zlib-packed idb/i64 files

what will never work

  • write access

getting started

python-idb is a pure-Python library, with the exception of Capstone (required only when calling disassembly APIs). You can install it via pip or setup.py install, both of which should handle dependency resolution:

 $ cd ~/Downloads/python-idb/
 $ python setup.py install
 $ python scripts/run_ida_script.py  ~/tools/yara_fn.py  ~/Downloads/kernel32.idb
   ... profit! ...
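
Because Capstone is needed only for the disassembly APIs, a script can check for it up front and degrade gracefully. The following is a minimal sketch for Python 3 using only the standard library; module_available is a hypothetical helper name and is not part of python-idb.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if not module_available('capstone'):
    # Parsing and most analysis should still work; only calls that
    # disassemble code depend on Capstone.
    print('capstone not found: disassembly APIs will not work')
```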

While most python-idb functions have meaningful docstrings, there is not yet a comprehensive documentation website. However, the unit tests demonstrate functionality that you'll probably find useful.

Anyone interested in learning the file format and contributing to the project should review the idb.fileformat module & tests. Those who want to extract meaningful information from existing .idb files should probably look at the idb.analysis and idb.idapython modules & tests.

Please report issues or feature requests through the project's GitHub issue tracker.

license

python-idb is licensed under the Apache License, Version 2.0. This means it is freely available for use and modification in a personal and professional capacity.
