Polypyus

Polypyus Firmware Historian

Polypyus learns to locate functions in raw binaries by extracting known functions from similar binaries. Thus, it is a firmware historian. Polypyus works without disassembling these binaries, which is an advantage for binaries that are complex to disassemble and where common tools miss functions. In addition, the binary-only approach makes it very fast: it runs within a few seconds. However, this approach requires the binaries to be for the same architecture and to be built with similar compiler options.

Polypyus integrates into the workflow of existing tools like Ghidra, IDA, BinDiff, and Diaphora. For example, it can import previously annotated functions and learn from these, and also export found functions to be imported into IDA. Since Polypyus uses rather strict thresholds, it only found correct matches in our experiments. While this leads to fewer results than in existing tools, it is a good entry point for loading these matches into IDA to improve its auto analysis results and then run BinDiff on top.

What Polypyus solves

When working on raw firmware binaries, namely various Broadcom and Cypress Bluetooth firmware versions, we found that IDA auto analysis often identified function starts incorrectly. In IDA Pro 6.8 the auto analysis is a bit more aggressive, leading to more results but also more false positives. Overall, IDA Pro 7.2 was more pessimistic, but missed a lot of functions. This led to only a few BinDiff matches between our firmwares in IDA Pro 6.8 and no useful matches at all in IDA Pro 7.2.

Interestingly, BinDiff often failed to identify functions that, except for branches, were byte-identical. Note that Polypyus searches for exactly these byte-identical functions. We assume that BinDiff fails at these functions because missing functions and false positives produce a different call graph. Sometimes, these functions were already recognized by IDA, but often IDA either did not recognize them as code or did not mark them as functions. Note that Diaphora has similar problems, as it exports functions identified by IDA before further processing them. The following shows a benchmark on the CYW20735B1 Bluetooth firmware binary that compares various disassemblers and how the disassembler failures result in follow-up diffing issues.

Benchmark comparing IDA Pro, Ghidra, Binary Ninja, radare2, BinDiff and Diaphora
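
To illustrate what "byte-identical except for branches" means in ARM Thumb2 code, here is a minimal sketch. It is not taken from Polypyus. It assumes that the only differing bytes are the offset fields of 32-bit BL instructions, and it uses a naive halfword sweep that may misread data or other 32-bit instructions. The function names are hypothetical.

```python
import struct


def mask_bl_targets(code: bytes) -> bytes:
    """Return a copy of `code` with the offset bits of Thumb BL instructions zeroed."""
    out = bytearray(code)
    i = 0
    while i + 4 <= len(out):
        hw1, hw2 = struct.unpack_from("<HH", out, i)
        # BL (encoding T1): hw1 = 11110 S imm10, hw2 = 11 J1 1 J2 imm11
        if (hw1 & 0xF800) == 0xF000 and (hw2 & 0xD000) == 0xD000:
            # Keep only the fixed opcode bits, drop S, imm10, J1, J2, imm11.
            struct.pack_into("<HH", out, i, hw1 & 0xF800, hw2 & 0xD000)
            i += 4
        else:
            i += 2
    return bytes(out)


def same_except_branches(a: bytes, b: bytes) -> bool:
    """True if both function bodies are equal once BL offsets are ignored."""
    return len(a) == len(b) and mask_bl_targets(a) == mask_bl_targets(b)
```

Two copies of the same function placed at different addresses usually call the same callee through different relative offsets, so a plain byte comparison fails while the masked comparison succeeds.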

Moreover, while we found that Amnesia finds many functions, it also finds many false positives. However, many functions have a similar stack frame setup in the beginning. Thus, Polypyus has an option to learn common function starts from the annotated input binaries and apply this to other binaries to identify functions without matching their name. This optional step is only applied to the regions in which no functions were previously located. This way, the common function starts method and the main function finding do not conflict.
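
The idea of learning common function starts can be sketched as follows. This is a simplified illustration and not the Polypyus implementation: the fixed prefix width, the minimum count, the 2-byte alignment, and all function names are assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, List, Sequence, Tuple


def learn_function_starts(binary: bytes, starts: Iterable[int],
                          width: int = 4, min_count: int = 2) -> List[bytes]:
    """Collect byte prefixes that occur at several known function starts."""
    counts = Counter(binary[a:a + width] for a in starts
                     if a + width <= len(binary))
    return [prefix for prefix, n in counts.most_common() if n >= min_count]


def find_candidate_starts(target: bytes, prologues: Sequence[bytes],
                          covered: Sequence[Tuple[int, int]],
                          align: int = 2) -> List[int]:
    """Report aligned offsets that begin with a learned prefix.

    `covered` lists (start, end) ranges that already contain matched
    functions; offsets inside them are skipped.
    """
    hits = []
    for addr in range(0, len(target), align):
        if any(lo <= addr < hi for lo, hi in covered):
            continue
        if any(target.startswith(p, addr) for p in prologues):
            hits.append(addr)
    return hits
```

Skipping the `covered` ranges mirrors the rule that this step only runs where no functions were located before.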

Since these matchers work on the raw binary, they do not depend on a disassembler. This also has one important drawback: if there were different compiler options or a different target architecture, Polypyus will not detect similar functions. Moreover, while the identified matches are very reliable, the function start identification is a bit less reliable, so use the latter with care. In the following, you can see that the Cypress evaluation kits are very similar to each other, but the MacBook firmware is very different.

Benchmark on four different firmwares

How it works

Polypyus creates fuzzy binary matchers by comparing common functions in a collection of annotated firmware binaries.

Currently, the following annotations are supported:

  • A WICED Studio patch.elf file, which is a special ELF file containing only symbol definitions.
  • A .symdefs file as it is produced by most ARM compilers.
  • A .csv file with a format documented in the firmware folder.

These annotations contain the address, size, and name of known functions. The more commonalities the input binaries in the history collection have, the better for Polypyus performance and results. Given several slightly different functions, Polypyus creates very good matchers.
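
As a rough illustration of how several slightly different copies of a function can yield one fuzzy matcher, consider the sketch below. It is an assumption-laden simplification, not the algorithm from the paper: it requires equally long samples, turns every differing byte position into a wildcard, and uses a hypothetical `max_wildcards` ratio as a stand-in for a strictness threshold.

```python
from typing import List, Optional, Sequence

Pattern = List[Optional[int]]  # None marks a wildcard byte


def build_matcher(samples: Sequence[bytes]) -> Pattern:
    """Keep bytes shared by all samples, wildcard the positions that differ."""
    length = len(samples[0])
    if any(len(s) != length for s in samples):
        raise ValueError("this simplified sketch needs equally long samples")
    return [col[0] if len(set(col)) == 1 else None for col in zip(*samples)]


def find_matches(target: bytes, pattern: Pattern,
                 max_wildcards: float = 0.5) -> List[int]:
    """Return all offsets in `target` where the pattern matches."""
    # Reject patterns that are too permissive to give reliable matches.
    if not pattern or pattern.count(None) > max_wildcards * len(pattern):
        return []
    hits = []
    for off in range(len(target) - len(pattern) + 1):
        if all(p is None or target[off + i] == p
               for i, p in enumerate(pattern)):
            hits.append(off)
    return hits
```

The more samples agree on a byte position, the fewer wildcards remain, which is one way to read the statement that more commonalities in the history lead to better results.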

How to install it

Polypyus requires Python 3 >= 3.6. We advise the use of a virtualenv for the following installation. Clone this repository and in this folder run:

pip install .

How to run it

After the installation the following commands are available:

  • polypyus-gui
  • polypyus-cli

Using Polypyus

Polypyus is available through a graphical and a command-line interface. Both the GUI polypyus-gui and the CLI polypyus-cli take these arguments during invocation:

  --verbose sets the verbosity level. By default, only warnings are shown; -v shows info and -vv shows debug information.
  --project sets the location of the project file. This is either a file path or ":memory:".
  --help    shows the help message.

The project option lets you store your work for different contexts in different files and reopen them later.
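
The behavior of these options can be mimicked with a few lines of argparse. This is a hypothetical sketch for readers who want to wrap or script the tools, not the actual Polypyus argument parser. In particular, the default of ":memory:" for the project is an assumption.

```python
import argparse
import logging
from typing import List, Tuple


def parse_common_args(argv: List[str]) -> Tuple[int, str]:
    """Parse --verbose/--project and return (logging level, project location)."""
    parser = argparse.ArgumentParser(prog="example-cli")
    # Each -v raises the verbosity by one step.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    # Assumed default: keep the project in memory unless a path is given.
    parser.add_argument("--project", default=":memory:")
    args = parser.parse_args(argv)
    # No flag: warnings, -v: info, -vv or more: debug.
    level = {0: logging.WARNING, 1: logging.INFO}.get(args.verbose, logging.DEBUG)
    return level, args.project
```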

Using the GUI

The general GUI workflow goes from the left-hand side of the window to the right. First, binaries are added to the history. Then, symbol annotations are attached to the entries in the history. Afterward, target binaries can be added. For the matching, hit Create matchers from history. Once the matchers are created, single targets can be selected, or all targets can be matched by selecting batch match. Finally, the findings can be exported to a .csv file.

In the following you can see a demo video where Polypyus only takes a few seconds to learn from two input binaries, annotate them, create matchers, and apply matches to a new binary.

GUI Video

Using the CLI

The upside to using the CLI is its ability to be automated. As of now, the output format of the CLI is subject to change. However, here is an example of calling it:

polypyus-cli --history firmware/history/20819-A1.bin --annotation firmware/history/20819-A1_patch.elf --history firmware/history/20735B1.bin --annotation firmware/history/20735B1_patch.elf --project test.sqlite
polypyus-cli --target firmware/history/20739B1.bin --project test.sqlite

The first command creates test.sqlite as a new project file and imports 20819-A1.bin and 20735B1.bin with their respective patch.elf files. The second invocation reuses the same project file and matches against the binary 20739B1.bin. For each command, the number of --history and --annotation needs to match. These two commands could also be combined into one by adding the --target argument to the first command.
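
For automation, the command line can be assembled from a list of history and annotation pairs, which keeps the number of --history and --annotation arguments equal by construction. The helper below is a hypothetical sketch. It only uses the flags shown above and does not parse the CLI output, since that format may change.

```python
from typing import List, Optional, Sequence, Tuple


def build_polypyus_command(pairs: Sequence[Tuple[str, str]],
                           project: str,
                           target: Optional[str] = None) -> List[str]:
    """Build an argument list for polypyus-cli.

    `pairs` holds (history binary, annotation file) tuples, so every
    --history is followed by exactly one --annotation.
    """
    cmd = ["polypyus-cli"]
    for history, annotation in pairs:
        cmd += ["--history", history, "--annotation", annotation]
    if target is not None:
        cmd += ["--target", target]
    cmd += ["--project", project]
    return cmd
```

The resulting list can be passed to subprocess.run(cmd, check=True) once Polypyus is installed in the active environment.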

How does it work internally?

A paper that explains the internals was published at the Workshop on Binary Analysis Research (BAR) 2021 with the title Polypyus - The Firmware Historian. Some more details are also contained in Jan's Master's thesis final presentation, which covers the issues encountered when working with conventional binary diffing approaches in ARM Thumb2 mode, and how the alternate binary-only approach works.

Additional PDOM type information export and import

The leaked symbols in the patch.elf or .symdefs format only contain function and global variable names. However, there are also a couple of .pdom Eclipse project files in WICED Studio 6.2 and 6.4. These contain additional type information. Eclipse uses them internally for auto completion, function search, etc., and we can utilize them in reverse to add type information. Since .pdom files only contain partial, cached information, it can be helpful to combine multiple of them.

In a first step, we export .pdom type information into an SQLite database. The export takes a while, but it can even be aborted and continued later on. Export works as follows:

java -jar pdom/export/export.jar -P BCM20739-B0.1462220149391.pdom

The PDOM import searches for function names in an IDA database, looks them up in the PDOM to find type information, and then applies that type information to the IDA database. Thus, the IDA database needs to contain correct function names in advance. In principle, these can be created with the import_export scripts from Polypyus. However, the somewhat more advanced scripts that support PDOM import can also handle patch.elf sections. Run the importer as follows:

  • Open the firmware binary in IDA.
  • Set Thumb mode to T=0x1 (Alt-g).
  • Set the compiler options (Options -> Compiler...) to GNU C++.
  • Run the script file pdom/import/main.py (File -> Script file).
  • Select a patch.elf file (Select file).
  • Import it (Import ELF). After a few seconds you will have sections and function names.
  • Select a reference database, which should be the PDOM that belongs to your firmware binary.
  • Select multiple additional databases, and the importer will pick the best combined matches.
  • Import it (Import PDOM). This will take a while.
  • You can also import a hardware register file 20739mapb0.h to name hardware registers (Import map.h).

This script was tested on IDA Pro 7.4 and 7.5.

Recommended IDA Pro workflow

After some internal testing, we can recommend the following workflow when working with IDA Pro and Polypyus:

  • Create a fresh database. ARM v7 little endian, ARM Cortex M for the Bluetooth firmware.
  • Mark position 0x0 as Thumb (Alt-g, T=0x1).
  • Create ROM and RAM segments. ROM at 0x0 with rx, RAM at 0x200000 with rwx (at least for the Bluetooth firmware).
  • Create vector table offsets in ROM, at least for the reset vector, which is a 4-byte offset at 0x4 (o). On the CYW20735 firmware it points to 0x3bc+1. Go back one byte and create a function (p).
  • Wait for auto analysis to finish.
  • Import Polypyus results.
  • Run the Thumbs Up scripts.
  • Run both BinDiff and Diaphora. The latter ideally in an IDA version with decompiler. Use both, as they use different heuristics.

...now your IDA database might be somewhat useful :) Still a lot of things the disassembler fails at within ARM Thumb2 but way better than anything IDA does on its own.
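
The reset vector step can be checked outside of IDA with a short script. This sketch assumes a standard ARM Cortex-M vector table at the start of the ROM dump, where the second 32-bit little-endian word is the reset vector and its lowest bit marks Thumb mode. The function name is hypothetical.

```python
import struct
from typing import Tuple


def reset_handler_address(rom: bytes) -> Tuple[int, bool]:
    """Return (function address, thumb flag) for the reset vector at 0x4."""
    (vector,) = struct.unpack_from("<I", rom, 4)
    is_thumb = bool(vector & 1)
    # Clearing the lowest bit gives the address where the function starts.
    return vector & ~1, is_thumb
```

For a vector value of 0x3bc+1, as on the CYW20735 firmware, this yields the address 0x3bc with the Thumb flag set, which is the location where the function is created in the workflow above.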

Broadcom Bluetooth firmware history

The firmware folder contains various firmware with and without symbols. Everything in history contains symbols, and everything in targets is without symbols.

History

Chip        Device                  Build Date   Symbols
BCM20703A2  MacBook/iMac 2016-2017  Oct 22 2015  ✔
CYW20719B1  Evaluation board        Jan 17 2017  ✔
CYW20735B1  Evaluation board        Jan 18 2018  ✔
CYW20819A1  Evaluation board        May 22 2018  ✔

Targets

Chip        Device                        Build Date    Symbols
BCM2046A2   iMac Late 2009                2007?         -
BCM2070B0   MacBook 2011, Thinkpad T420   Jul 9 2008    -
BCM20702A1  Asus USB Dongle               Feb (?) 2010  -
BCM4345B0   iPhone 6                      Jul 15 2013   -
BCM4335C0   Google Nexus 5                Dec 11 2012   -
BCM4345B0   Google Nexus 6P / Galaxy S6   Oct 23 2014   -
BCM43430A1  Raspberry Pi 3 and Zero W     Jun 2 2014    -
BCM4345C0   Raspberry Pi 3+ and 4         Aug 19 2014   -
BCM4347B0   Samsung Galaxy S8 series      Jun 3 2016    -
BCM4375B1   Samsung Galaxy S10/20 series  Apr 13 2018   -
BCM4378B1   iPhone 11/SE2                 Oct 25 2018   Strings

For the Samsung series, S8 also includes the Note 8 and S8+ etc., and the S10/S20 also includes everything from the S10e up to the Note 20 5G.

Dump quality might vary: some dumps include RAM and some are just the ROM. We have access to most of the devices in this list. If you need a dump with the most recent patch levels and including RAM, feel free to ping us.

A few devices mentioned in the paper are not included here, since these might not be research-only devices etc. A few iPhones and MacBooks are missing as well, since we have them as research-only devices but the original dump wasn't. These devices will be added soon :)

Contributing

There is an .editorconfig file in this repository. It configures indentation style, charset, and line separators. Follow this configuration when contributing, which can be made easier if you use an IDE plugin for .editorconfig.

How to install test and development dependencies

To install test dependencies, execute:

pip install '.[test]'

This will install packages that are only needed for executing test cases.

Development dependencies provide, for example, stubs for package types. To install them, run:

pip install '.[development]'

Testing

pytest will run all tests.

Locally testing against different versions of Python

The project uses tox to run the tests locally against different versions of Python. Tox is set up to test against versions 3.6, 3.7, 3.8, and 3.9. To run tox, install the test dependencies and these four versions of Python. Our recommended way to install and manage several versions of Python is pyenv.

Steps:

  1. Install Pyenv
  2. Install Pyenv virtualenv
  3. Run
    pyenv install 3.9.1
    pyenv install 3.8.6
    pyenv install 3.7.9
    pyenv install 3.6.12
    pyenv virtualenv 3.9.1 polypyus
    pyenv local polypyus 3.8.6 3.7.9 3.6.12
    pip install '.[test]'
    pip install '.[development]'
  4. Run tox

Local automation

Polypyus uses GitHub Actions for automated test runs and some linting. If you want, you can run the linting steps locally with pre-commit git hooks.

Every time before a new commit is created, this will trigger the linting and show the issues that would prevent this code from succeeding in the GitHub Actions linting step. It will also format changed files with black.

pip install '.[development]'
pre-commit install

License and credits

We thank Anna Stichling for creating the Polypyus logo. We also thank Christian Blichmann and Joxean Koret for their feedback.

Polypyus is open-source and licensed under the GPLv3.
