Binary delta encoding tools.

About

Binary delta encoding in Python 3.6+ and C.

Based on http://www.daemonology.net/bsdiff/ and HDiffPatch, with the following features:

  • bsdiff, hdiffpatch and match-blocks algorithms.

  • sequential, hdiffpatch or in-place (resumable) patch types.

  • BZ2, LZ4, LZMA, Zstandard, heatshrink or CRLE compression.

  • Sequential patches allow streaming.

  • Maximum file size is 2 GB for the bsdiff algorithm. There is practically no limit for the hdiffpatch and match-blocks algorithms.

  • Incremental apply patch implemented in C, suitable for memory constrained embedded devices. Only the sequential patch type is supported.

  • SA-IS or divsufsort instead of qsufsort for bsdiff.

  • Optional experimental data format aware algorithm for potentially smaller patches. I don't recommend using this functionality, as the gain is small in relation to the memory usage and code complexity!

    There is a risk that this functionality is covered by patent https://patents.google.com/patent/EP1988455B1/en. As I understand it, this patent expires in August 2019.

    Supported data formats:

    • ARM Cortex-M4
    • AArch64
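
To illustrate why a sequential patch can be applied as a stream, here is a minimal Python sketch of a bsdiff-style apply loop. It is a toy: the `(diff, extra, adjustment)` tuple layout and the `apply_sequential` name are assumptions made for illustration, and they do not reflect the detools patch format or its API.

```python
from typing import Iterable, Iterator, Tuple

# Hypothetical in-memory patch entry (not the detools format):
# (diff bytes, extra bytes, adjustment of the read offset in the old data).
Entry = Tuple[bytes, bytes, int]


def apply_sequential(old: bytes, entries: Iterable[Entry]) -> Iterator[bytes]:
    """Yield chunks of the new data, consuming patch entries strictly in order."""
    pos = 0  # current read offset in the old data
    for diff, extra, adjustment in entries:
        chunk = old[pos:pos + len(diff)]
        if len(chunk) != len(diff):
            raise ValueError("diff runs past the end of the old data")
        # Diff bytes are added byte-wise (modulo 256) to the old data.
        yield bytes((o + d) & 0xFF for o, d in zip(chunk, diff))
        pos += len(diff)
        # Extra bytes are new data, copied verbatim.
        yield extra
        # The adjustment moves the read offset to the next matching region.
        pos += adjustment


old = b"hello world"
entries = [
    # Copy "hello" unchanged, insert ", ", then skip the space in the old data.
    (bytes([0, 0, 0, 0, 0]), b", ", 1),
    # Turn "world" into "World" (0x77 + 0xE0 = 0x57 modulo 256), then append "!".
    (bytes([0xE0, 0, 0, 0, 0]), b"!", 0),
]
new = b"".join(apply_sequential(old, entries))
print(new)
```

Each step only needs the next patch entry and the old data, so the new data can be written out chunk by chunk without holding the whole patch in memory.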

Project homepage: https://github.com/eerimoq/detools

Documentation: http://detools.readthedocs.org/en/latest

Installation

pip install detools

Statistics

Patch sizes, memory usage (RSS) and elapsed times when creating a patch from Python-3.7.3.tar (79M) to Python-3.8.1.tar (84M) for various algorithm, patch type and compression combinations.

See tests/benchmark.sh for details on how the data was collected.

Algorithm     Patch type  Compression  Patch size  RSS   Time
bsdiff        sequential  lzma         3,5M        662M  0:24.29
bsdiff        sequential  none         86M         646M  0:15.20
hdiffpatch    hdiffpatch  lzma         2,4M        523M  0:13.74
hdiffpatch    hdiffpatch  none         7,2M        523M  0:10.24
match-blocks  sequential  lzma         2,9M        273M  0:08.57
match-blocks  sequential  none         84M         273M  0:01.72
match-blocks  hdiffpatch  lzma         2,6M        212M  0:06.07
match-blocks  hdiffpatch  none         9,7M        212M  0:01.30

Same as above, but for MicroPython ESP8266 binary releases (from 604k to 615k).

Algorithm     Patch type  Compression  Patch size  RSS  Time
bsdiff        sequential  lzma         71K         46M  0:00.64
bsdiff        sequential  none         609K        27M  0:00.33
hdiffpatch    hdiffpatch  lzma         65K         42M  0:00.37
hdiffpatch    hdiffpatch  none         123K        25M  0:00.32
match-blocks  sequential  lzma         194K        46M  0:00.44
match-blocks  sequential  none         606K        25M  0:00.22
match-blocks  hdiffpatch  lzma         189K        43M  0:00.38
match-blocks  hdiffpatch  none         313K        24M  0:00.19
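
As a rough way to compare the rows, the patch size can be expressed as a percentage of the target size. The sketch below does this for the lzma rows of the Python tarball table, using the rounded figures printed above (84M target), so the results are approximate. The helper names `size_in_m` and `patch_to_ratio` are made up for this example.

```python
TO_SIZE_M = 84.0  # Python-3.8.1.tar, rounded as in the text above

# (algorithm, patch type, patch size) for the lzma rows of the first table.
ROWS = [
    ("bsdiff", "sequential", "3,5M"),
    ("hdiffpatch", "hdiffpatch", "2,4M"),
    ("match-blocks", "sequential", "2,9M"),
    ("match-blocks", "hdiffpatch", "2,6M"),
]


def size_in_m(text: str) -> float:
    """Convert a table value such as '3,5M' (decimal comma) to a float."""
    if not text.endswith("M"):
        raise ValueError(f"unexpected unit in {text!r}")
    return float(text[:-1].replace(",", "."))


def patch_to_ratio(patch_size: str, to_size_m: float = TO_SIZE_M) -> float:
    """Return the patch size as a percentage of the target size."""
    return 100.0 * size_in_m(patch_size) / to_size_m


for algorithm, patch_type, patch_size in ROWS:
    ratio = patch_to_ratio(patch_size)
    print(f"{algorithm:<14}{patch_type:<12}{ratio:5.1f} %")
```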

Example usage

Examples in C are found in the c directory.

Command line tool

The create patch subcommand

Create a patch foo.patch from tests/files/foo/old to tests/files/foo/new.

$ detools create_patch tests/files/foo/old tests/files/foo/new foo.patch
Successfully created 'foo.patch' in 0.01 seconds!
$ ls -l foo.patch
-rw-rw-r-- 1 erik erik 127 feb  2 10:35 foo.patch

Create the same patch as above, but without compression.

$ detools create_patch --compression none \
      tests/files/foo/old tests/files/foo/new foo-no-compression.patch
Successfully created 'foo-no-compression.patch' in 0 seconds!
$ ls -l foo-no-compression.patch
-rw-rw-r-- 1 erik erik 2792 feb  2 10:35 foo-no-compression.patch

Create a hdiffpatch patch foo-hdiffpatch.patch.

$ detools create_patch --algorithm hdiffpatch --patch-type hdiffpatch \
      tests/files/foo/old tests/files/foo/new foo-hdiffpatch.patch
Successfully created patch 'foo-hdiffpatch.patch' in 0.01 seconds!
$ ls -l foo-hdiffpatch.patch
-rw-rw-r-- 1 erik erik 146 feb  2 10:37 foo-hdiffpatch.patch

Lower memory usage with --algorithm match-blocks. Mainly useful for big files. Creates slightly bigger patches than bsdiff and hdiffpatch.

$ detools create_patch --algorithm match-blocks \
      tests/files/foo/old tests/files/foo/new foo-hdiffpatch-64.patch
Successfully created patch 'foo-hdiffpatch-64.patch' in 0.01 seconds!
$ ls -l foo-hdiffpatch-64.patch
-rw-rw-r-- 1 erik erik 404 feb  8 11:03 foo-hdiffpatch-64.patch

Non-sequential but smaller patch with --patch-type hdiffpatch.

$ detools create_patch \
      --algorithm match-blocks --patch-type hdiffpatch \
      tests/files/foo/old tests/files/foo/new foo-hdiffpatch-sequential.patch
Successfully created 'foo-hdiffpatch-sequential.patch' in 0.01 seconds!
$ ls -l foo-hdiffpatch-sequential.patch
-rw-rw-r-- 1 erik erik 389 feb  8 11:05 foo-hdiffpatch-sequential.patch
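
When scripting these invocations, it can help to build the argument list programmatically. The sketch below only assembles the command line from the options shown in the examples above (--compression, --algorithm and --patch-type). The helper name `create_patch_command` is hypothetical and not part of detools.

```python
import shlex
from typing import List, Optional


def create_patch_command(
    old: str,
    new: str,
    patch: str,
    compression: Optional[str] = None,
    algorithm: Optional[str] = None,
    patch_type: Optional[str] = None,
) -> List[str]:
    """Build the argument list for `detools create_patch`."""
    command = ["detools", "create_patch"]
    # Options left as None are omitted, so the tool's own defaults apply.
    if compression is not None:
        command += ["--compression", compression]
    if algorithm is not None:
        command += ["--algorithm", algorithm]
    if patch_type is not None:
        command += ["--patch-type", patch_type]
    return command + [old, new, patch]


command = create_patch_command(
    "tests/files/foo/old",
    "tests/files/foo/new",
    "foo-hdiffpatch.patch",
    algorithm="hdiffpatch",
    patch_type="hdiffpatch",
)
# Print a shell-quoted version of the command for inspection.
print(" ".join(shlex.quote(argument) for argument in command))
```

The resulting list can be passed to `subprocess.run(command, check=True)`, which requires the detools command line tool to be installed and on the PATH.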

The create in-place patch subcommand

Create an in-place patch foo-in-place.patch.

$ detools create_patch_in_place --memory-size 3000 --segment-size 500 \
      tests/files/foo/old tests/files/foo/new foo-in-place.patch
Successfully created 'foo-in-place.patch' in 0.01 seconds!
$ ls -l foo-in-place.patch
-rw-rw-r-- 1 erik erik 672 feb  2 10:36 foo-in-place.patch

The create bsdiff patch subcommand

Create a bsdiff patch foo-bsdiff.patch, compatible with the original bsdiff program.

$ detools create_patch_bsdiff \
      tests/files/foo/old tests/files/foo/new foo-bsdiff.patch
Successfully created 'foo-bsdiff.patch' in 0 seconds!
$ ls -l foo-bsdiff.patch
-rw-rw-r-- 1 erik erik 261 feb  2 10:36 foo-bsdiff.patch

The apply patch subcommand

Apply the patch foo.patch to tests/files/foo/old to create foo.new.

$ detools apply_patch tests/files/foo/old foo.patch foo.new
Successfully created 'foo.new' in 0 seconds!
$ ls -l foo.new
-rw-rw-r-- 1 erik erik 2780 feb  2 10:38 foo.new

The in-place apply patch subcommand

Apply the in-place patch foo-in-place.patch to foo.mem.

$ cp tests/files/foo/in-place-3000-500.mem foo.mem
$ detools apply_patch_in_place foo.mem foo-in-place.patch
Successfully created 'foo.mem' in 0 seconds!
$ ls -l foo.mem
-rw-rw-r-- 1 erik erik 3000 feb  2 10:40 foo.mem

The bsdiff apply patch subcommand

Apply the patch foo-bsdiff.patch to tests/files/foo/old to create foo.new.

$ detools apply_patch_bsdiff tests/files/foo/old foo-bsdiff.patch foo.new
Successfully created 'foo.new' in 0 seconds!
$ ls -l foo.new
-rw-rw-r-- 1 erik erik 2780 feb  2 10:41 foo.new

The patch info subcommand

Print information about the patch foo.patch.

$ detools patch_info foo.patch
Type:               sequential
Patch size:         127 bytes
To size:            2.71 KiB
Patch/to ratio:     4.6 % (lower is better)
Diff/extra ratio:   9828.6 % (higher is better)
Size/data ratio:    0.3 % (lower is better)
Compression:        lzma

Number of diffs:    2
Total diff size:    2.69 KiB
Average diff size:  1.34 KiB
Median diff size:   1.34 KiB

Number of extras:   2
Total extra size:   28 bytes
Average extra size: 14 bytes
Median extra size:  14 bytes
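
Several of the figures above can be reproduced from the file sizes shown earlier. The sketch below does that arithmetic in Python. It assumes that every byte of the to-file comes from either a diff or an extra, so that the total diff size is the to size minus the total extra size. This assumption is consistent with the printed numbers but is not taken from the detools source.

```python
patch_size = 127       # bytes, from `ls -l foo.patch`
to_size = 2780         # bytes, from `ls -l foo.new`
total_extra_size = 28  # bytes, from the patch info output
number_of_diffs = 2

# Assumption: each to-file byte is produced by either a diff or an extra.
total_diff_size = to_size - total_extra_size


def kib(size_in_bytes: float) -> str:
    """Format a byte count in KiB with two decimals."""
    return f"{size_in_bytes / 1024:.2f} KiB"


to_size_text = kib(to_size)
total_diff_text = kib(total_diff_size)
average_diff_text = kib(total_diff_size / number_of_diffs)
patch_to_ratio = f"{100 * patch_size / to_size:.1f} %"
diff_extra_ratio = f"{100 * total_diff_size / total_extra_size:.1f} %"

print("To size:          ", to_size_text)
print("Total diff size:  ", total_diff_text)
print("Average diff size:", average_diff_text)
print("Patch/to ratio:   ", patch_to_ratio)
print("Diff/extra ratio: ", diff_extra_ratio)
```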

Contributing

  1. Fork the repository.

  2. Install prerequisites.

    pip install -r requirements.txt
    
  3. Implement the new feature or bug fix.

  4. Implement test case(s) to ensure that future changes do not break existing functionality.

  5. Run the tests.

    make test
    
  6. Create a pull request.
