an implementation of BLAKE3 verified streaming

Bao

Bao Spec - Rust Crate - Rust Docs

Bao is an implementation of BLAKE3 verified streaming, as described in Section 6.4 of the BLAKE3 spec. Tree hashes like BLAKE3 make it possible to verify part of a file without re-hashing the entire thing, using an encoding format that stores the bytes of the file together with all the nodes of its hash tree. Clients can stream this encoding, or do random seeks into it, while verifying that every byte they read matches the root hash. For the details of how this works, see the Bao spec.
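The idea of checking one piece of a file against a root hash can be shown with a toy binary hash tree. This is only a sketch of the principle, not Bao's encoding format or its decoder: it uses BLAKE2s from the Python standard library as a stand-in for BLAKE3, the one-byte leaf/parent prefixes are an arbitrary choice made here, and all function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 1024  # chunk size used by this toy example


def leaf_hash(chunk: bytes) -> bytes:
    # Stand-in hash: BLAKE2s from the standard library, NOT BLAKE3.
    return hashlib.blake2s(b"\x00" + chunk).digest()


def parent_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.blake2s(b"\x01" + left + right).digest()


def left_count(n: int) -> int:
    # The left subtree takes the largest power of two strictly below n chunks.
    return 1 << ((n - 1).bit_length() - 1)


def root_hash(chunks: list) -> bytes:
    if len(chunks) == 1:
        return leaf_hash(chunks[0])
    k = left_count(len(chunks))
    return parent_hash(root_hash(chunks[:k]), root_hash(chunks[k:]))


def sibling_path(chunks: list, index: int) -> list:
    """Sibling hashes needed to check chunks[index], ordered root to leaf."""
    if len(chunks) == 1:
        return []
    k = left_count(len(chunks))
    if index < k:
        return [("right", root_hash(chunks[k:]))] + sibling_path(chunks[:k], index)
    return [("left", root_hash(chunks[:k]))] + sibling_path(chunks[k:], index - k)


def verify_chunk(root: bytes, chunk: bytes, path: list) -> bool:
    # Rebuild the hashes from the leaf up to the root using only the siblings.
    node = leaf_hash(chunk)
    for side, sibling in reversed(path):
        if side == "left":
            node = parent_hash(sibling, node)
        else:
            node = parent_hash(node, sibling)
    return node == root


chunks = [bytes([i]) * CHUNK_SIZE for i in range(5)]
root = root_hash(chunks)
path = sibling_path(chunks, 3)
print(verify_chunk(root, chunks[3], path))          # True
print(verify_chunk(root, b"x" * CHUNK_SIZE, path))  # False
```

Checking chunk 3 here hashes one chunk and three parent nodes rather than all five chunks, which is the property that makes verified streaming and seeking practical.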

This project includes two Rust crates, the bao library crate and the bao_bin binary crate. The latter provides the bao command line utility.

Caution! Bao is beta cryptography software. It has not been formally audited yet.

Encoding and Decoding

Use case: A secure messaging app might support attachment files by including the hash of an attachment in the metadata of a message. With a serial hash, the recipient would need to download the entire attachment to verify it, but that can be impractical for things like large video files. With BLAKE3 and Bao, the recipient can stream a video attachment, while still verifying each byte as it comes in. (This scenario was the original motivation for the Bao project.)

# Create an input file that's a megabyte of random data.
> head -c 1000000 /dev/urandom > f

# Convert it into a Bao encoded file.
> bao encode f f.bao

# Compare the size of the two files. The encoding overhead is small.
> stat -c "%n %s" f f.bao | column -t
f       1000000
f.bao   1062472

# Compute the BLAKE3 hash of the original file. The `b3sum` tool would
# also work here.
> hash=`bao hash f`

# Stream decoded bytes from the encoded file, using the hash above.
> bao decode $hash < f.bao > f2
> cmp f f2

# Observe that using the wrong hash to decode results in an error. This
# is also what will happen if we use the right hash but corrupt some
# bytes in the encoded file.
> bad_hash="0000000000000000000000000000000000000000000000000000000000000000"
> bao decode $bad_hash < f.bao
Error: Custom { kind: InvalidData, error: StringError("hash mismatch") }
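The sizes reported by `stat` above can be reproduced by hand. The following sketch assumes the layout described in the Bao spec as understood here: an 8-byte length header, 1024-byte chunks, and one 64-byte parent node (two 32-byte hashes) for every pair of subtrees, so a file with N chunks has N - 1 parent nodes. The function names are hypothetical.

```python
CHUNK_SIZE = 1024   # bytes of content per leaf chunk (assumed from the spec)
PARENT_SIZE = 64    # a parent node stores two 32-byte child hashes
HEADER_SIZE = 8     # content length header


def chunk_count(content_len: int) -> int:
    # Ceiling division; an empty input still counts as one (empty) chunk.
    return max(1, -(-content_len // CHUNK_SIZE))


def encoded_size(content_len: int) -> int:
    # A binary tree with N leaves has N - 1 parent nodes.
    parents = chunk_count(content_len) - 1
    return HEADER_SIZE + parents * PARENT_SIZE + content_len


size = encoded_size(1_000_000)
print(size)              # 1062472, matching f.bao above
print(size - 1_000_000)  # 62472 bytes of overhead
```

For the 1,000,000-byte input that is 977 chunks and 976 parent nodes, an overhead of about 6.2%.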

Verifying Slices

Encoded files support random seeking, but seeking might not be available or efficient over the network. (Note that one seek in the content usually requires several seeks in the encoding, as the decoder traverses the hash tree level-by-level.) In these situations, rather than trying to seek remotely, clients can instead request an encoded slice containing the range of content bytes they need. Creating a slice requires the sender to seek over the full encoding, but the recipient can then stream the slice without seeking at all. Decoding a slice uses the same root hash as regular decoding, so it doesn't require any preparation in advance from the sender or the recipient.

Use case: A BitTorrent-like application could fetch different slices of a file from different peers, without needing to define the slices ahead of time. Or a distributed file storage application could request random slices of an archived file from its storage providers, to prove that they're honestly storing the file, without needing to prepare or store challenges for the future.

# Using the encoded file from above, extract a 100 KB slice from
# somewhere in the middle. We'll use start=500000 (500 KB) and
# count=100000 (100 KB).
> bao slice 500000 100000 f.bao f.slice

# Look at the size of the slice. It contains the 100 KB of content plus
# some overhead. Again, the overhead is small.
> stat -c "%n %s" f.slice
f.slice 107272

# Using the same parameters we used to create the slice, plus the same
# hash we got above from the full encoding, decode the slice.
> bao decode-slice $hash 500000 100000 < f.slice > f.slice.out

# Confirm that the decoded output matches the corresponding section from
# the input file. (Note that `tail` numbers bytes starting with 1.)
> tail --bytes=+500001 f | head -c 100000 > expected.out
> cmp f.slice.out expected.out

# Now try decoding the slice with the wrong hash. Again, this will fail,
# as it would if we corrupted some bytes in the slice.
> bao decode-slice $bad_hash 500000 100000 < f.slice
Error: Custom { kind: InvalidData, error: StringError("hash mismatch") }
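The slice size above can also be worked out by walking the tree. This sketch makes the same layout assumptions (8-byte header, 1024-byte chunks, 64-byte parent nodes) and further assumes that a slice holds every parent node whose subtree overlaps the requested range, plus every chunk that overlaps it. It only handles a non-empty range that lies inside the content, and the function names are hypothetical.

```python
CHUNK_SIZE = 1024
PARENT_SIZE = 64
HEADER_SIZE = 8


def left_subtree_len(content_len: int) -> int:
    # The left subtree gets the largest power-of-two number of full chunks
    # that still leaves at least one byte for the right subtree.
    full_chunks = (content_len - 1) // CHUNK_SIZE
    return (1 << (full_chunks.bit_length() - 1)) * CHUNK_SIZE


def slice_size(content_len: int, start: int, count: int) -> int:
    end = start + count  # exclusive end of the requested range

    def walk(offset: int, length: int) -> int:
        # Subtrees that do not overlap the range contribute nothing.
        if offset >= end or offset + length <= start:
            return 0
        # An overlapping chunk is included whole.
        if length <= CHUNK_SIZE:
            return length
        left = left_subtree_len(length)
        return (PARENT_SIZE
                + walk(offset, left)
                + walk(offset + left, length - left))

    return HEADER_SIZE + walk(0, content_len)


print(slice_size(1_000_000, 500_000, 100_000))  # 107272, matching f.slice above
```

For this range the slice covers 98 whole chunks (100,352 bytes) and 108 parent nodes (6,912 bytes), plus the 8-byte header.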

Outboard Mode

By default, all of the operations above work with a "combined" encoded file, that is, one that contains both the content bytes and the tree hash bytes interleaved. However, sometimes you want to keep them separate, for example to avoid duplicating a very large input file. In these cases, you can use the "outboard" encoding format, via the --outboard flag:

# Re-encode the input file from above in the outboard mode.
> bao encode f --outboard f.obao

# Compare the size of all these files. The size of the outboard file is
# equal to the overhead of the original combined file.
> stat -c "%n %s" f f.bao f.obao | column -t
f       1000000
f.bao   1062472
f.obao  62472

# Decode the whole file in outboard mode. Note that both the original
# input file and the outboard encoding are passed in as arguments.
> bao decode $hash f --outboard f.obao f4
> cmp f f4
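To get a feel for how large an outboard file is for a big input, the same arithmetic applies with the content bytes left out. This is a sketch under the assumed layout of an 8-byte header followed by one 64-byte parent node per pair of subtrees; `outboard_size` is a hypothetical helper.

```python
CHUNK_SIZE = 1024
PARENT_SIZE = 64
HEADER_SIZE = 8


def outboard_size(content_len: int) -> int:
    # Header plus parent nodes only; the content stays in the original file.
    chunks = max(1, -(-content_len // CHUNK_SIZE))
    return HEADER_SIZE + (chunks - 1) * PARENT_SIZE


print(outboard_size(1_000_000))  # 62472, matching f.obao above
print(outboard_size(2**30))      # 67108808, about 64 MiB for a 1 GiB input
```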

Installation and Building From Source

The bao command line utility is published on crates.io as the bao_bin crate. To install it, add ~/.cargo/bin to your PATH and then run:

cargo install bao_bin

To build the binary directly from this repo:

git clone https://github.com/oconnor663/bao
cd bao/bao_bin
cargo build --release
./target/release/bao --help

tests/bao.py is a fully functional second implementation in Python, designed to be as short and readable as possible. It's a good starting point for understanding the algorithms involved, before diving into the Rust code.
