A practical experiment on supply-chain security using reproducible builds

I probably didn't backdoor this

This is a practical attempt at shipping a program and having reasonably solid evidence there's probably no backdoor. All source code is annotated and there are instructions explaining how to use reproducible builds to rebuild the artifacts distributed in this repository from source.

The idea is to shift the burden of proof from "you need to prove there's a backdoor" to "we need to prove there's probably no backdoor". This repository is less about code (we're going to try to keep code to a minimum) and more about technical writing that explains why these controls are effective and how to verify them. You are very welcome to adopt the techniques used here in your own projects.

The author should be assumed to be your average software developer, who might be suspiciously good with computer security, but doesn't have nation-state capabilities.

Preparing retroactive reviews

Since "reading the source code" requires advanced domain knowledge, this section describes a pen-and-paper approach that cryptographically ensures you can retroactively review what you executed, even if you didn't review it before you executed it. Pen-and-paper should be taken literally here, so the record can't be modified by software. If done correctly, you don't need to read the other sections immediately; instead you're creating an immutable paper trail that can later be used by a subject matter expert. Note that, for safety reasons, the review needs to happen on a different computer than the one that executed the code.

Because it's in the author's interest to prove there are no backdoors, all external resources that are not contained within this repository need to be referenced by their content, that is, by a cryptographic hash (more on this in the next section).

We're starting with the main repository by cloning it and showing the commit hash we're about to work with:

$ git clone https://github.com/kpcyrd/i-probably-didnt-backdoor-this
$ cd i-probably-didnt-backdoor-this/
$ git rev-parse HEAD
aabbccddeeff00112233445566778899aabbccdd

The hash in the last line is going to be different for you. This 40-character id is what you need for your paper trail: write it down (preferably along with the current date) and keep it in a safe location. It needs to be protected from undetected tampering but isn't secret, so you may create copies or even post it publicly.

This id uniquely identifies all files in this repository with their content. If a file is modified/removed/added/renamed in this repository, this hash changes too.

If you want to read more about the cryptographic properties behind this, look into Merkle trees.
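
To make the content-addressing idea a bit more concrete, here is a minimal Python sketch of how git derives the id of a single file (a "blob"): it hashes a short header followed by the file content. The function name `git_blob_id` is made up for this illustration, and the sketch assumes a repository using git's default SHA-1 object format. Tree and commit objects are hashed in a similar way and refer to their children by id, which is why one commit hash covers every file.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the id git assigns to a file with this content (SHA-1 object format)."""
    # git hashes the header "blob <size>\0" followed by the raw file content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Changing a single byte of the content yields a completely different id.
    print(git_blob_id(b"hello world\n"))
    print(git_blob_id(b"hello world!\n"))
```

Running `git hash-object <file>` on a file with the same content should print the same id as this function.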

Pinned external resources

In the previous section we've described how git is automatically tracking the content of all files in this repository with a single hash. Software projects often rely on external resources downloaded from the internet, like libraries.

Downloading resources from the internet doesn't weaken what we've established in the previous section, as long as:

  1. The content of the resource is pinned with a cryptographic hash and the hash is recorded in the git repository.
  2. We can be reasonably sure the resource is not going to disappear. If it does disappear, you could attempt to use backup copies, as long as they match the cryptographic hash in the repository.

If either of these conditions doesn't hold, we've "broken the chain of custody".
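
The first condition can be sketched in a few lines of Python. This is only an illustration of the principle, not how cargo or docker do it internally; `matches_pin` and the sample bytes are hypothetical.

```python
import hashlib


def matches_pin(data: bytes, pinned_sha256: str) -> bool:
    """Check downloaded bytes against a sha256 hex digest recorded in the repository."""
    actual = hashlib.sha256(data).hexdigest()
    # Normalize whitespace and case so a pin copied by hand still compares correctly.
    return actual == pinned_sha256.strip().lower()


if __name__ == "__main__":
    # Stand-in for a downloaded file; a real check would read the bytes from disk.
    resource = b"example dependency archive"
    # Normally this value is copied from the repository, not computed on the spot.
    pin = hashlib.sha256(resource).hexdigest()
    print(matches_pin(resource, pin))         # the original, a mirror, or a backup copy
    print(matches_pin(resource + b"!", pin))  # a modified copy
```

Because the check only looks at the content, it doesn't matter where the bytes came from: a backup copy that matches the pin is as good as the original download.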

We don't have to implement this ourselves, because cargo and docker already implement it internally.

Reading the source code

The repository contains 6 source code files, and there's a writeup for each of them. Files ending with .md are documentation.

  • Cargo.toml - Contains metadata about the project and a list of dependencies (if any)
  • Cargo.lock - Automatically generated, records sha256 checksums for all dependencies
  • src/main.rs - The actual source code of our program
  • Makefile - A wrapper script with build instructions
  • Dockerfile - Contains build instructions for a container image
  • PKGBUILD - Contains build instructions for an Arch Linux package

Reproducing the ELF binary

The binary is built in a Docker container; the exact command can be found in the Makefile. Running make executes the build in a specific Docker image (the official rust 1.54.0 alpine 3.14 Docker image).

Because the build environment is pinned and there's nothing introducing non-determinism to the build (like recording the build time), running the build on different computers (or even operating systems) should always result in the same binary.

Start the build with this command:

$ make

This command should finish quite quickly and produce a binary that matches this checksum:

$ b2sum target/x86_64-unknown-linux-musl/release/asdf
cd112870cdf12052e5604e7559e45f95cac4e52a45e91c9d9285a22a82c6392e95fbf0dc5f784837e7769a3ce14c898c866a85e4d60b051d3416875e301e28aa  target/x86_64-unknown-linux-musl/release/asdf

Downloading and hashing the pre-compiled binary from the releases page should give you an identical hash:

$ curl -LsS 'https://github.com/kpcyrd/i-probably-didnt-backdoor-this/releases/download/v0.1.1/asdf' | b2sum -
cd112870cdf12052e5604e7559e45f95cac4e52a45e91c9d9285a22a82c6392e95fbf0dc5f784837e7769a3ce14c898c866a85e4d60b051d3416875e301e28aa  -

If you get the same checksum, you've successfully reproduced the binary. If there's no difference between the pre-compiled binary and the one you built yourself, the pre-compiled binary is just as trustworthy as the one you built yourself.
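
If you'd rather script the comparison than compare long hex strings by eye, a minimal Python sketch could look like this. The helper names and the demo file names are made up; `hashlib.blake2b` with its default 64-byte digest is the same BLAKE2b-512 that `b2sum` computes by default.

```python
import hashlib
import tempfile
from pathlib import Path


def b2sum(path: Path) -> str:
    """BLAKE2b-512 hex digest of a file, read in chunks."""
    digest = hashlib.blake2b()  # default digest size is 64 bytes (512 bits)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproduced(local_build: Path, downloaded: Path) -> bool:
    """True if both files have the same BLAKE2b-512 hash."""
    return b2sum(local_build) == b2sum(downloaded)


if __name__ == "__main__":
    # Demo with two stand-in files; in practice, pass your own build and the downloaded release.
    with tempfile.TemporaryDirectory() as tmp:
        mine, theirs = Path(tmp, "asdf.local"), Path(tmp, "asdf.release")
        mine.write_bytes(b"same bytes")
        theirs.write_bytes(b"same bytes")
        print("reproduced" if is_reproduced(mine, theirs) else "MISMATCH")
```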

Reproducing the Docker image

There's a Dockerfile in the repository that always produces the same bit-for-bit identical image. It's a multi-stage build, so it builds the binary in one temporary image and then creates the real image with just FROM, COPY and ENTRYPOINT. The build environment is virtually identical to the one used in the previous section; the resulting binary is then copied into an Alpine image that's pinned by its sha256 hash.

$ make docker
sudo buildah bud --timestamp 0 --tag asdf
[1/2] STEP 1/4: FROM docker.io/rust@sha256:8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64
[1/2] STEP 2/4: WORKDIR /app
[1/2] STEP 3/4: COPY . .
[1/2] STEP 4/4: RUN cargo build --release --locked --target=x86_64-unknown-linux-musl
    Finished release [optimized] target(s) in 0.02s
[2/2] STEP 1/3: FROM docker.io/alpine@sha256:eb3e4e175ba6d212ba1d6e04fc0782916c08e1c9d7b45892e9796141b1d379ae
[2/2] STEP 2/3: COPY --from=0 /app/target/x86_64-unknown-linux-musl/release/asdf /asdf
[2/2] STEP 3/3: ENTRYPOINT ["/asdf"]
[2/2] COMMIT asdf
Getting image source signatures
Copying blob bc276c40b172 skipped: already exists
Copying blob 7d377d49a080 done
Copying config f0b71b1591 done
Writing manifest to image destination
Storing signatures
--> f0b71b1591c
Successfully tagged localhost/asdf:latest
f0b71b1591cf50cf3609494187083741c1021fd99f6168ab8283c4390954cef1

The last line is the hash of the image we just built. We're using buildah to build the image because there's no way to set the layer timestamp with docker (causing the hash to vary). Unfortunately buildah records its version; this image has been built with 1.22.3.

The pre-compiled images can be found on the container registry (also linked in the side-bar on the right). Pull the image with this command:

$ docker pull ghcr.io/kpcyrd/i-probably-didnt-backdoor-this:latest
latest: Pulling from kpcyrd/i-probably-didnt-backdoor-this
50341f5fa632: Already exists
163594b80890: Pull complete
Digest: sha256:11cc7ec2b907a325fa3565039d990a466a7d83a06aa7dffdebba38d495d1571d
Status: Downloaded newer image for ghcr.io/kpcyrd/i-probably-didnt-backdoor-this:latest
ghcr.io/kpcyrd/i-probably-didnt-backdoor-this:latest

You'll notice the hash doesn't seem to match at first, but if everything worked the image id is indeed the same:

$ docker images --no-trunc ghcr.io/kpcyrd/i-probably-didnt-backdoor-this
REPOSITORY                                      TAG       IMAGE ID                                                                  CREATED        SIZE
ghcr.io/kpcyrd/i-probably-didnt-backdoor-this   latest    sha256:f0b71b1591cf50cf3609494187083741c1021fd99f6168ab8283c4390954cef1   51 years ago   9.38MB
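
The two hashes differ because they name different things. Assuming Docker's classic image store, the Digest printed by docker pull is the sha256 of the image manifest, while the IMAGE ID is the sha256 of the image configuration JSON that the manifest points to. The Python sketch below uses tiny made-up stand-ins for both documents (real ones carry more fields) only to show how one image ends up with two different hashes.

```python
import hashlib
import json


def sha256_id(blob: bytes) -> str:
    """Format a sha256 hash the way container tools print it."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


# Made-up, heavily simplified image config; real configs carry more fields.
config = json.dumps(
    {"config": {"Entrypoint": ["/asdf"]}, "created": "1970-01-01T00:00:00Z"}
).encode()
image_id = sha256_id(config)

# Simplified manifest: it references the config by its hash (layers omitted).
manifest = json.dumps({"schemaVersion": 2, "config": {"digest": image_id}}).encode()
digest = sha256_id(manifest)

print("IMAGE ID:", image_id)
print("Digest:  ", digest)
```

The `created` value in the stand-in config is set to the Unix epoch to mirror `--timestamp 0` from the build above, which is presumably also why docker reports the image as created 51 years ago.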

Reproducing the Arch Linux package

There's a custom Arch Linux repository that's distributing a pre-built package:

[i-probably-didnt-backdoor-this]
Server = https://pkgbuild.com/~kpcyrd/$repo/os/$arch/

This package can be reproduced from source; the full writeup can be found in this document.

Notes on security patches

We've pinned very specific versions in multiple places (including the compiler). This is often considered bad style since we're now in charge of keeping all of this updated.

If you're adopting this in your own project you should periodically release new versions, even if you aren't making any changes to the code anymore. This also applies to many modern programming ecosystems these days due to lock files.

The following places need to be updated occasionally, which causes the artifact hashes to change:

  • Dependencies in Cargo.toml/Cargo.lock (if any, cargo update)
  • FROM lines in Dockerfile (docker pull rust:alpine, docker pull alpine:latest)
  • The build image in the Makefile (docker pull rust:alpine)
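
After updating, it's easy to accidentally leave a tag like alpine:latest where a pinned hash should be. As a hedged sketch (not part of this repository), a small Python check along these lines could flag FROM lines that aren't pinned by digest. The function name is hypothetical and the parsing is deliberately simplistic: it would also flag `FROM scratch` or a reference to an earlier build stage.

```python
import re
from typing import List

# A FROM line counts as pinned if the image reference ends in @sha256:<64 hex chars>.
PINNED = re.compile(r"@sha256:[0-9a-f]{64}$")


def unpinned_from_lines(dockerfile_text: str) -> List[str]:
    """Return FROM lines whose image reference is not pinned by digest."""
    unpinned = []
    for line in dockerfile_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            # Skip options such as --platform=... to find the image reference.
            image = next((p for p in parts[1:] if not p.startswith("--")), "")
            if not PINNED.search(image):
                unpinned.append(line.strip())
    return unpinned


if __name__ == "__main__":
    example = (
        "FROM docker.io/rust@sha256:"
        "8463cc29a3187a10fc8bf5200619aadf78091b997b0c3941345332a931c40a64\n"
        "FROM alpine:latest\n"
    )
    for line in unpinned_from_lines(example):
        print("not pinned:", line)
```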

How is this related to Reproducible Builds

There's quite a bit of overlap with the reproducible builds project. The techniques used to rebuild the binary artifacts are only possible because the builds for this project are reproducible.

This project also attempts to exclusively use binaries distributed by high-profile targets like Alpine Linux and the Rust project. This is commonly accepted as "reasonable" in the wider tech industry, but makes their build servers and signing keys extremely valuable.

The reproducible builds effort attempts to reduce this risk by allowing independent parties to "reproduce" their packages with "confirmation rebuilds", just like you did when following the instructions here!

Similar work

Acknowledgments

This project was funded by Google, The Linux Foundation, and people like you and me through GitHub sponsors.

♥️♥️♥️

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.
