Extended verification for git tags

git-evtag

git-evtag can be used as a replacement for git-tag -s. It will generate a strong checksum (called Git-EVTag-v0-SHA512) over the commit, tree, and blobs it references (and recursively over submodules). A primary rationale for this is that the underlying SHA1 algorithm of git is under increasing threat. Further, a goal here is to create a checksum that covers the entire source of a single revision as a replacement for tarballs + checksums.

git-evtag was originally discussed on the git mailing list, long before the SHA1 collision of February 2017.

Getting git-evtag

See also the Node.js implementation.

Using git-evtag

Create a new v2015.10 tag, covering the HEAD revision with GPG signature and Git-EVTag-v0-SHA512:

$ git-evtag sign v2015.10
 ( type your tag message, note a Git-EVTag-v0-SHA512 line in the message )
$ git show v2015.10
 ( Note the Git-EVTag-v0-SHA512 line is covered by the GPG signature )

Verify a tag:

$ git-evtag verify v2015.10
gpg: Signature made Sun 28 Jun 2015 10:49:11 AM EDT
gpg:                using RSA key 0xDC45FD5921C13F0B
gpg: Good signature from "Colin Walters <[email protected]>" [ultimate]
gpg:                 aka "Colin Walters <[email protected]>" [ultimate]
Primary key fingerprint: 1CEC 7A9D F7DA 85AB EF84  3DC0 A866 D7CC AE08 7291
     Subkey fingerprint: AB92 8A9C F8DD 0629 09C3  7BBD DC45 FD59 21C1 3F0B
Successfully verified: Git-EVTag-v0-SHA512: b05f10f9adb0eff352d90938588834508d33fdfcedbcfc332999ee397efa321d1f49a539f1b82f024111a281c1f441002e7f536b06eb04d41857b01636f6f268
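As a rough illustration of where the checksum lives, the sketch below pulls a `Git-EVTag-v0-SHA512:` line out of a tag message. The line format is taken from the output above; the helper name `extract_evtag` and the sample message are hypothetical, and this does not perform GPG verification or recompute the checksum, both of which a real verifier would also need to do.

```python
import re

# Format assumed from the "Git-EVTag-v0-SHA512: <hex>" line shown above:
# a SHA-512 digest is 64 bytes, i.e. 128 lowercase hex characters.
EVTAG_RE = re.compile(r"^Git-EVTag-v0-SHA512: ([0-9a-f]{128})$", re.MULTILINE)


def extract_evtag(tag_message: str):
    """Return the hex checksum embedded in a tag message, or None if absent."""
    match = EVTAG_RE.search(tag_message)
    return match.group(1) if match else None


# Hypothetical tag message, for illustration only.
message = (
    "Release v2015.10\n"
    "\n"
    "Git-EVTag-v0-SHA512: " + "ab" * 64 + "\n"
)
checksum = extract_evtag(message)
```

A verifier would then compare this value against a checksum it recomputes locally from the repository objects.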

Replacing tarballs - i.e. being the primary artifact

This is similar to what project distributors often accomplish by using git archive, or make dist, or similar tools to generate a tarball, and then checksumming that, and (ideally) providing a GPG signature covering it.

Tarball reproducibility

The problem with git archive and make dist is that tarballs (and other archive formats such as zip) are not easily reproducible exactly from a git repository commit. The authors of git reserve the right to change the file format output by git archive in the future. Also, there are a variety of reasons why compressors like gzip and xz aren't necessarily reproducible, such as compression levels, included timestamps, and optimizations in the algorithm. See Pristine tar for some examples of the difficulties involved (e.g. trying to retroactively guess the compression level arguments from the xz dictionary size).

If the checksum is not reproducible, it becomes much more difficult to easily and reliably verify that a generated tarball contains the same source code as a particular git commit.
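The timestamp problem is easy to reproduce. The following sketch, which uses only Python's standard `gzip` and `hashlib` modules, compresses the same bytes twice with different header timestamps; the payload and variable names are made up for illustration.

```python
import gzip
import hashlib

# Arbitrary stand-in for a source file.
payload = b"int main(void) { return 0; }\n" * 100


def digest(data: bytes) -> str:
    """Hex SHA-256 of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Same input, but a different modification time stored in the gzip header.
archive_a = gzip.compress(payload, mtime=0)
archive_b = gzip.compress(payload, mtime=1)

# The compressed bytes (and therefore their checksums) differ...
same_bytes = archive_a == archive_b
# ...even though both decompress to identical content.
same_content = gzip.decompress(archive_a) == gzip.decompress(archive_b)

print(digest(archive_a))
print(digest(archive_b))
```

Compression level and compressor version can cause the same kind of divergence, which is why a checksum over the compressed file says little about whether two tarballs hold the same source.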

What git-evtag implements is an algorithm for providing a strong checksum over the complete source objects for the target commit (+ trees + blobs + submodules). Then it's integrated with GPG for end-to-end verification. (One could also wrap the checksum in X.509 or some other public/private key signature solution.)

Then no out-of-band distribution mechanism is necessary and, better still, the checksum strengthens the ability to verify the integrity of the git repository.

(And if you want to avoid downloading the entire history, that's what git clone --depth=1 is for.)

Git and SHA1

NEW! The first SHA1 collision was announced on February 23, 2017.

Git uses a modified Merkle tree with SHA1, which means that if an attacker managed to create a SHA1 collision for a source file object (git blob), it would affect all revisions and checkouts - invalidating the security of all GPG signed tags whose commits point to that object.
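To make the dependence on SHA1 concrete, here is a minimal sketch of how git derives a blob's object ID: SHA1 over the header `blob <length>`, a NUL byte, and the raw content. The function name `git_blob_id` is hypothetical; the two printed IDs are the well-known values `git hash-object` reports for the same inputs.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """SHA1 over '<type> <length>' + NUL + content, as git does for blobs."""
    header = b"blob " + str(len(content)).encode("ascii")
    return hashlib.sha1(header + b"\0" + content).hexdigest()


print(git_blob_id(b""))               # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
print(git_blob_id(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Trees and commits refer to their children only by these IDs, so two different blobs with the same SHA1 would be indistinguishable to every tree, commit, and signed tag above them.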

Now, the author of this tool believes that today, GPG signed git tags are fairly secure, especially if one is careful to ensure transport integrity (e.g. pinned TLS certificates from the origin).

The Git-EVTag algorithm (v0)

There is currently only one version of the Git-EVTag algorithm, called v0 - and it only supports SHA-512. It is declared stable. All further text refers to this version of the algorithm. In the unlikely event that it is necessary to introduce a new version, this tool will support all known versions.

Git-EVTag-v0-SHA512 covers the complete contents of all objects for a commit; again similar to checksumming git archive, except reproducible. Each object is added to the checksum in its raw canonicalized form, including the header.

For a given commit (in Rust-style pseudocode):

fn git_evtag(repo: &GitRepo, commitid: &str) -> SHA512 {
    let mut checksum = SHA512::new();
    walk_commit(repo, &mut checksum, commitid);
    checksum
}

fn walk_commit(repo: &GitRepo, checksum: &mut SHA512, commitid: &str) {
    // First, add the commit object itself
    checksum_object(repo, checksum, commitid);
    let treeid = repo.load_commit(commitid).treeid();
    walk(repo, checksum, &treeid);
}

fn checksum_object(repo: &GitRepo, checksum: &mut SHA512, objid: &str) {
    // This is the canonical header of the object; <typename> <length (ascii base 10)>
    // https://git-scm.com/book/en/v2/Git-Internals-Git-Objects#Object-Storage
    let header: &str = repo.load_object_header(objid);
    // The NUL byte after the header, explicitly included in the checksum
    let nul = [0u8];
    // The remaining raw content of the object as a byte array
    let body: &[u8] = repo.load_object_body(objid);

    checksum.update(header.as_bytes());
    checksum.update(&nul);
    checksum.update(body);
}

fn walk(repo: &GitRepo, checksum: &mut SHA512, treeid: &str) {
    // First, add the tree object itself
    checksum_object(repo, checksum, treeid);
    let tree = repo.load_tree(treeid);
    for child in tree.children() {
        match child {
            Blob(blobid) => checksum_object(repo, checksum, &blobid),
            Tree(child_treeid) => walk(repo, checksum, &child_treeid),
            // A commit entry in a tree is a submodule; recurse into its repository
            Commit(commitid, path) => {
                let child_repo = repo.get_submodule(&path);
                walk_commit(&child_repo, checksum, &commitid);
            }
        }
    }
}
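The pseudocode above can be modeled in a few lines of runnable Python. This is a sketch under stated assumptions, not the tool's implementation: objects live in an in-memory dict (`store`, populated by a hypothetical `put` helper) rather than a real repository, submodule commits are looked up in that same dict instead of a separate child repository, and the result has not been checked against the output of git-evtag itself.

```python
import hashlib


def put(store, objtype: bytes, body: bytes) -> str:
    """Hypothetical helper: add an object to the in-memory store, keyed by its git SHA1."""
    header = objtype + b" " + str(len(body)).encode("ascii")
    objid = hashlib.sha1(header + b"\0" + body).hexdigest()
    store[objid] = (objtype, body)
    return objid


def parse_tree(body: bytes):
    """Yield (mode, hex_id) per entry of a raw tree: '<mode> <name>' NUL <20-byte id>."""
    pos = 0
    while pos < len(body):
        space = body.index(b" ", pos)
        nul = body.index(b"\0", space)
        yield body[pos:space], body[nul + 1:nul + 21].hex()
        pos = nul + 21


def checksum_object(store, checksum, objid):
    """Feed '<type> <length>', a NUL byte, and the raw body into the checksum."""
    objtype, body = store[objid]
    checksum.update(objtype + b" " + str(len(body)).encode("ascii"))
    checksum.update(b"\0")
    checksum.update(body)


def walk_tree(store, checksum, treeid):
    checksum_object(store, checksum, treeid)  # the tree object itself comes first
    for mode, child in parse_tree(store[treeid][1]):
        if mode == b"40000":        # subdirectory
            walk_tree(store, checksum, child)
        elif mode == b"160000":     # submodule commit (simplified: same store)
            walk_commit(store, checksum, child)
        else:                       # regular file, executable, or symlink blob
            checksum_object(store, checksum, child)


def walk_commit(store, checksum, commitid):
    checksum_object(store, checksum, commitid)
    # The first line of a raw commit body is "tree <hex id>".
    first_line = store[commitid][1].split(b"\n", 1)[0]
    walk_tree(store, checksum, first_line.split(b" ", 1)[1].decode("ascii"))


def git_evtag(store, commitid) -> str:
    checksum = hashlib.sha512()
    walk_commit(store, checksum, commitid)
    return checksum.hexdigest()


# Build a one-file commit and checksum it.
store = {}
blob = put(store, b"blob", b"hello world\n")
tree = put(store, b"tree", b"100644 hello.txt\0" + bytes.fromhex(blob))
commit = put(store, b"commit", b"tree " + tree.encode("ascii") + b"\n\ninitial\n")
print("Git-EVTag-v0-SHA512:", git_evtag(store, commit))
```

For this single-file commit the objects are hashed in the order commit, tree, blob, which matches the traversal order of the pseudocode.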

This strong checksum can be verified reproducibly offline after cloning a git repository for a particular tag. When covered by a GPG signature, it provides a strong end-to-end integrity guarantee.

It's quite inexpensive and practical to compute Git-EVTag-v0-SHA512 once per tag/release creation. At the time of this writing, on the Linux kernel (a large project by most standards), it takes about 5 seconds to compute on this author's laptop. On most smaller projects, it's completely negligible.

Aside: other aspects of tarballs

This project is just addressing one small part of the larger git/tarball question. Anything else is out of scope, but a brief discussion of other aspects is included below.

Historically, many projects include additional content in tarballs. For example, the GNU Autotools pregenerate a configure script from configure.ac and the like. Other projects don't include translations in git, but merge them out of band when generating tarballs.

There are many other things like this, and they all harm reproducibility and continuous integration/delivery.

For example, while many of my projects use Autotools, I simply have downstream authors run autogen.sh. It works just fine - the autotools are no longer changing often, and many downstreams want to do it anyways.

For the translation issue, note that bad translations can actually crash one's application. If they're part of the git repository, they can be more easily tested as a unit continuously.
