a file system for mounting container images

composefs

The composefs project combines several underlying Linux features to provide a very flexible mechanism to support read-only mountable filesystem trees, stacking on top of an underlying "lower" Linux filesystem.

The key technologies composefs uses are:

  • overlayfs as the kernel interface
  • EROFS for a mountable metadata tree
  • fs-verity (optional) from the lower filesystem

The manner in which these technologies are combined is important. First, to emphasize: composefs does not store any persistent data itself. The underlying metadata and data files must be stored in a valid "lower" Linux filesystem. On most systems, this will be a traditional writable persistent Linux filesystem such as ext4, xfs, or btrfs.

Separation between metadata and data

A key aspect of the way composefs works is that it's designed to store "data" (i.e. non-empty regular files) distinct from "metadata" (i.e. everything else).

composefs reads and writes a filesystem image that is really just an EROFS image, which today is loopback mounted.

However, this EROFS filesystem tree is just metadata; the underlying non-empty data files can be shared in a distinct "backing store" directory. The EROFS filesystem includes trusted.overlay.redirect extended attributes which tell the overlayfs mount how to find the real underlying files.

Mounting multiple composefs with a shared backing store

The key targeted use case for composefs is versioned, immutable executable filesystem trees (i.e. container images and bootable host systems), where some of these filesystems may share parts of their storage (i.e. some files may be different, but not all).

Composefs ships with a mount helper that allows you to easily mount images by passing the image filename and the base directory for the content files, like this:

# mount -t composefs /path/to/image -o basedir=/path/to/content /mnt

By storing the files content-addressed (e.g. using the hash of the content to name the file) shared files need only be stored once, yet can appear in multiple mounts.
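The deduplication effect of content addressing can be illustrated with a small sketch. It is not how mkcomposefs lays out its store: the `store_object` and `count_objects` helpers, the two-level directory layout, and the use of a plain SHA-256 of the file contents as the name are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def store_object(basedir: str, content: bytes) -> str:
    """Store content under a name derived from its hash.

    Returns the object's path relative to basedir. The layout
    (first two hex digits as a subdirectory) is hypothetical.
    """
    digest = hashlib.sha256(content).hexdigest()
    relpath = os.path.join(digest[:2], digest[2:])
    fullpath = os.path.join(basedir, relpath)
    # Identical content maps to the same name, so it is written only once.
    if not os.path.exists(fullpath):
        os.makedirs(os.path.dirname(fullpath), exist_ok=True)
        with open(fullpath, "wb") as f:
            f.write(content)
    return relpath


def count_objects(basedir: str) -> int:
    """Count the regular files stored below basedir."""
    return sum(len(files) for _, _, files in os.walk(basedir))


with tempfile.TemporaryDirectory() as basedir:
    # The same script appears in two different images.
    a = store_object(basedir, b"#!/bin/sh\necho hello\n")
    b = store_object(basedir, b"#!/bin/sh\necho hello\n")
    # A file with different content gets its own object.
    c = store_object(basedir, b"different content\n")
    stored = count_objects(basedir)
```

Here three files were added but only two objects exist on disk, because the first two share a name derived from their identical content.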

Backing store shared on disk and in page cache

A crucial advantage of composefs in contrast to other approaches is that data files are shared in the page cache.

This allows launching multiple container images that will reliably share memory.

Filesystem integrity

Composefs also supports fs-verity validation of the content files. When using this, the digest of the content files is stored in the image, and composefs will validate that the content file it uses has a matching enabled fs-verity digest. This means that the backing content cannot be changed in any way (by mistake or by malice) without this being detected when the file is used.
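The check described above can be modelled roughly as follows. This is only a conceptual sketch: real fs-verity digests are computed by the kernel over a Merkle tree of the file, whereas the hypothetical `digest_of` below uses a plain SHA-256 as a stand-in, and the dictionaries stand in for the image metadata and the backing store.

```python
import hashlib


class DigestMismatch(Exception):
    """Raised when a backing file does not match the digest in the image."""


def digest_of(content: bytes) -> str:
    # Stand-in for an fs-verity digest (assumption for this sketch).
    return hashlib.sha256(content).hexdigest()


def open_verified(expected: dict, backing: dict, path: str) -> bytes:
    """Return the file content only if it matches the recorded digest."""
    content = backing[path]
    if digest_of(content) != expected[path]:
        raise DigestMismatch(path)
    return content


original = b"trusted binary"
# The image records the expected digest of each content file.
expected = {"usr/bin/tool": digest_of(original)}
backing = {"usr/bin/tool": original}

ok = open_verified(expected, backing, "usr/bin/tool")

# Any change to the backing content is detected when the file is used.
backing["usr/bin/tool"] = b"tampered binary"
try:
    open_verified(expected, backing, "usr/bin/tool")
    tamper_detected = False
except DigestMismatch:
    tamper_detected = True
```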

You can also use fs-verity on the image file itself, and pass the expected fs-verity digest as a mount option, which composefs will validate. In this case we have full trust of both data and metadata of the mounted file. This solves a weakness that fs-verity has when used on its own, in that it can only verify file data, not metadata.

Usecase: container images

There are multiple container image systems; for those using e.g. OCI a common approach (implemented by both docker and podman for example) is to just untar each layer by itself, and then use overlayfs to stitch them together at runtime. This is a partial inspiration for composefs; notably this approach does ensure that identical layers are shared.

However, if we instead store the file content in a content-addressed fashion, we can then generate a composefs file for each layer and continue to mount them with a chain of overlayfs, or we can generate a single composefs for the final merged filesystem tree.

This allows sharing of content files between images, even if the metadata (like the timestamps or file ownership) vary between images.
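To make the distinction concrete, the sketch below models two images whose metadata differs while part of their content is identical. The `Entry` record, the `build_image` helper, and the use of SHA-256 as the object name are hypothetical and only illustrate the idea; they do not reflect the composefs on-disk format.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """Per-image metadata for one file, plus a reference to its content."""
    mode: int
    uid: int
    mtime: int
    object_id: str


def object_id(content: bytes) -> str:
    # Content-derived name of the backing object (assumption for this sketch).
    return hashlib.sha256(content).hexdigest()


def build_image(files: dict, uid: int, mtime: int) -> dict:
    """Build a metadata-only image: path -> Entry pointing at an object."""
    return {
        path: Entry(0o644, uid, mtime, object_id(content))
        for path, content in files.items()
    }


files_a = {"etc/motd": b"welcome\n", "usr/lib/libfoo.so": b"\x7fELF-foo"}
files_b = {"etc/motd": b"hello\n", "usr/lib/libfoo.so": b"\x7fELF-foo"}

# The two images differ in ownership and timestamps.
image_a = build_image(files_a, uid=0, mtime=1_700_000_000)
image_b = build_image(files_b, uid=1000, mtime=1_800_000_000)

objects_a = {e.object_id for e in image_a.values()}
objects_b = {e.object_id for e in image_b.values()}
shared = objects_a & objects_b        # objects both images point at
total_needed = objects_a | objects_b  # objects the store must hold
```

The library has different metadata in each image, yet both entries reference the same object, so the store holds three objects rather than four.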

Together with something like zstd:chunked this will speed up pulling container images and make them available for usage, without the need to even create these files if already present!

Usecase: Bootable host systems (e.g. OSTree)

OSTree already uses a content-addressed object store. However, normally this has to be checked out into a regular directory (using hardlinks into the object store for regular files). This directory is then bind-mounted as the rootfs when the system boots.

OSTree already supports enabling fs-verity on the files in the store, but nothing can protect against changes to the checkout directories. A malicious user can add, remove or replace files there. We want to use composefs to avoid this.

Instead of checking out to a directory we generate a composefs image pointing into the object store and mount that as the root fs. We can then enable fs-verity of the composefs image and embed the digest of that in the kernel commandline which specifies the rootfs. Since composefs generation is reproducible, we can even verify that the composefs image we generated is correct by comparing its digest to one in the ostree metadata that was generated when the ostree image was built.
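The role of reproducibility in this comparison can be sketched as follows. The example assumes a toy image format (canonical JSON) and a plain SHA-256 digest; real composefs images are EROFS and the digest is an fs-verity digest, so `build_image_blob` and `image_digest` are purely illustrative names.

```python
import hashlib
import json


def build_image_blob(entries: dict) -> bytes:
    """Serialize the tree deterministically (toy format, not EROFS).

    Sorting keys and fixing separators means the output depends only
    on the tree's contents, not on the order entries were added.
    """
    return json.dumps(entries, sort_keys=True, separators=(",", ":")).encode("utf-8")


def image_digest(entries: dict) -> str:
    return hashlib.sha256(build_image_blob(entries)).hexdigest()


# Tree as seen when the image was built.
build_time = {
    "usr/bin/tool": {"mode": 0o755, "object": "ab/cdef"},
    "etc/motd": {"mode": 0o644, "object": "12/3456"},
}

# Same tree regenerated on the client, with entries in a different order.
client_side = {
    "etc/motd": {"object": "12/3456", "mode": 0o644},
    "usr/bin/tool": {"object": "ab/cdef", "mode": 0o755},
}

# A tree in which one file points at different content.
tampered = {**client_side, "etc/motd": {"mode": 0o644, "object": "ff/ffff"}}

digests_match = image_digest(build_time) == image_digest(client_side)
tamper_changes_digest = image_digest(tampered) != image_digest(build_time)
```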

For more information on ostree and composefs, see this tracking issue.

Tools

Composefs installs two main tools:

  • mkcomposefs: Creates a composefs image given a directory pathname. Can also compute digests and create a content store directory.
  • mount.composefs: A mount helper that supports mounting composefs images.

Mounting a composefs image

The mount.composefs helper allows you to mount composefs images (of both types).

The basic use is:

# mount -t composefs /path/to/image.cfs -o basedir=/path/to/datafiles /mnt

The default behaviour for fs-verity is that any image file that specifies an expected digest needs the backing file to match that fs-verity digest, at least if this is supported in the kernel. This can be modified with the verity and noverity options.

Mount options:

  • basedir: is the directory to use as a base when resolving relative content paths.
  • verity: All image files must specify a fs-verity image.
  • noverity: Don't verify fs-verity digests (useful for example if fs-verity is not supported on basedir).
  • digest: A fs-verity sha256 digest that the image file must match. If set, verity_check defaults to 2.
  • signed: The image file must contain an fs-verity signature.
  • upperdir: Specify an upperdir for the overlayfs filesystem.
  • workdir: Specify a workdir for the overlayfs filesystem.
  • idmap: Specify a path to a user namespace that is used as an idmap.
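As a rough illustration of how these options combine on the command line, the following sketch assembles a mount invocation from some of the options listed above. The `composefs_mount_argv` helper is hypothetical and not part of composefs; rejecting `verity` together with `noverity` is a choice made for this sketch, and paths containing commas are not handled.

```python
import shlex


def composefs_mount_argv(image, mountpoint, basedir, digest=None,
                         verity=False, noverity=False,
                         upperdir=None, workdir=None):
    """Build the argument vector for `mount -t composefs` (sketch only)."""
    if verity and noverity:
        # Contradictory request; refused here as a design choice of the sketch.
        raise ValueError("verity and noverity cannot both be set")

    opts = [f"basedir={basedir}"]
    if digest is not None:
        opts.append(f"digest={digest}")
    if verity:
        opts.append("verity")
    if noverity:
        opts.append("noverity")
    if upperdir is not None:
        opts.append(f"upperdir={upperdir}")
    if workdir is not None:
        opts.append(f"workdir={workdir}")

    return ["mount", "-t", "composefs", image, "-o", ",".join(opts), mountpoint]


argv = composefs_mount_argv(
    "/path/to/image.cfs", "/mnt", "/path/to/datafiles", verity=True
)
# Shell-quoted form, suitable for logging or copy and paste.
command_line = shlex.join(argv)
```

Returning a list rather than a single string means the result can be passed directly to `subprocess.run` without going through a shell.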

Experimental user space tools

The directory tools/ contains some experimental user space tools to work with composefs images.

  • composefs-from-json: convert from a CRFS metadata file to the binary blob.
  • ostree-convert-commit.py: converts an OSTree commit into a CRFS config file that writer-json can use.
