  • Stars: 1,742
  • Rank 25,705 (Top 0.6 %)
  • Language: OCaml
  • License: ISC License
  • Created about 11 years ago
  • Updated 8 months ago

Repository Details

Irmin is a distributed database that follows the same design principles as Git
A Distributed Database Built on the Same Principles as Git

Irmin is an OCaml library for building mergeable, branchable distributed data stores.

Irmin is based on distributed version-control systems (DVCS), which are used extensively in software development to track the provenance of changes and expose modifications in source code. Irmin applies DVCS principles to large-scale distributed data and exposes functions similar to Git's (clone, push, pull, branch, rebase). It is highly customizable: users can define their own types to store application-specific values and define custom storage layers (in memory, on disk, in a remote Redis database, in the browser, etc.). The Git workflow was initially designed for humans to manage changes within source code. Irmin scales this to handle automated programs performing a very high number of operations per second, with fully automated handling of update conflicts. Finally, Irmin exposes an event-driven API to define programmable dynamic behaviours and to program distributed dataflow pipelines.
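
To make the branching model above concrete, here is a minimal sketch of a store whose branches are named pointers to immutable snapshots. It uses only the OCaml standard library and is not Irmin's API: the names `repo`, `fork`, `set` and `get` are hypothetical and chosen for illustration only.

```ocaml
(* Hypothetical sketch, not Irmin's API: branches are named pointers to
   immutable snapshots, built only on the OCaml standard library. *)
module StrMap = Map.Make (String)

(* A snapshot is an immutable map from keys to values. *)
type snapshot = string StrMap.t

(* A repository maps branch names to their current snapshot. *)
type repo = { mutable branches : snapshot StrMap.t }

let create () = { branches = StrMap.singleton "main" StrMap.empty }

(* Snapshot a branch points to (empty if the branch is unknown). *)
let head repo name =
  Option.value (StrMap.find_opt name repo.branches) ~default:StrMap.empty

(* Writing a key builds a new snapshot and moves the branch pointer. *)
let set repo ~branch key value =
  let snap = StrMap.add key value (head repo branch) in
  repo.branches <- StrMap.add branch snap repo.branches

(* Forking a branch only copies a pointer; no data is duplicated. *)
let fork repo ~src ~dst =
  repo.branches <- StrMap.add dst (head repo src) repo.branches

let get repo ~branch key = StrMap.find_opt key (head repo branch)

let () =
  let repo = create () in
  set repo ~branch:"main" "foo/bar" "v1";
  fork repo ~src:"main" ~dst:"dev";
  set repo ~branch:"dev" "foo/bar" "v2";
  (* The two branches have now diverged. *)
  assert (get repo ~branch:"main" "foo/bar" = Some "v1");
  assert (get repo ~branch:"dev" "foo/bar" = Some "v2")
```

Because snapshots are persistent maps, a fork is cheap and later writes on one branch never affect the other. This is the property that makes Git-style workflows practical for data; Irmin's real stores add history, hashing and synchronisation on top of this idea.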

Irmin was created at the University of Cambridge in 2013 to be the default storage layer for MirageOS applications (both to store and orchestrate unikernel binaries and the data that these unikernels are using). As such, Irmin is not, strictly speaking, a complete database engine. Instead, similarly to other MirageOS components, it is a collection of libraries designed to solve different flavours of the challenges raised by the CAP Theorem. Each application can select the right combination of libraries to solve its particular distributed problem.

Irmin consists of a core of well-defined low-level data structures that specify how data should be persisted and be shared across nodes. It defines algorithms for efficient synchronization of those distributed low-level constructs. It also builds a collection of higher-level data structures that developers can use without knowing precisely how Irmin works underneath. Some of these components even have a formal semantics, including Conflict-free Replicated Data-Types (CRDT). Since it's a part of MirageOS, Irmin does not make strong assumptions about the OS environment that it runs in. This makes the system very portable: it works well for in-memory databases and slower persistent serialization such as SSDs, hard drives, web browser local storage, or even the Git file format.

Irmin is primarily developed and maintained by Tarides, with contributions from many people at various organizations. External maintainers and contributors are welcome.

Features

  • Built-in Snapshotting - backup and restore
  • Storage Agnostic - you can use Irmin on top of your own storage layer
  • Custom Datatypes - (de)serialization for custom data types, derivable via ppx_irmin
  • Highly Portable - runs anywhere from Linux to web browsers and Xen unikernels
  • Git Compatibility - irmin-git uses an on-disk format that can be inspected and modified using Git
  • Dynamic Behavior - allows the users to define custom merge functions, use in-memory transactions (to keep track of reads as well as writes) and to define event-driven workflows using a notification mechanism
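
As an illustration of what a custom merge function can look like, the sketch below defines a three-way merge over plain OCaml values. It is a simplified stand-in written against the standard library only, not Irmin's merge API: the `merge` type and the functions `merge_counter` and `merge_default` are hypothetical names.

```ocaml
(* Hypothetical sketch, not Irmin's API: a three-way merge function takes the
   common ancestor [old] and the two diverged values, and returns either the
   merged value or a conflict message. *)
type 'a merge = old:'a -> 'a -> 'a -> ('a, string) result

(* Counters merge without conflicts: apply both deltas to the ancestor. *)
let merge_counter : int merge = fun ~old a b -> Ok (a + b - old)

(* Plain values merge only when at most one side changed. *)
let merge_default : string merge =
 fun ~old a b ->
  if a = b then Ok a
  else if a = old then Ok b
  else if b = old then Ok a
  else Error (Printf.sprintf "conflict: %S vs %S" a b)

let () =
  (* Ancestor 10; one replica added 5, the other added 2. *)
  assert (merge_counter ~old:10 15 12 = Ok 17);
  (* Only one side changed, so its value wins. *)
  assert (merge_default ~old:"x" "x" "y" = Ok "y");
  (* Both sides changed to different values: report a conflict. *)
  assert (Result.is_error (merge_default ~old:"x" "y" "z"))
```

The counter case shows why application-specific types matter: a generic merge would report a conflict whenever both replicas changed the value, while a counter-aware merge can always combine the two updates.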

Documentation

API documentation can be found online at https://mirage.github.io/irmin

Installation

Prerequisites

Please make sure that the minimum required versions of opam and OCaml are installed. Find the latest version and install instructions on ocaml.org.

To install Irmin with the command-line tool and all unix backends using opam:

  opam install irmin-cli

A minimal installation containing the reference in-memory backend can be installed by running:

  opam install irmin

The following packages are available on opam:

  • irmin - the base package, plus an in-memory storage implementation
  • irmin-chunk - chunked storage
  • irmin-cli - a simple command-line tool
  • irmin-fs - filesystem-based storage using bin_prot
  • irmin-git - Git compatible storage
  • irmin-graphql - GraphQL server
  • irmin-http - a simple REST interface
  • irmin-mirage - mirage compatibility
  • irmin-mirage-git - Git compatible storage for mirage
  • irmin-mirage-graphql - mirage compatible GraphQL server
  • irmin-pack - compressed, on-disk, posix backend
  • ppx_irmin - PPX deriver for Irmin content types (see README_PPX.md)
  • irmin-containers - collection of simple, ready-to-use mergeable data structures

To install a specific package, simply run:

  opam install <package-name>

Development Version

To install the development version of Irmin in your current opam switch, clone this repository and opam install the packages inside:

  git clone https://github.com/mirage/irmin
  cd irmin/
  opam install .

Usage

Example

Below is a simple example of setting a key and getting the value out of a Git-based, filesystem-backed store.

open Lwt.Syntax

(* Irmin store with string contents *)
module Store = Irmin_git_unix.FS.KV (Irmin.Contents.String)

(* Database configuration *)
let config = Irmin_git.config ~bare:true "/tmp/irmin/test"

(* Commit author *)
let author = "Example <[email protected]>"

(* Commit information *)
let info fmt = Irmin_git_unix.info ~author fmt

let main =
  (* Open the repo *)
  let* repo = Store.Repo.v config in

  (* Load the main branch *)
  let* t = Store.main repo in

  (* Set key "foo/bar" to "testing 123" *)
  let* () =
    Store.set_exn t ~info:(info "Updating foo/bar") [ "foo"; "bar" ]
      "testing 123"
  in

  (* Get key "foo/bar" and print it to stdout *)
  let+ x = Store.get t [ "foo"; "bar" ] in
  Printf.printf "foo/bar => '%s'\n" x

(* Run the program *)
let () = Lwt_main.run main

The example is contained in examples/readme.ml. It can be compiled and executed with dune:

$ dune build examples/readme.exe
$ dune exec examples/readme.exe
foo/bar => 'testing 123'

The examples directory also contains more advanced examples, which can be executed in the same way.

Command-line

The same thing can also be accomplished using irmin, the command-line application installed with irmin-cli, by running:

$ echo "root: ." > irmin.yml
$ irmin init
$ irmin set foo/bar "testing 123"
$ irmin get foo/bar
testing 123

irmin.yml allows for irmin flags to be set on a per-directory basis. You can also set flags globally using $HOME/.irmin/config.yml. Run irmin help irmin.yml for further details.

Also see irmin --help for a list of all commands, and either irmin <command> --help or irmin help <command> for more help with a specific command.

Context

Irmin's initial design is directly inspired by XenStore, with:

  • the need for efficient optimistic concurrency control features to let thousands of virtual machines concurrently access and modify a central configuration database (the Xen stack uses XenStore as an RPC mechanism to set up VM configuration on boot). Very early on, the initial focus was to specify and handle the potential conflicts that arise when the optimistic assumptions do not hold.
  • the need for a convenient way to debug and audit possible issues that might happen in that system. Our initial experiments showed that it was possible to design a reliable system using Git as a backend to persist configuration data reliably (to safely restart after a crash), while making system debugging easy and keeping the system fast, thanks to an efficient merging strategy.
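
The optimistic concurrency control mentioned above can be sketched with a test-and-set primitive: a write is applied only if the value the writer last saw is still current, and is retried otherwise. The code below is a single-process illustration on an in-memory table using only the standard library. It is not the API of Irmin or XenStore, and the key "vm/1/state" and the function names are hypothetical.

```ocaml
(* Hypothetical sketch, not Irmin's or XenStore's API: optimistic concurrency
   control with a test-and-set primitive on an in-memory table. *)
let store : (string, string) Hashtbl.t = Hashtbl.create 16

(* Write [set] under [key] only if the current value still equals [test].
   [None] stands for "no binding". Returns whether the write was applied. *)
let test_and_set key ~test ~set =
  if Hashtbl.find_opt store key <> test then false
  else begin
    (match set with
     | Some v -> Hashtbl.replace store key v
     | None -> Hashtbl.remove store key);
    true
  end

(* Optimistic update: read, compute, try to commit, and retry on conflict. *)
let rec update key f =
  let seen = Hashtbl.find_opt store key in
  if not (test_and_set key ~test:seen ~set:(f seen)) then update key f

let () =
  assert (test_and_set "vm/1/state" ~test:None ~set:(Some "booting"));
  (* A writer holding a stale view is rejected instead of overwriting. *)
  assert (not (test_and_set "vm/1/state" ~test:None ~set:(Some "halted")));
  update "vm/1/state" (fun _ -> Some "running");
  assert (Hashtbl.find_opt store "vm/1/state" = Some "running")
```

No lock is held between the read and the write, so many clients can proceed in parallel, and only those that actually raced with another writer pay the cost of a retry or a merge.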

In 2014, the first release of Irmin was announced as part of the MirageOS 2.0 release here. Since then, several projects have started using and improving Irmin. These can roughly be split into 3 categories: (i) using Irmin as a portable, structured key-value store (with expressive, mergeable types); (ii) using Irmin as a distributed database (with customizable consistency semantics); and (iii) using Irmin as an event-driven dataflow engine.

Irmin as a portable and efficient structured key-value store

  • XenStored is an information storage space shared between all the Xen virtual machines running on the same host. Each virtual machine gets its own path in the store. When values are changed in the store, the appropriate drivers are notified. The initial OCaml implementation was later extended to use Irmin here. More details here.
  • Jitsu is an experimental orchestrator for unikernels. It uses Irmin to store the unikernel configuration (and manage dynamic DNS entries). See more details here.
  • Cuekeeper is a web-based GTD (a fancy TODO list) that runs entirely in the browser. It uses Irmin in the browser to store data locally, with support for structured concurrent editing and snapshot export and import. More details here.
  • Canopy and Unipi both use Irmin to serve static websites pulled from Git repositories and deployed as unikernels.
  • Caldav is using Irmin to store calendar entries and back them into a Git repository. More information here.
  • Datakit was developed at Docker and provided a 9p interface to the Irmin API. It was used to manage the configuration of Docker for Desktop, with merge policies on upgrade, full auditing, and snapshot/rollback capabilities.
  • Tezos started using Irmin in 2017 to store the ledger state. The first prototype used irmin-git before switching to irmin-lmdb and irmin-leveldb (and now irmin-pack). More details here.

Irmin as a distributed store

  • An IMAP server using Irmin to store emails. More details here. The goal of that project was both to use Irmin to store emails (so using Irmin as a local key-value store) and to experiment with replacing the IMAP on-wire protocol with an explicit Git push/pull mechanism.
  • irmin-ARP uses Irmin to store and audit ARP configuration. It uses Irmin as a local key-value store for very low-level information (which is normally stored very deep in the kernel layers), but the main goal was really to replace the broadcasting on-wire protocol with point-to-point pull/push synchronisation primitives, with a full audit log of ARP operations over a network. More details here.
  • Banyan uses Irmin to implement a distributed cache over a geo-replicated cluster. It's using Cassandra as a storage backend. More information here.
  • irmin-fdb implements an Irmin store backed by FoundationDB. More details here.

Irmin as a dataflow scheduler

  • Datakit CI is a continuous integration service that monitors GitHub projects and tests each branch, tag and pull request. It displays the test results as status indicators in the GitHub UI. It keeps all of its state and logs in DataKit, rather than a traditional relational database, allowing review with the usual Git tools. The core of the project is a scheduler that manages dataflow pipelines across Git repositories. It was used for a few years as the CI system to test Docker for Desktop on bare-metal and virtual machines, as well as all the new opam package submissions to ocaml/opam-repository. More details here.
  • Causal RPC implements an RPC framework using Irmin as a network substrate. More details here.
  • CISO is an experimental (distributed) Continuous Integration engine for OPAM. It was designed as a replacement of Datakit-CI and finally turned into ocurrent.

Issues

Feel free to report any issues using the GitHub bugtracker.

License

See the LICENSE file.

Acknowledgements

Development of Irmin was supported in part by the EU FP7 User-Centric Networking project, Grant No. 611001.

More Repositories

1

mirage

MirageOS is a library operating system that constructs unikernels
OCaml
2,417
star
2

ocaml-cohttp

An OCaml library for HTTP clients and servers using Lwt or Async
OCaml
644
star
3

alcotest

A lightweight and colourful test framework
OCaml
413
star
4

ocaml-git

Pure OCaml Git format and protocol
OCaml
349
star
5

mirage-tcpip

TCP/IP networking stack in pure OCaml, using the Mirage platform libraries. Includes IPv4/6, ICMP, and UDP/TCP support.
OCaml
321
star
6

jitsu

A DNS server that automatically starts unikernels on demand
OCaml
308
star
7

mirage-skeleton

Examples of simple MirageOS apps
OCaml
210
star
8

qubes-mirage-firewall

A Mirage firewall VM for QubesOS
OCaml
201
star
9

mirage-www

Website infrastructure and content for mirage.io
HTML
162
star
10

decompress

Pure OCaml implementation of Zlib.
OCaml
116
star
11

ocaml-cow

Caml on the Web (COW) is a set of parsers and syntax extensions to let you manipulate HTML, CSS, XML, JSON and Markdown directly from OCaml code.
OCaml
105
star
12

ocaml-cstruct

Map OCaml arrays onto C-like structs
OCaml
103
star
13

awa-ssh

Purely functional SSH library in ocaml.
OCaml
103
star
14

ocaml-dns

OCaml implementation of the DNS protocol
OCaml
102
star
15

ocaml-github

GitHub APIv3 OCaml bindings
OCaml
99
star
16

ocaml-solo5

Freestanding OCaml runtime
C
98
star
17

capnp-rpc

Cap'n Proto RPC implementation
OCaml
95
star
18

ocaml-uri

RFC3986 URI parsing library for OCaml
OCaml
93
star
19

ocaml-rpc

Light library to deal with RPCs in OCaml
OCaml
93
star
20

digestif

Simple hash algorithms in OCaml
OCaml
85
star
21

ocaml-conduit

Dereference URIs into communication channels for Async or Lwt
OCaml
84
star
22

mirage-platform

Archived, see https://github.com/mirage/mirage/issues/1159 for details. Old: Core platform libraries for Mirage (UNIX and Xen). This provides the `OS` library which handles timers, device setup and the main loop, as well as the runtime for the Xen unikernel.
C
77
star
23

mirage-crypto

Cryptographic primitives for OCaml, in OCaml (also used in MirageOS)
C
73
star
24

xen

Unofficial mirror of xenbits.xen.org/xen.git
C
72
star
25

ocaml-crunch

Convert a filesystem into a static OCaml module
OCaml
70
star
26

mini-os

Mirror of the Xen MiniOS Git from git://xenbits.xen.org/mini-os.git
C
64
star
27

functoria

A DSL to invoke otherworldly functors
OCaml
63
star
28

ocaml-9p

An OCaml/Mirage-friendly implementation of the 9P protocol
OCaml
61
star
29

mirage-qubes

Mirage support for writing QubesOS AppVM unikernels
OCaml
60
star
30

xen-arm-builder

Archived - the Xen and ARM support in MirageOS has been superseded by our PVH support - Build an SDcard image for Xen/ARM, for a Cubieboard
Shell
57
star
31

charrua

A DHCP library in OCaml
OCaml
55
star
32

orm

Object Relational Mapper extension
OCaml
54
star
33

eqaf

Constant time equal function to avoid timing attacks in OCaml
OCaml
50
star
34

ke

Fast implementation of queue in OCaml
HTML
49
star
35

ocaml-matrix

Implementation of a matrix server in OCaml for MirageOS
OCaml
49
star
36

ocaml-tar

Pure OCaml library to read and write tar files
OCaml
49
star
37

prometheus

OCaml library for reporting metrics to a Prometheus server
OCaml
48
star
38

ocaml-vchan

Pure OCaml implementation of the "vchan" shared-memory communication protocol
OCaml
46
star
39

conan

Like detective conan, find clue about the type of the file
OCaml
45
star
40

metrics

Infrastructure to collect metrics from OCaml applications.
OCaml
45
star
41

bechamel

Agnostic benchmark in OCaml (proof-of-concept)
OCaml
44
star
42

wodan

A Mirage filesystem library
OCaml
44
star
43

ocaml-base64

Base64 encoding and decoding in OCaml
OCaml
43
star
44

colombe

Implementation of SMTP protocols in OCaml
OCaml
42
star
45

ocaml-ipaddr

A library for manipulation of IP (and MAC) address representations
OCaml
41
star
46

mrmime

What do you mean?
OCaml
40
star
47

ezjsonm

An easy interface on top of the Jsonm library.
OCaml
40
star
48

index

A platform-agnostic multi-level index
OCaml
34
star
49

bloomf

Efficient Bloom filters for OCaml
OCaml
34
star
50

mirage-nat

library for network address translation intended for use with mirage unikernels
OCaml
31
star
51

emile

& images
OCaml
30
star
52

ocaml-hex

Hexadecimal converter
OCaml
29
star
53

ocaml-diet

A simple implementation of Discrete Interval Encoding Trees
OCaml
27
star
54

repr

OCaml
27
star
55

ptt

Postes, Télégraphes et Téléphones
OCaml
26
star
56

ocaml-fat

Read and write FAT format filesystems from OCaml
OCaml
26
star
57

encore

Synonym of angkor
OCaml
25
star
58

ocaml-magic-mime

Convert file extensions to MIME types
OCaml
24
star
59

irmin-server

A high-performance server for Irmin
OCaml
24
star
60

ocaml-lazy-trie

Lazy prefix trees in OCaml
OCaml
23
star
61

optint

Library to provide a fast integer (x64 arch) or allocated int32 (x86 arch)
OCaml
23
star
62

ocaml-pcap

OCaml code for generating and analysing pcap (packet capture) files
OCaml
22
star
63

qubes-mirage-skeleton

An example Mirage unikernel that runs as a Qubes AppVM
OCaml
22
star
64

duff

Pure OCaml implementation of libXdiff (Rabin's fingerprint)
OCaml
21
star
65

hacl

Archived. Curve25519 support has been integrated into mirage-crypto-ec (via fiat-crypto). Hacl bindings are available from the hacl-star opam package. OCaml bindings for HACL* elliptic curves
C
21
star
66

arp

Address resolution protocol (ARP) implementation in OCaml targeting MirageOS
OCaml
21
star
67

shared-memory-ring

Xen-style shared memory rings
OCaml
20
star
68

irmin-rpc

RPC client/server for Irmin
OCaml
20
star
69

typebeat

Parsing of the Content-Type header in pure OCaml
OCaml
20
star
70

ocaml-tuntap

Bindings to UNIX tuntap facilities
OCaml
20
star
71

mirage-lambda

An eDSL for MirageOS apps
OCaml
19
star
72

merge-queues

Mergeable queues
OCaml
19
star
73

mirage-solo5

Solo5 core platform libraries for MirageOS
OCaml
19
star
74

ocaml-qcow

Pure OCaml code for parsing, printing, modifying .qcow format data
OCaml
19
star
75

mirage-xen

Xen core platform libraries for MirageOS
C
18
star
76

mirage-profile

Collect profiling information
OCaml
18
star
77

ocaml-vmnet

NATed networking on MacOS X using the vmnet framework
OCaml
18
star
78

mirage-clock

Portable clock implementation for Unix and Xen
OCaml
18
star
79

ocaml-mbr

A simple library for manipulating Master Boot Records
OCaml
18
star
80

cactus

A Btree library in OCaml
OCaml
17
star
81

mirage-dev

Development OPAM repository for work-in-progress packages
16
star
82

mirage-fs-unix

Unix Filesystem passthrough for MirageOS
OCaml
16
star
83

mirage-vnetif

Virtual network interface and software bridge for Mirage
OCaml
16
star
84

spamtacus

Ocaml modular spam filter
OCaml
15
star
85

irmin-rs

Rust
15
star
86

checkseum

C
15
star
87

ocaml-hvsock

Bindings for hypervisor sockets, for Linux, Windows and macOS (via Hyperkit)
OCaml
14
star
88

mirage-handbook

WIP Handbook for MirageOS
14
star
89

ca-certs

Detect root CA certificates from the operating system
OCaml
14
star
90

irmin-watcher

Portable implementation of the Irmin Watch API
OCaml
14
star
91

retreat.mirage.io

Microsite for the MirageOS hack retreats
OCaml
14
star
92

mmap

File mapping
OCaml
13
star
93

mirage-decks

These are the MirageOS slide decks, written as a self-hosting unikernel
HTML
13
star
94

ezxmlm

Like the tax form, this is an easier interface for quick n dirty XMLM scripts
OCaml
13
star
95

mirage-unix

Unix core platform libraries for MirageOS
OCaml
13
star
96

ocaml-gpt

A simple library for manipulating GUID partition tables
OCaml
12
star
97

irmin.org

Irmin website
CSS
12
star
98

mirage-console

Portable console handling for Mirage applications
OCaml
12
star
99

mirage-net-xen

Xen Netfront and Netback ethernet device drivers for Mirage
OCaml
12
star
100

ocaml-openflow

OCaml
12
star