  • Stars: 1,401
  • Rank: 33,554 (Top 0.7 %)
  • Language: Go
  • License: MIT License
  • Created over 12 years ago
  • Updated about 3 years ago

Repository Details

A disk-backed key-value store.

What is diskv?

Diskv (disk-vee) is a simple, persistent key-value store written in the Go language. It starts with an incredibly simple API for storing arbitrary data on a filesystem by key, and builds several layers of performance-enhancing abstraction on top. The end result is a conceptually simple, but highly performant, disk-backed storage system.

Installing

Install Go 1, either from source or with a prepackaged binary. Then,

$ go get github.com/peterbourgon/diskv/v3

Usage

package main

import (
	"fmt"
	"github.com/peterbourgon/diskv/v3"
)

func main() {
	// Simplest transform function: put all the data files into the base dir.
	flatTransform := func(s string) []string { return []string{} }

	// Initialize a new diskv store, rooted at "my-data-dir", with a 1MB cache.
	d := diskv.New(diskv.Options{
		BasePath:     "my-data-dir",
		Transform:    flatTransform,
		CacheSizeMax: 1024 * 1024,
	})

	// Write three bytes to the key "alpha".
	key := "alpha"
	if err := d.Write(key, []byte{'1', '2', '3'}); err != nil {
		panic(err)
	}

	// Read the value back out of the store.
	value, err := d.Read(key)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%v\n", value)

	// Erase the key+value from the store (and the disk).
	if err := d.Erase(key); err != nil {
		panic(err)
	}
}

More complex examples can be found in the "examples" subdirectory.

Theory

Basic idea

At its core, diskv is a map of a key (string) to arbitrary data ([]byte). The data is written to a single file on disk, with the same name as the key. The key determines where that file will be stored, via a user-provided TransformFunc, which takes a key and returns a slice ([]string) corresponding to a path list where the key file will be stored. The simplest TransformFunc,

func SimpleTransform (key string) []string {
    return []string{}
}

will place all keys in the same, base directory. The design is inspired by Redis diskstore; a TransformFunc which emulates the default diskstore behavior is available in the content-addressable-storage example.

Note that your TransformFunc should ensure that the file path for one valid key is never a directory on the path of another valid key. That is, it shouldn't be possible to construct valid keys that resolve to directory names. As a concrete example, if your TransformFunc splits on every 3 characters, then

d.Write("abcabc", val) // OK: written to <base>/abc/abc/abcabc
d.Write("abc", val)    // Error: attempted write to <base>/abc/abc, but it's a directory

This will be addressed in an upcoming version of diskv.
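
To make the collision concrete, here is a minimal sketch that resolves keys to paths the way the text above describes. The helpers splitEvery3 and pathFor are hypothetical names written for this illustration and are not part of diskv. The sketch assumes the file lands at base directory + transformed path + key, and that a trailing chunk shorter than 3 characters is dropped.

```go
package main

import (
	"fmt"
	"path"
)

// splitEvery3 is a hypothetical TransformFunc-shaped helper. It cuts the key
// into consecutive 3-character directory names and drops any shorter tail.
func splitEvery3(key string) []string {
	parts := []string{}
	for i := 0; i+3 <= len(key); i += 3 {
		parts = append(parts, key[i:i+3])
	}
	return parts
}

// pathFor joins the base directory, the transformed directories, and the key
// (used as the file name) into one slash-separated path.
func pathFor(base string, transform func(string) []string, key string) string {
	elems := append([]string{base}, transform(key)...)
	elems = append(elems, key)
	return path.Join(elems...)
}

func main() {
	first := pathFor("base", splitEvery3, "abcabc")
	second := pathFor("base", splitEvery3, "abc")

	fmt.Println(first)  // base/abc/abc/abcabc
	fmt.Println(second) // base/abc/abc

	// The file for "abc" would have to live where a directory already exists.
	fmt.Println(path.Dir(first) == second) // true
}
```

One way to avoid the clash in this sketch is to restrict valid keys to a fixed length, so that no key can equal a directory chunk sequence of another key.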

Probably the most important design principle behind diskv is that your data is always flatly available on the disk. diskv will never do anything that would prevent you from accessing, copying, backing up, or otherwise interacting with your data via common UNIX commandline tools.

Advanced path transformation

If you need more control over the file name written to disk or if you want to support slashes in your key name or special characters in the keys, you can use the AdvancedTransform property. You must supply a function that returns a special PathKey structure, which is a breakdown of a path and a file name. Strings returned must be clean of any slashes or special characters:

package main

import (
	"fmt"
	"strings"

	"github.com/peterbourgon/diskv/v3"
)

func AdvancedTransformExample(key string) *diskv.PathKey {
	path := strings.Split(key, "/")
	last := len(path) - 1
	return &diskv.PathKey{
		Path:     path[:last],
		FileName: path[last] + ".txt",
	}
}

// If you provide an AdvancedTransform, you must also provide its
// inverse:

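func InverseTransformExample(pathKey *diskv.PathKey) (key string) {
	// Reject files that AdvancedTransformExample could not have written.
	if !strings.HasSuffix(pathKey.FileName, ".txt") {
		panic("Invalid file found in storage folder!")
	}
	name := strings.TrimSuffix(pathKey.FileName, ".txt")
	// Rejoin the directories and the file name with "/" so the result
	// round-trips with AdvancedTransformExample.
	return strings.Join(append(append([]string{}, pathKey.Path...), name), "/")
}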

func main() {
	d := diskv.New(diskv.Options{
		BasePath:          "my-data-dir",
		AdvancedTransform: AdvancedTransformExample,
		InverseTransform:  InverseTransformExample,
		CacheSizeMax:      1024 * 1024,
	})
	// Write some text to the key "alpha/beta/gamma".
	key := "alpha/beta/gamma"
	d.WriteString(key, "¡Hola!") // will be stored in "<basedir>/alpha/beta/gamma.txt"
	fmt.Println(d.ReadString("alpha/beta/gamma"))
}

Adding a cache

An in-memory caching layer is provided by combining the BasicStore functionality with a simple map structure, and keeping it up-to-date as appropriate. Since the map structure in Go is not threadsafe, it's combined with a RWMutex to provide safe concurrent access.
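
The map-plus-RWMutex idea can be sketched as follows. This is an illustration of the pattern only, not diskv's actual code: cachedStore, newCachedStore and the load callback (standing in for the disk read) are hypothetical names, and the sketch ignores the cache size limit.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// cachedStore is a hypothetical read-through cache. The load function stands
// in for the slower read from disk.
type cachedStore struct {
	mu    sync.RWMutex
	cache map[string][]byte
	load  func(key string) ([]byte, bool)
}

func newCachedStore(load func(string) ([]byte, bool)) *cachedStore {
	return &cachedStore{cache: map[string][]byte{}, load: load}
}

// Read serves from the map under a read lock and falls back to load on a miss.
func (s *cachedStore) Read(key string) ([]byte, bool) {
	s.mu.RLock()
	v, ok := s.cache[key]
	s.mu.RUnlock()
	if ok {
		return v, true
	}
	v, ok = s.load(key)
	if !ok {
		return nil, false
	}
	s.mu.Lock() // writers take the exclusive lock
	s.cache[key] = v
	s.mu.Unlock()
	return v, true
}

// Erase drops the cached entry so the next Read goes back to load.
func (s *cachedStore) Erase(key string) {
	s.mu.Lock()
	delete(s.cache, key)
	s.mu.Unlock()
}

func main() {
	var loads int64
	s := newCachedStore(func(key string) ([]byte, bool) {
		atomic.AddInt64(&loads, 1)
		return []byte("value-of-" + key), true
	})

	s.Read("alpha")                       // miss: goes to the backing load
	s.Read("alpha")                       // hit: served from the map
	fmt.Println(atomic.LoadInt64(&loads)) // 1

	// Concurrent readers share the read lock safely.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Read("alpha")
		}()
	}
	wg.Wait()
	fmt.Println(atomic.LoadInt64(&loads)) // 1
}
```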

Adding order

diskv is a key-value store and therefore inherently unordered. An ordering system can be injected into the store by passing something which satisfies the diskv.Index interface. (A default implementation, using Google's btree package, is provided.) Basically, diskv keeps an ordered (by a user-provided Less function) index of the keys, which can be queried.
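
As a rough sketch of what an ordered key index does, the example below keeps keys sorted under a user-provided less function and answers range-style queries. It deliberately swaps the btree for a plain sorted slice to stay short, and the names orderedKeys, Insert, Delete and KeysAfter are hypothetical, not the diskv.Index interface itself.

```go
package main

import (
	"fmt"
	"sort"
)

// orderedKeys is a hypothetical index: a slice kept sorted by less.
type orderedKeys struct {
	less func(a, b string) bool
	keys []string
}

// find returns the first position whose key is not less than key.
func (ix *orderedKeys) find(key string) int {
	return sort.Search(len(ix.keys), func(i int) bool {
		return !ix.less(ix.keys[i], key)
	})
}

// Insert adds key at its sorted position, ignoring duplicates.
func (ix *orderedKeys) Insert(key string) {
	i := ix.find(key)
	if i < len(ix.keys) && !ix.less(key, ix.keys[i]) {
		return // already present
	}
	ix.keys = append(ix.keys, "")
	copy(ix.keys[i+1:], ix.keys[i:])
	ix.keys[i] = key
}

// Delete removes key if it is present.
func (ix *orderedKeys) Delete(key string) {
	i := ix.find(key)
	if i < len(ix.keys) && !ix.less(key, ix.keys[i]) {
		ix.keys = append(ix.keys[:i], ix.keys[i+1:]...)
	}
}

// KeysAfter returns up to n keys that sort strictly after from.
func (ix *orderedKeys) KeysAfter(from string, n int) []string {
	i := sort.Search(len(ix.keys), func(i int) bool {
		return ix.less(from, ix.keys[i])
	})
	end := i + n
	if end > len(ix.keys) {
		end = len(ix.keys)
	}
	return append([]string(nil), ix.keys[i:end]...)
}

func main() {
	ix := &orderedKeys{less: func(a, b string) bool { return a < b }}
	for _, k := range []string{"charlie", "alpha", "bravo", "alpha"} {
		ix.Insert(k)
	}
	fmt.Println(ix.KeysAfter("", 10))     // [alpha bravo charlie]
	fmt.Println(ix.KeysAfter("alpha", 1)) // [bravo]
}
```

A sorted slice makes inserts linear in the number of keys; a tree-based index trades that for logarithmic updates, which matters for large stores.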

Adding compression

Something which implements the diskv.Compression interface may be passed during store creation, so that all Writes and Reads are filtered through a compression/decompression pipeline. Several default implementations, using stdlib compression algorithms, are provided. Note that data is cached compressed; the cost of decompression is borne with each Read.
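
The trade-off in the last sentence can be illustrated with a small stdlib-only sketch. It uses compress/gzip directly rather than the diskv.Compression interface, and the compress and decompress helpers and the map used as a cache are assumptions made for this example.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// compress gzips plain into a new byte slice.
func compress(plain []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(plain); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// decompress reverses compress.
func decompress(packed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(packed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	plain := bytes.Repeat([]byte("diskv "), 100)

	packed, err := compress(plain)
	if err != nil {
		panic(err)
	}

	// The cache holds the compressed bytes, so it fits more entries...
	cache := map[string][]byte{"alpha": packed}

	// ...but every read pays for decompression again.
	out, err := decompress(cache["alpha"])
	if err != nil {
		panic(err)
	}
	fmt.Println(len(packed) < len(plain), bytes.Equal(out, plain)) // true true
}
```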

Streaming

diskv also now provides ReadStream and WriteStream methods, to allow very large data to be handled efficiently.
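
The benefit of a streaming interface is that the value never has to sit in memory as one []byte. The sketch below shows the underlying idea with the standard library only; writeStream, roundTrip and the zeros reader are hypothetical helpers for this illustration and do not reproduce diskv's ReadStream or WriteStream signatures.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// zeros is an endless source of zero bytes, used to generate a large value
// without allocating it up front.
type zeros struct{}

func (zeros) Read(p []byte) (int, error) {
	for i := range p {
		p[i] = 0
	}
	return len(p), nil
}

// writeStream copies r into the named file chunk by chunk.
func writeStream(name string, r io.Reader) (int64, error) {
	f, err := os.Create(name)
	if err != nil {
		return 0, err
	}
	n, err := io.Copy(f, r)
	if cerr := f.Close(); err == nil {
		err = cerr
	}
	return n, err
}

// roundTrip streams n bytes to a temporary file and streams them back,
// returning the byte counts seen in each direction.
func roundTrip(n int64) (written, read int64, err error) {
	dir, err := os.MkdirTemp("", "stream-sketch")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	name := filepath.Join(dir, "big-value")
	written, err = writeStream(name, io.LimitReader(zeros{}, n))
	if err != nil {
		return written, 0, err
	}

	f, err := os.Open(name)
	if err != nil {
		return written, 0, err
	}
	defer f.Close()
	read, err = io.Copy(io.Discard, f)
	return written, read, err
}

func main() {
	w, r, err := roundTrip(1 << 20)
	if err != nil {
		panic(err)
	}
	fmt.Println(w, r) // 1048576 1048576
}
```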

Future plans

  • Needs plenty of robust testing: huge datasets, etc...
  • More thorough benchmarking
  • Your suggestions for use-cases I haven't thought of

Credits and contributions

Original idea, design and implementation: Peter Bourgon

Other collaborations: Javier Peletier (Epic Labs)
