Machi file store

Machi: a distributed, decentralized blob/large file store

Outline

  1. Why another blob/file store?
  2. Where to learn more about Machi
  3. Development status summary
  4. Contributing to Machi's development

## 1. Why another blob/file store?

Our goal is a robust & reliable, distributed, highly available, large file and blob store. Such stores already exist, both in the open source world and in the commercial world. Why reinvent the wheel? We believe there are three reasons, ordered by decreasing rarity.

  1. We want end-to-end checksums for all file data, from the initial file writer to every file reader, anywhere, all the time.
  2. We need flexibility to trade consistency for availability: e.g. weak consistency in exchange for being available in cases of partial system failure.
  3. We want to manage file replicas in a way that's provably correct and also easy to test.

Criterion #3 is difficult to find in the open source world but perhaps not impossible.

If we have application use cases where availability is more important than consistency, then systems that meet criterion #2 are also rare. Most file stores provide only strong consistency and therefore are unavoidably unavailable when parts of the system fail. What if we want a file store that is always available to write new file data and attempts best-effort file reads?

If we really do care about data loss and/or data corruption, then we really want both #3 and #1. Unfortunately, systems that meet criterion #1 are very rare. (Nonexistent?) Why? This is 2015. We have decades of research that shows that computer hardware can (and indeed does) corrupt data at nearly every level of the modern client/server application stack. Systems with end-to-end data corruption detection should be ubiquitous today. Alas, they are not.
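To make the end-to-end idea concrete, here is a minimal sketch in Python. It is not Machi's implementation: the function names, the use of SHA-256, and the in-memory `dict` standing in for a remote store are all assumptions made for illustration. The point it shows is that the checksum is computed by the original writer and verified by the final reader, so corruption introduced anywhere in between is detected.

```python
import hashlib


class ChecksumMismatch(Exception):
    """Raised when data does not match the checksum computed by its writer."""


def client_write(store: dict, key: str, data: bytes) -> str:
    """Hypothetical writer: checksum the data before it leaves the client."""
    checksum = hashlib.sha256(data).hexdigest()
    # The store keeps the writer's checksum alongside the data and
    # never recomputes it on the writer's behalf.
    store[key] = (data, checksum)
    return checksum


def client_read(store: dict, key: str) -> bytes:
    """Hypothetical reader: verify the writer's checksum after the data arrives."""
    data, checksum = store[key]
    if hashlib.sha256(data).hexdigest() != checksum:
        raise ChecksumMismatch(key)
    return data


if __name__ == "__main__":
    store = {}  # stands in for the network, servers, and disks in between
    client_write(store, "blob-1", b"hello, world")
    print(client_read(store, "blob-1"))

    # Simulate corruption somewhere between writer and reader.
    _, original_checksum = store["blob-1"]
    store["blob-1"] = (b"hello, w0rld", original_checksum)
    try:
        client_read(store, "blob-1")
    except ChecksumMismatch:
        print("corruption detected by the reader")
```

Because verification happens in the reader rather than in the store, the check covers every layer the data passed through, which is what distinguishes an end-to-end checksum from a checksum that only protects data at rest.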

Machi is an effort to change the deplorable state of the world, one Erlang function at a time.

## 2. Where to learn more about Machi

The two major design documents for Machi are now mostly stable. Please see the doc directory's README for details.

We also have a Frequently Asked Questions (FAQ) list.

Scott recently (November 2015) gave a presentation at the RICON 2015 conference about one of the techniques used by Machi; "Managing Chain Replication Metadata with Humming Consensus" is available online now.

See later in this document for how to run the Humming Consensus demos, including the network partition simulator.

## 3. Development status summary

Mid-March 2016: The Machi development team has been downsized in recent months, and the pace of development has slowed. Here is a summary of the status of Machi's major components.

  • Humming Consensus and the chain manager

    • No new safety bugs have been found by model-checking tests.
    • A new document, Hands-on experiments with Machi and Humming Consensus, is now available. It is a tutorial that shows how to set up a Machi cluster of 3 virtual machines and how to demonstrate the chain manager's reactions to server stops & starts, crashes & restarts, and pauses (simulated by SIGSTOP and SIGCONT).
    • The chain manager can still make suboptimal-but-safe choices for chain transitions when a server hangs/pauses temporarily.
      • Recent chain manager changes have made the instability window much shorter when the slow/paused server resumes execution.
      • Scott believes that a modest change to the chain manager's calculation of a new projection can make flapping in this (and many other) cases less likely. Currently, the new local projection is calculated using only local state (i.e., the chain manager's internal state + the fitness server's state). However, if the "latest" projection read from the public projection stores were also input to the new projection calculation function, then many obviously bad projections could be avoided without needing rounds of Humming Consensus to demonstrate that a bad projection is bad.
  • FLU/data server process

    • All known correctness bugs have been fixed.
    • Performance has not yet been measured. Performance measurement and enhancements are scheduled to start in the middle of March 2016. (This will include a much-needed update to the basho_bench driver.)
  • Access protocols and client libraries

    • External clients and internal communication both use a protocol based on Protocol Buffers (internally, this replaces Erlang's native message passing mechanisms).
      • [Machi PB protocol specification: ./src/machi.proto](./src/machi.proto)
      • At the moment, the PB specification contains two protocols. Sometime in the near future, the spec will be split to separate the external client API (the "high" protocol) from the internal communication API (the "low" protocol).
  • Recent conference talks about Machi

## 4. Contributing to Machi's development

### 4.1 License

Basho Technologies, Inc. has committed to licensing all work for Machi under the Apache License, version 2.0. All authors of source code and documentation who agree with these licensing terms are welcome to contribute their ideas in any form: suggested design or features, documentation, and source code.

Machi is still a very young project within Basho, with a small team of developers; please bear with us as we grow out of "toddler" stage into a more mature open source software project. We invite all contributors to review the CONTRIBUTING.md document for guidelines for working with the Basho development team.

### 4.2 Development environment requirements

All development to date has been done with Erlang/OTP version 17 on OS X. The only known limitations for using R16 are minor type specification differences between R16 and 17, but we strongly suggest continuing development using version 17.

We also assume that you have the standard UNIX/Linux developer tool chain for C and C++ applications. Also, we assume that Git and GNU Make are available. The utility used to compile the Machi source code, rebar, is pre-compiled and included in the repo. For more details, please see the Machi development environment prerequisites doc.

Machi has a dependency on the ELevelDB library. ELevelDB supports only UNIX/Linux OSes and 64-bit versions of Erlang/OTP; we apologize to Windows-based and 32-bit-based Erlang developers for this restriction.

### 4.3 New protocols and features

If you'd like to work on a protocol such as Thrift, UBF, msgpack over UDP, or some other protocol, let us know by opening an issue to discuss it.
