• Stars
    star
    800
  • Rank 56,950 (Top 2 %)
  • Language
    Erlang
  • License
    Other
  • Created almost 8 years ago
  • Updated 4 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A Raft implementation for Erlang and Elixir that strives to be efficient and make it easier to use multiple Raft clusters in a single system.

A Raft Implementation for Erlang and Elixir

Ra is a Raft implementation by Team RabbitMQ. It is not tied to RabbitMQ and can be used in any Erlang or Elixir project. It is, however, heavily inspired by and geared towards RabbitMQ needs.

Ra (by virtue of being a Raft implementation) is a library that allows users to implement persistent, fault-tolerant and replicated state machines.

Project Maturity

This library has been extensively tested and is suitable for production use. This means the primary APIs (ra, ra_machine modules) and on disk formats will be backwards-compatible going forwards in line with Semantic Versioning. Care has been taken to version all on-disk data formats to enable frictionless future upgrades.

Status

The following Raft features are implemented:

  • Leader election
  • Log replication
  • Cluster membership changes: one server (member) at a time
  • Log compaction (with limitations and RabbitMQ-specific extensions)
  • Snapshot installation

Build Status

Actions

Safety Verification

Ra is continuously tested with the Jepsen distributed system verification framework.

Supported Erlang/OTP Versions

Ra supports the following Erlang/OTP versions:

  • 26.x
  • 25.x

Modern Erlang releases provide distribution traffic fragmentation which algorithms such as Raft significantly benefit from.

Design Goals

  • Low footprint: use as few resources as possible, avoid process tree explosion
  • Able to run thousands of ra clusters within an Erlang node
  • Provide adequate performance for use as a basis for a distributed data service

Use Cases

This library was primarily developed as the foundation of a replication layer for quorum queues in RabbitMQ, and today also powers RabbitMQ streams and Khepri.

The design it aims to replace uses a variant of Chain Based Replication which has two major shortcomings:

  • Replication algorithm is linear
  • Failure recovery procedure requires expensive topology changes

Smallest Possible Usage Example

The example below assumes a few things:

  • You are familiar with the basics of distributed Erlang
  • Three Erlang nodes are started on the local machine or reachable resolvable hosts. Their names are [email protected], [email protected], and [email protected] in the example below but your actual hostname will be different. Therefore the naming scheme is ra{N}@{hostname}. This is not a Ra requirement so you are welcome to use different node names and update the code accordingly.

Erlang nodes can be started using rebar3 shell --name {node name}. They will have Ra modules on code path:

# replace hostname.local with your actual hostname
rebar3 shell --name [email protected]
# replace hostname.local with your actual hostname
rebar3 shell --name [email protected]
# replace hostname.local with your actual hostname
rebar3 shell --name [email protected]

After Ra nodes form a cluster, state machine commands can be performed.

Here's what a small example looks like:

%% All servers in a Ra cluster are named processes on Erlang nodes.
%% The Erlang nodes must have distribution enabled and be able to
%% communicate with each other.
%% See https://learnyousomeerlang.com/distribunomicon if you are new to Erlang/OTP.

%% These Erlang nodes will host Ra nodes. They are the "seed" and assumed to
%% be running or come online shortly after Ra cluster formation is started with ra:start_cluster/4.
ErlangNodes = ['[email protected]', '[email protected]', '[email protected]'],

%% This will check for Erlang distribution connectivity. If Erlang nodes
%% cannot communicate with each other, Ra nodes would not be able to cluster or communicate
%% either.
[io:format("Attempting to communicate with node ~s, response: ~s~n", [N, net_adm:ping(N)]) || N <- ErlangNodes],

%% The Ra application has to be started on all nodes before it can be used.
[rpc:call(N, ra, start, []) || N <- ErlangNodes],

%% Create some Ra server IDs to pass to the configuration. These IDs will be
%% used to address Ra nodes in Ra API functions.
ServerIds = [{quick_start, N} || N <- ErlangNodes],

ClusterName = quick_start,
%% State machine that implements the logic and an initial state
Machine = {simple, fun erlang:'+'/2, 0},

%% Start a Ra cluster with an addition state machine that has an initial state of 0.
%% It's sufficient to invoke this function only on one Erlang node. For example, this
%% can be a "designated seed" node or the node that was first to start and did not discover
%% any peers after a few retries.
%%
%% Repeated startup attempts will fail even if the cluster is formed, has elected a leader
%% and is fully functional.
{ok, ServersStarted, _ServersNotStarted} = ra:start_cluster(default, ClusterName, Machine, ServerIds),

%% Add a number to the state machine.
%% Simple state machines always return the full state after each operation.
{ok, StateMachineResult, LeaderId} = ra:process_command(hd(ServersStarted), 5),

%% Use the leader id from the last command result for the next one
{ok, 12, LeaderId1} = ra:process_command(LeaderId, 7).

Querying Machine State

Ra machines are only useful if their state can be queried. There are two types of queries:

  • Local queries return machine state on the target node
  • Leader queries return machine state from the leader node. If a follower node is queried, the query command will be redirected to the leader.

Local queries are much more efficient but can return out-of-date machine state. Leader queries offer best possible machine state consistency but potentially require sending a request to a remote node.

Use ra:leader_query/{2,3} to perform a leader query:

%% find current Raft cluster leader
{ok, _Members, LeaderId} = ra:members(quick_start),
%% perform a leader query on the leader node
QueryFun = fun(StateVal) -> StateVal end,
{ok, {_TermMeta, State}, LeaderId1} = ra:leader_query(LeaderId, QueryFun).

Similarly, use ra:local_query/{2,3} to perform a local query:

%% this is the replica hosted on the current Erlang node.
%% alternatively it can be constructed as {ClusterName, node()}
{ok, Members, _LeaderId} = ra:members(quick_start),
LocalReplicaId = lists:keyfind(node(), 2, Members),
%% perform a local query on the local node
QueryFun = fun(StateVal) -> StateVal end,
{ok, {_TermMeta, State}, LeaderId1} = ra:local_query(LocalReplicaId, QueryFun).

A query function is a single argument function that accepts current machine state and returns any value (usually derived from the state).

Both ra:leader_query/2 and ra:local_query/2 return machine term metadata, a result returned by the query function, and current cluster leader ID.

Dynamically Changing Cluster Membership

Nodes can be added to or removed from a Ra cluster dynamically. Only one cluster membership change at a time is allowed: concurrent changes will be rejected by design.

In this example, instead of starting a "pre-formed" cluster, a local server is started and then members are added by calling ra:add_member/2.

Start 3 Erlang nodes:

# replace hostname.local with your actual hostname
rebar3 shell --name [email protected]
# replace hostname.local with your actual hostname
rebar3 shell --name [email protected]
# replace hostname.local with your actual hostname
rebar3 shell --name [email protected]

Start the ra application:

%% on [email protected]
ra:start().
% => ok
%% on [email protected]
ra:start().
% => ok
%% on [email protected]
ra:start().
% => ok

A single node cluster can be started from any node.

For the purpose of this example, [email protected] is used as the starting member:

ClusterName = dyn_members,
Machine = {simple, fun erlang:'+'/2, 0},

% Start a cluster
{ok, _, _} =  ra:start_cluster(default, ClusterName, Machine, [{dyn_members, '[email protected]'}]).

After the cluster is formed, members can be added.

Add [email protected] by telling [email protected] about it and starting a Ra replica (server) on [email protected] itself:

% Add member
{ok, _, _} = ra:add_member({dyn_members, '[email protected]'}, {dyn_members, '[email protected]'}),

% Start the server
ok = ra:start_server(default, ClusterName, {dyn_members, '[email protected]'}, Machine, [{dyn_members, '[email protected]'}]).

Add [email protected] to the cluster:

% Add a new member
{ok, _, _} = ra:add_member({dyn_members, '[email protected]'}, {dyn_members, '[email protected]'}),

% Start the server
ok = ra:start_server(default, ClusterName, {dyn_members, '[email protected]'}, Machine, [{dyn_members, '[email protected]'}]).

Check the members from any node:

ra:members({dyn_members, node()}).
% => {ok,[{dyn_members,'[email protected]'},
% =>      {dyn_members,'[email protected]'},
% =>      {dyn_members,'[email protected]'}],
% =>      {dyn_members,'[email protected]'}}

If a node wants to leave the cluster, it can use ra:leave_and_terminate/3 and specify itself as the target:

Temporarily add a new node, say [email protected], to the cluster:

% Add a new member
{ok, _, _} = ra:add_member({dyn_members, '[email protected]'}, {dyn_members, '[email protected]'}),

% Start the server
ok = ra:start_server(default, ClusterName, {dyn_members, '[email protected]'}, Machine, [{dyn_members, '[email protected]'}]).

%% on [email protected]
ra:leave_and_terminate(default, {ClusterName, node()}, {ClusterName, node()}).

ra:members({ClusterName, node()}).
% => {ok,[{dyn_members,'[email protected]'},
% =>      {dyn_members,'[email protected]'},
% =>      {dyn_members,'[email protected]'}],
% =>      {dyn_members,'[email protected]'}}

Other examples

See Ra state machine tutorial for how to write more sophisticated state machines by implementing the ra_machine behaviour.

A Ra-based key/value store example is available in a separate repository.

Documentation

Examples

Configuration Reference

Key Description Data Type
data_dir A directory name where Ra node will store its data Local directory path
wal_data_dir A directory name where Ra will store it's WAL (Write Ahead Log) data. If unspecified, `data_dir` is used. Local directory path
wal_max_size_bytes The maximum size of the WAL in bytes. Default: 512 MB Positive integer
wal_max_entries The maximum number of entries per WAL file. Default: undefined Positive integer
wal_compute_checksums Indicate whether the wal should compute and validate checksums. Default: `true` Boolean
wal_write_strategy
  • default: used by default. write(2) system calls are delayed until a buffer is due to be flushed. Then it writes all the data in a single call then fsyncs. Fastest option but incurs some additional memory use.
  • o_sync: Like default but will try to open the file with O_SYNC and thus won't need the additional fsync(2) system call. If it fails to open the file with this flag this mode falls back to default.
Enumeration: default | o_sync
wal_sync_method
  • datasync: used by default. Uses the fdatasync(2) system call after each batch. This avoids flushing file meta-data after each write batch and thus may be slightly faster than sync on some system. When datasync is configured the wal will try to pre-allocate the entire WAL file. Not all systems support fdatasync. Please consult system documentation and configure it to use sync instead if it is not supported.
  • sync: uses the fsync system call after each batch.
Enumeration: datasync | sync
logger_module Allows the configuration of a custom logger module. The default is logger. The module must implement a function of the same signature as logger:log/4 (the variant that takes a format not the variant that takes a function). Atom
wal_max_batch_size Controls the internal max batch size that the WAL will accept. Higher numbers may result in higher memory use. Default: 32768. Positive integer
wal_hibernate_after Enables hibernation after a timeout of inactivity for the WAL process. Milliseconds
metrics_key Metrics key. The key used to write metrics into the ra_metrics table. Atom
low_priority_commands_flush_size When commands are pipelined using the low priority mode Ra tries to hold them back in favour of normal priority commands. This setting determines the number of low priority commands that are added to the log each flush cycle. Default: 25 Positive integer
segment_compute_checksums Indicate whether the segment writer should compute and validate checksums. Default: `true` Boolean

Logging

Ra will use default OTP logger by default, unless logger_module configuration key is used to override.

To change log level to debug for all applications, use

logger:set_primary_config(level, debug).

Ra versioning

Ra attempts to follow Semantic Versioning.

The modules that form part of the public API are:

  • ra
  • ra_machine (behaviour callbacks only)
  • ra_aux
  • ra_system
  • ra_counters
  • ra_leaderboard
  • ra_env
  • ra_directory

Copyright and License

(c) 2017-2024 Broadcom. All Rights Reserved. The term "Broadcom" refers to Broadcom Inc. and/or its subsidiaries.

Dual licensed under the Apache License Version 2.0 and Mozilla Public License Version 2.0.

This means that the user can consider the library to be licensed under any of the licenses from the list above. For example, you may choose the Apache Public License 2.0 and include this library into a commercial product.

See LICENSE for details.

More Repositories

1

rabbitmq-server

Open source RabbitMQ: core server and tier 1 (built-in) plugins
Starlark
11,952
star
2

rabbitmq-tutorials

Tutorials for using RabbitMQ in various ways
Java
6,393
star
3

rabbitmq-dotnet-client

RabbitMQ .NET client for .NET Standard 2.0+ and .NET 4.6.2+
C#
2,056
star
4

rabbitmq-delayed-message-exchange

Delayed Messaging for RabbitMQ
Erlang
1,992
star
5

internals

High level architecture overview
1,444
star
6

amqp091-go

An AMQP 0-9-1 Go client maintained by the RabbitMQ team. Originally by @streadway: `streadway/amqp`
Go
1,319
star
7

rabbitmq-java-client

RabbitMQ Java client
Java
1,227
star
8

cluster-operator

RabbitMQ Cluster Kubernetes Operator
Go
865
star
9

rabbitmq-website

RabbitMQ website
JavaScript
769
star
10

erlang-rpm

Latest Erlang/OTP releases packaged as a zero dependency RPM, just enough for running RabbitMQ
Shell
540
star
11

rabbitmq-management

RabbitMQ Management UI and HTTP API
Erlang
369
star
12

tls-gen

Generates self-signed x509/TLS/SSL certificates useful for development
Python
362
star
13

rabbitmq-perf-test

A load testing tool
Java
355
star
14

khepri

Khepri is a tree-like replicated on-disk database library for Erlang and Elixir.
Erlang
317
star
15

erlando

Erlando
Erlang
306
star
16

rabbitmq-sharding

Sharded logical queues for RabbitMQ: a queue type which provides improved parallelism and thoughput at the cost of total ordering
Erlang
303
star
17

rabbitmq-peer-discovery-k8s

Kubernetes-based peer discovery mechanism for RabbitMQ
Erlang
296
star
18

rabbitmq-objc-client

RabbitMQ client for Objective-C and Swift
Objective-C
241
star
19

chef-cookbook

Development repository for Chef cookbook RabbitMQ
Ruby
211
star
20

rmq-0mq

ZeroMQ support in RabbitMQ
Erlang
210
star
21

rabbitmq-consistent-hash-exchange

RabbitMQ Consistent Hash Exchange Type
Erlang
209
star
22

rabbitmq-auth-backend-http

HTTP-based authorisation and authentication for RabbitMQ
Makefile
199
star
23

rabbitmq-erlang-client

Erlang client for RabbitMQ
Erlang
185
star
24

rabbitmq-mqtt

RabbitMQ MQTT plugin
Erlang
173
star
25

rabbitmq-stream-go-client

A client library for RabbitMQ streams
Go
167
star
26

rabbitmq-stream-rust-client

A client library for RabbitMQ streams
Rust
148
star
27

rabbitmq-prometheus

A minimalistic Prometheus exporter of core RabbitMQ metrics
Erlang
145
star
28

looking_glass

An Erlang/Elixir/BEAM profiler tool
Erlang
139
star
29

hop

RabbitMQ HTTP API client for Java, Groovy, and other JVM languages
Java
137
star
30

messaging-topology-operator

RabbitMQ messaging topology operator
Go
123
star
31

rabbitmq-cli

Command line tools for RabbitMQ
Elixir
105
star
32

rabbitmq-stream-dotnet-client

RabbitMQ client for the stream protocol
C#
100
star
33

rabbitmq-web-stomp-examples

Makefile
94
star
34

rabbitmq-amqp1.0

AMQP 1.0 support for RabbitMQ
Erlang
93
star
35

rabbitmq-web-stomp

Provides support for STOMP over WebSockets
Erlang
89
star
36

rabbitmq-recent-history-exchange

RabbitMQ Recent History Exchange
Makefile
82
star
37

diy-kubernetes-examples

Examples that demonstrate how deploy a RabbitMQ cluster to Kubernetes, the DIY way
Makefile
82
star
38

rabbitmq-event-exchange

Expose broker events as messages
Erlang
78
star
39

rabbitmq-clusterer

This project is ABANDONWARE. Use https://www.rabbitmq.com/cluster-formation.html instead.
Erlang
72
star
40

rabbitmq-message-timestamp

A RabbitMQ plugin that adds a timestamp to all incoming messages
Makefile
72
star
41

rabbitmq-common

Common library used by rabbitmq-server and rabbitmq-erlang-client
Erlang
66
star
42

rabbitmq-c

The official rabbitmq-c sources have moved to:
C
65
star
43

tgir

Official repository for Thank Goodness It's RabbitMQ (TGIR)!
Makefile
65
star
44

rabbitmq-jms-client

RabbitMQ JMS client
Java
61
star
45

rabbit-socks

Websocket and Socket.IO support for RabbitMQ (deprecated -- see https://github.com/sockjs/sockjs-erlang instead)
Erlang
58
star
46

rabbitmq-web-mqtt

Provides support for MQTT over WebSockets
Erlang
55
star
47

rabbitmq-stream-java-client

RabbitMQ Stream Java Client
Java
55
star
48

rabbitmq-top

Adds top-like information on the Erlang VM to the management plugin.
Makefile
55
star
49

rabbitmq-shovel

RabbitMQ Shovel plugin
Erlang
53
star
50

rabbitmq-auth-mechanism-ssl

RabbitMQ TLS (x509 certificate) authentication mechanism
Makefile
52
star
51

aten

An adaptive accrual node failure detection library for Elixir and Erlang
Erlang
50
star
52

rabbitmq-stomp

RabbitMQ STOMP plugin
Erlang
49
star
53

rabbitmq-tracing

RabbitMQ Tracing
Erlang
48
star
54

mnevis

Raft-based, consensus oriented implementation of Mnesia transactions
Erlang
48
star
55

gen-batch-server

A generic batching server for Erlang and Elixir
Erlang
47
star
56

rabbitmq-priority-queue

Priority Queues
46
star
57

osiris

Log based streaming subsystem for RabbitMQ
Erlang
45
star
58

rabbitmq-management-visualiser

RabbitMQ Topology Visualiser
JavaScript
41
star
59

rabbitmq-oauth2-tutorial

Explore integration of RabbitMQ with Oauth 2.0 auth backend plugin
Shell
41
star
60

rabbitmq-peer-discovery-consul

Consul-based peer discovery backend for RabbitMQ 3.7.0+
Erlang
40
star
61

rabbitmq-federation

RabbitMQ Federation plugin
Erlang
39
star
62

rabbitmq-auth-backend-oauth2

RabbitMQ authorization backend that uses OAuth 2.0 (JWT) tokens
Erlang
38
star
63

rabbitmq-codegen

RabbitMQ protocol code-generation and machine-readable spec
Python
37
star
64

support-tools

A staging area for various support and troubleshooting tools that are not (or not yet) included into RabbitMQ distribution
Shell
36
star
65

ra-kv-store

Raft-based key-value store
Clojure
33
star
66

rabbitmq-perf-html

Web page to view performance results
JavaScript
33
star
67

rabbitmq-web-mqtt-examples

Examples for the Web MQTT plugin
JavaScript
32
star
68

rabbitmq-public-umbrella

Work with ease on multiple RabbitMQ sub-projects, e.g. core broker, plugins and some client libraries
Makefile
32
star
69

rules_erlang

Bazel rules for building Erlang applications and libraries
Starlark
32
star
70

rabbitmq-smtp

RabbitMQ SMTP gateway
Erlang
31
star
71

rabbitmq-metronome

RabbitMQ example plugin
Makefile
27
star
72

rabbitmq-rtopic-exchange

RabbitMQ Reverse Topic Exchange
Erlang
27
star
73

horus

Erlang library to create standalone modules from anonymous functions
Erlang
25
star
74

workloads

Continuous validation of RabbitMQ workloads
JavaScript
24
star
75

rabbitmq-tracer

AMQP 0-9-1 protocol analyzer
Java
24
star
76

rabbitmq-peer-discovery-aws

AWS-based peer discovery backend for RabbitMQ 3.7.0+
Erlang
24
star
77

rabbitmq-shovel-management

RabbitMQ Shovel Management
Makefile
23
star
78

rabbitmq-auth-backend-ldap

RabbitMQ LDAP authentication
Erlang
22
star
79

rabbitmq-management-themes

Makefile
22
star
80

rabbitmq-auth-backend-amqp

Authentication over AMQP RPC
Erlang
20
star
81

rabbitmq-amqp1.0-client

Erlang AMQP 1.0 client
Erlang
20
star
82

erlang-data-structures

Erlang Data Structures
Erlang
20
star
83

chocolatey-package

RabbitMQ chocolatey package
PowerShell
19
star
84

lz4-erlang

LZ4 compression library for Erlang.
C
19
star
85

rabbitmq-service-nodejs-sample

A simple node.js sample app for the RabbitMQ service/add-on
JavaScript
18
star
86

rabbitmq-auth-backend-oauth2-spike

See rabbitmq/rabbitmq-auth-backend-oauth2 instead.
Erlang
17
star
87

rabbitmq-msg-store-index-eleveldb

LevelDB-based message store index for RabbitMQ
Erlang
17
star
88

rabbitmq-management-agent

RabbitMQ Management Agent
Erlang
17
star
89

rabbitmq-auth-backend-cache

Authorisation result caching plugin (backend) for RabbitMQ
Erlang
17
star
90

rabbitmq-jsonrpc-channel

RabbitMQ JSON-RPC Channels
JavaScript
15
star
91

rabbitmq-ha

Highly available queues for RabbitMQ
Erlang
15
star
92

rabbitmq-jsonrpc

RabbitMQ JSON-RPC Integration
Makefile
15
star
93

stdout_formatter

Erlang library to format paragraphs, lists and tables as plain text
Erlang
15
star
94

rabbitmq-peer-discovery-etcd

etcd-based peer discovery backend for RabbitMQ 3.7.0+
Erlang
15
star
95

rabbitmq-federation-management

RabbitMQ Federation Management
Makefile
14
star
96

credentials-obfuscation

Tiny library/OTP app for credential obfuscation
Erlang
14
star
97

rabbitmq-server-release

RabbitMQ packaging and release engineering bits that do not belong to the Concourse pipelines.
Shell
13
star
98

seshat

Erlang
13
star
99

rabbitmq-jms-topic-exchange

Custom exchange that implements JMS topic selection for RabbitMQ
Erlang
13
star
100

rabbitmq-store-exporter

RabbitMQ Store Exporter
Erlang
12
star