Distributed Pubsub Server for Realtime Apps

Kraken

Overview

Kraken is a distributed pubsub server that is designed to power collaborative realtime apps like Asana.

Applications use Kraken to transmit and receive messages through topics. These messages will typically contain just enough information to identify the set of data that was changed by the client just before the message was published. When other clients receive these messages, they will figure out which data changed and reload it from the datastore so that they are eventually brought up to date.

Kraken is not a general purpose message bus like RabbitMQ.

Building Kraken for the first time

We recommend running Kraken with Erlang R15B03 and above, but it will likely work fine with older versions of Erlang as well.

Download the latest Kraken release (or clone the git project), and then use the following command from the root of the Kraken directory to build Kraken for the first time:

./rebar get-deps compile

You can now run Kraken in the foreground like this:

bin/kraken run

This will start Kraken with the default config. It will listen for new TCP connections on port 12355.

Running Kraken

Kraken comes with a little bash script that provides a handful of useful commands:

Running Kraken in the foreground, with a live Erlang shell

bin/kraken run

Starting and Stopping Kraken in the background

bin/kraken start
bin/kraken stop

Checking if Kraken is currently running in the background

bin/kraken status

Changing the Kraken log level while it is running

bin/kraken change_log_level [debug|info|warn|error]

Dumping information about every client queue

bin/kraken dump_queues

Dumping the list of topics for a particular client's queue

bin/kraken dump_queue_topics <pid from dump_queues>

Dumping all topics with a count of subscribers

bin/kraken dump_topics

Configuring Kraken

Before running Kraken in production, you will want to customize some of the config options. Kraken is built as a standard OTP application, so you can set config options directly on the command line or in a custom Erlang config file.

Supported options

  • pid_file: If specified, then the system process id of the erlang node will be written to this file.
  • listen_ip: The IP address on which the Kraken server listens for new connections.
  • tcp_server_port: The port on which the Kraken server listens for new connections.
  • num_router_shards: The number of router shards to run. A good starting point is 2x the number of cores on the machine.
  • router_min_fanout_to_warn: Kraken will log a warning if a message ends up being distributed to this many or more subscribers.

Specifying options at the command line

You can specify Kraken options at the command line when starting Kraken as follows:

bin/kraken run -kraken num_router_shards 8 -kraken router_min_fanout_to_warn 1000

You need to prefix each argument with "-kraken" to let Erlang know that you are customizing the kraken application environment. This is necessary because Erlang lets you run multiple applications on a single node.
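
As a small illustration of the "-kraken" prefix rule and the 2x-cores starting point for num_router_shards, the sketch below builds the argument string from a plain object. The helper name `krakenArgs` and the hard-coded core count are assumptions made for this example; they are not part of Kraken.

```javascript
// Hypothetical helper: turn { option: value } pairs into "-kraken option value" arguments.
function krakenArgs(options) {
  return Object.entries(options)
    .map(([key, value]) => `-kraken ${key} ${value}`)
    .join(" ");
}

// Assumed core count for the example. In Node.js it could be read
// from require("os").cpus().length instead.
const numCores = 4;

const shardArgs = krakenArgs({
  num_router_shards: 2 * numCores,   // 2x cores, as suggested in the options list
  router_min_fanout_to_warn: 1000,
});

console.log("bin/kraken run " + shardArgs);
```

With the assumed four cores, this produces the same arguments as the command shown above.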

Specifying options in a config file

Kraken options can also be specified in an Erlang config file. Here is an example config file:

[{kraken, [
   {pid_file, "/var/run/kraken.pid"},
   {log_file, "/var/log/kraken.log"},
   {max_tcp_clients, 30000},
   {num_router_shards, 8}]}].

If you stored the config file in /etc/kraken.config, you could tell Erlang to use the config when you start it as follows:

bin/kraken start -config /etc/kraken

Note that Erlang requires you to exclude the extension when you specify the config file.

Kraken clients

Kraken currently includes two official clients, one for Erlang and one for Node.js. The Kraken protocol is based on the Memcached protocol, so it shouldn't take very long to create a client in the language of your choice. Please let us know if you create a new client!

Here is an example of working with Kraken using the Node.js client:

js> kraken1 = new Kraken("localhost", 12355);
js> kraken2 = new Kraken("localhost", 12355);
js> kraken1.subscribe(["topicA", "topicB"]);
js> kraken2.publish(["topicA"], "hi there!");
js> console.log(kraken1.receiveMessages());
js> kraken1.unsubscribe(["topicA"]);

As you can see above, the Memcached-based protocol requires clients to poll for new messages. Most good message bus protocols (like AMQP) have some kind of polling in the form of a heartbeat so that clients can detect dead connections sooner rather than later. In Kraken, the receive command both fetches new messages and acts as the heartbeat. A decent machine should be able to handle thousands of clients polling once every couple of seconds without a problem. It probably wouldn't take very long to add a new protocol to Kraken that pushes messages to clients if you need it! Kraken was designed with the goal of supporting multiple protocols in the future.
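
To make the "receive doubles as heartbeat" idea concrete, here is a minimal sketch of one polling step. `FakeClient`, `pollOnce`, and the failure threshold are hypothetical names invented for this example. Only the method name `receiveMessages` comes from the session above, and the real client's signature and error behavior may differ.

```javascript
// Hypothetical stand-in for a Kraken client. Each call to receiveMessages
// returns the next scripted batch, or throws to simulate a dead connection.
class FakeClient {
  constructor(batches) { this.batches = batches; }
  receiveMessages() {
    const next = this.batches.shift();
    if (next instanceof Error) throw next;
    return next || [];
  }
}

// One polling step. A successful receive resets the failure count (heartbeat ok);
// a failed receive increments it. Returns false once the connection should be
// treated as dead.
function pollOnce(client, state, onMessage) {
  try {
    const messages = client.receiveMessages();
    state.failures = 0;
    messages.forEach(onMessage);
  } catch (err) {
    state.failures += 1;
  }
  return state.failures < state.maxFailures;
}

const pollClient = new FakeClient([["hi there!"], new Error("timeout"), new Error("timeout")]);
const pollState = { failures: 0, maxFailures: 2 };
const received = [];
const alive = [];
for (let i = 0; i < 3; i++) {
  alive.push(pollOnce(pollClient, pollState, (m) => received.push(m)));
}
console.log(received, alive);
```

In a real application, `pollOnce` would be driven by a timer every couple of seconds rather than a loop, and a `false` result would trigger a reconnect.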

How do I use Kraken to build realtime apps?

Kraken was designed to forward data invalidation messages between application servers. It's up to the application designer to figure out how to scope these messages to topics, and what they should contain. For example, a simple TODO list app may have topics corresponding to each of the lists that a user can see. This app would publish invalidation messages containing the ids of tasks that have changed, through the topics for every list that the task is, or previously was, a member of. When other application servers receive these messages, they would reload the state of the tasks referenced in the invalidation messages to ensure they are up to date.
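
The TODO list flow described above can be sketched with an in-memory stand-in for Kraken. Everything here is hypothetical and chosen for illustration: the `FakeBus` class, the `list:<id>` topic names, the JSON message shape, and the choice to deliver a message once per subscriber. None of it is Kraken's actual API or wire format.

```javascript
// Hypothetical in-memory stand-in for Kraken: topic name -> set of subscriber queues.
class FakeBus {
  constructor() { this.topics = new Map(); }
  subscribe(queue, topics) {
    for (const t of topics) {
      if (!this.topics.has(t)) this.topics.set(t, new Set());
      this.topics.get(t).add(queue);
    }
  }
  publish(topics, message) {
    const targets = new Set();
    for (const t of topics) {
      for (const q of this.topics.get(t) || []) targets.add(q);
    }
    for (const q of targets) q.push(message); // this sketch delivers once per subscriber
  }
}

// Shared datastore, plus server B's local (soon to be stale) copy of task 42.
const datastore = new Map([[42, { title: "Buy milk", listIds: ["list:1"] }]]);
const serverBCache = new Map([[42, datastore.get(42)]]);
const serverBQueue = [];

const todoBus = new FakeBus();
todoBus.subscribe(serverBQueue, ["list:1", "list:2"]);

// Server A moves task 42 from list:1 to list:2, then publishes an invalidation
// through both the old and the new list topics.
datastore.set(42, { title: "Buy milk", listIds: ["list:2"] });
todoBus.publish(["list:1", "list:2"], JSON.stringify({ taskIds: [42] }));

// Server B drains its queue and reloads every referenced task from the datastore.
const deliveredCount = serverBQueue.length;
for (const raw of serverBQueue.splice(0)) {
  for (const id of JSON.parse(raw).taskIds) serverBCache.set(id, datastore.get(id));
}
console.log(deliveredCount, serverBCache.get(42));
```

The message carries only task ids, not task contents, so the datastore remains the single source of truth.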

How does Kraken scale?

As far as we know, very well. Kraken has been powering the Asana service since mid-2010, and has yet to crash or fail in any way. At Asana, we have tens of thousands of clients connected to each Kraken node.

There are two ways of scaling Kraken beyond a single machine:

  1. You can shard the topic space so that each machine is responsible for a portion of the topics. This will typically decrease the total number of messages that a given node needs to process and reduce the amount of memory required to keep track of all the routing information.

  2. You can run Kraken nodes that proxy to other Kraken nodes. The proxy nodes will aggregate connections and routing information from their clients and forward on the minimal amount of information necessary to ensure they stay up to date. The proxy nodes then become a single client to the Kraken nodes that they connect to, substantially decreasing the total number of clients and messages that any single Kraken node needs to handle!
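
The first option, sharding the topic space, can be sketched on the client side as a deterministic mapping from topic name to node. This is only one possible scheme. The hash function (an FNV-1a-style hash over the string's character codes), the function names, and the node addresses are all assumptions made for this example, and nothing here is built into Kraken.

```javascript
// FNV-1a-style 32-bit hash over the string's UTF-16 code units.
// Used only to spread topic names evenly across nodes.
function fnv1a(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep as unsigned 32-bit
  }
  return hash;
}

// Every client must use the same node list, in the same order, so that
// publishers and subscribers of a topic agree on which node owns it.
function nodeForTopic(topic, nodes) {
  return nodes[fnv1a(topic) % nodes.length];
}

// Group a batch of topics by owning node, so one subscribe or publish
// call can be sent per node.
function groupTopicsByNode(topics, nodes) {
  const groups = new Map();
  for (const topic of topics) {
    const node = nodeForTopic(topic, nodes);
    if (!groups.has(node)) groups.set(node, []);
    groups.get(node).push(topic);
  }
  return groups;
}

// Hypothetical node addresses.
const shardNodes = ["kraken-1.example:12355", "kraken-2.example:12355"];
const topicGroups = groupTopicsByNode(["topicA", "topicB", "topicC"], shardNodes);
console.log(topicGroups);
```

A plain modulo mapping like this reshuffles most topics whenever the node list changes; a consistent-hashing scheme would reduce that, at the cost of more code.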

Authors and Contributors

Kraken was developed at Asana. The original version was written by Kris Rasmussen (@krisr) in 2010. It was rewritten to support retroactive subscriptions by Samvit Ramadurgam (@samvit) in 2013. The Kraken mascot was designed by Stephanie Hornung.

Support or Contact

Having trouble with Kraken? Check out the documentation at https://github.com/Asana/Kraken/wiki or file an issue at https://github.com/Asana/Kraken/issues and we’ll help you sort it out.

Current Development

  • Retroactive Subscription: Sometimes clients want to subscribe to topics retroactively and receive messages that have already flowed through the system. This is useful when clients want to avoid repeated synchronous roundtrips to Kraken as the set of topics they are interested in expands, but don't want to miss messages that are sent between the time a subscription is needed and the time the batch subscription is actually established.
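
One way to picture retroactive subscription is a bus that keeps a bounded history per topic and replays recent entries when a subscription arrives late. The sketch below is a conceptual model only. The `RetroBus` class, the `since` parameter, the explicit logical timestamps, and the history limit are all assumptions, and this is not a description of how Kraken implements the feature.

```javascript
// Hypothetical model: each topic keeps its most recent messages so that a
// late subscriber can ask for everything published at or after `since`.
class RetroBus {
  constructor(historyLimit) {
    this.historyLimit = historyLimit;
    this.subscribers = new Map(); // topic -> Set of subscriber queues
    this.history = new Map();     // topic -> [{ time, message }]
  }
  publish(topic, message, time) {
    const log = this.history.get(topic) || [];
    log.push({ time, message });
    if (log.length > this.historyLimit) log.shift(); // drop the oldest entry
    this.history.set(topic, log);
    for (const q of this.subscribers.get(topic) || []) q.push(message);
  }
  subscribe(queue, topic, since) {
    // Replay retained messages from the requested point in time first...
    for (const entry of this.history.get(topic) || []) {
      if (entry.time >= since) queue.push(entry.message);
    }
    // ...then register for live delivery.
    if (!this.subscribers.has(topic)) this.subscribers.set(topic, new Set());
    this.subscribers.get(topic).add(queue);
  }
}

const retroBus = new RetroBus(100);
const lateQueue = [];
retroBus.publish("topicA", "m1", 1);
retroBus.publish("topicA", "m2", 5);       // sent after the client needed the topic (time 3)
retroBus.subscribe(lateQueue, "topicA", 3); // batch subscription established late
retroBus.publish("topicA", "m3", 6);
console.log(lateQueue);
```

Here the client first needed `topicA` at time 3 but only subscribed after time 5, and the replay fills in the message it would otherwise have missed.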
