• Stars
    star
    340
  • Rank 124,317 (Top 3 %)
  • Language
    Go
  • Created almost 13 years ago
  • Updated over 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A fault-tolerant network service for meaningful GUID generation

noeqd - A fault-tolerant network service for meaningful GUID generation

Based on snowflake.

Motivation

GUIDs (Globally Unique IDs) are useful for a number a obvious reasons: database keys, logging, etc.

Generating GUIDs with pure randomness is not always ideal because it doesn't cluster well, produces terrible locality, and no insight as to when it was generated.

This network service should also have these properties (Differences from snowflake):

  • easy distribution with no dependencies and little to no setup
  • dirt simple wire-protocol (trivial to implement clients without added dependencies and complexity)
  • low memory footprint (starts and stays around ~1MB)
  • zero configuration
  • reduced network IO when multiple keys are needed at once

Glossary of terms to follow

  • GUID: Globally Unique Identifier
  • datacenter: A facility used to house computer systems.
  • worker: A single noeqd process with a worker and datacenter ID combination unique to their cohort.
  • datacenter-id: An integer representing a particular datacenter.
  • worker-id: An integer representing a particular worker.
  • machine-id: The comination of datacenter-id and worker-id
  • twepoch: custom epoch (same as snowflake)

Important note:

Reliability, and guarantees depend on:

System clock depedency and skew protection: - (From snowflake README and slightly modified)

You should use NTP to keep your system clock accurate. Noeq protects from non-monotonic clocks, i.e. clocks that run backwards. If your clock is running fast and NTP tells it to repeat a few milliseconds, Noeq will refuse to generate ids until a time that is after the last time we generated an id. Even better, run in a mode where ntp won't move the clock backwards. See http://wiki.dovecot.org/TimeMovedBackwards#Time_synchronization for tips on how to do this.

Avoiding the reuse of a worker-id + datacenter-id too quickly

It's important to know that a newly born process has no way of tracking its previous life and where it left of. This means time could have moved backwards while it was dead.

It's important to not use the same worker-id + datacenter-id without telling the new process when to start generating new IDs to avoid duplicates.

It is only safe to reuse the same worker-id + datacenter-id when you can guarantee the current time is greater than the time of death. You can use the -t option to specifiy this.

You may have up to 1024 machine ids. It's generally safe to not reuse them until you've reached this limit.

Install

You can install noeqd by downloading the binary here and putting it in your PATH.

or

Clone the repo and build with Go (Requires Go 0beb796b4ef8 weekly/weekly.2011-12-02 or later)

	$ git clone http://github.com/bmizerany/noeqd
	$ cd noeqd
	$ make install

Run

	$ noeqd -h
	Usage of noeqd:
	  -d=0: datacenter id
	  -l="0.0.0.0:4444": the address to listen on
	  -w=0: worker id

Coordinating machine-ids

Noeq does not assume you're using any automated coordination because it isn't always correct to assume this. Its easy to do without baking it in. Here is an example script in the repo for doing so if you need it (using Doozer):

	#!/bin/sh
	# usage: ./coord-exec.sh <datacenter-id>

	did=$1
	wid=0

	[ -z "$did" ] && did=0

	_set() {
	  printf 1 | doozer set /goflake/$did/$wid 0
	}

	while ! _set
	do wid=`expr $wid + 1`
	done

	exec noeqd -w $wid -d $did

The Why

Uniqness

We must know that a GUID, once generated, has never and will never be generated again (i.e. Globally Unique) in our system.

Performance

Heroku serves many 10's of thousands of requests a second. Each request can require multiple actions that need their own id. To be on the safe side, we will require a minimum of 100k ids/sec (without network latency); possibly more in the very near future. See benchmarks near the end of this README.

Uncoordinated

We need all noeqds to be able to generate GUIDs without coordinating with other noeqd processes. Coordination requires more time complexity than if we didn't require it and reduces the amount of GUIDs we can generate during that time. It also affects the yield (the probability the service will complete a request).

Direcly sortable by time (roughly)

Noeq (like snowflake) will guarantee the GUIDs will be k-sorted within a reasonable bound (10's of ms to no more than 1s). More on this in "How it works."

References:

http://ci.nii.ac.jp/naid/110002673556

http://ci.nii.ac.jp/naid/110002673489

The "Why not snowflake?"

At Heroku, we value services that are simple, as self-contained as possible, and use nothing more than they can reasonably get away with. The setup and distribution of an application should be as quick and painless as possible. This means ruthlessly eliminating as much baggage, waste, and other overhead as possible.

How it works

GUID generation and guarantees

GUIDs are represented as 64bit integers and are composed of (as described by the snowflake README):

  • time - 41 bits (millisecond precision with a custom epoch gives us 69 years)
  • configured machine id - 10 bits - gives us up to 1024 machines
  • sequence number - 12 bits - rolls over every 4096 per machine (with protection to avoid rollover in the same ms)

Sorting - Time Ordered

Strictly sorted:

  • GUIDs generated in a single request by a worker will strictly sort.
  • GUIDs generated one second or longer apart, by more than one worker, will strictly sort.
  • GUIDs generated over multiple requests by the same worker, will strictly sort.

Roughly sorted:

  • GUIDs generated by multiple workers within a second could roughly sort.

An example of roughly sorted:

If client A requests three GUIDs from worker A in one request, and client B requests three GUIDs from worker B in another request, and both requests are processed within the same second, together they may sort like:

	GUID-A1
	GUID-A2
	GUID-B1
	GUID-B2
	GUID-A3
	GUID-B3

NOTE: The A GUIDs will strictly sort, as will B's.

Clients

Clients implement a simple wire-protocol that is specified below. Implementing a client in your favorite language is trivial and should require no dependencies.

Failure Recovery

Each client should keep a list of addresses of all known worker process (or use DNS) so that if one fails, it can move to another. To recover from a lost connection, a client should randomly select another address from its list, or in the case of DNS: reconnect using the same address allowing DNS to choose the next IP.

See noeq.go for a working example.

Protocol

Auth Request

If the server has its NOEQ_TOKEN environment variable set to an non-empty string, the server will require authentication.

	------------------------
	|0|<len byte>|token ...|
	------------------------

An auth request starts with byte(0) and then a byte that is the length of the follwing token, followed by the token string. (NOTE: If a client and server hang during authentication, it's probably because the token is the client sent is too short.)

Id Request:

	-------
	|uint8|
	-------

A request must contain only one byte. The value of the byte tells the server how many ids to respond with. A client can request up to 255 (or max uint8) ids per request.

Response:

	-------------------------------------------------- ...
	|uint8|uint8|uint8|uint8|uint8|uint8|uint8|uint8|  ...
	-------------------------------------------------- ...

Each id comes as a 64bit integer in network byte order. The number of 64bit integers returned is the request-byte * 8

Errors:

Errors are logged by the server to stdout. Clients will have their connection closed to signal the need to try elsewhere until the server can recover. This generally happens if the servers clock is running backwards.

Benchmarks (fwiw)

MacAir 3, OS X 10.7.2, 2.13 GHz Intel Core 2 Duo, 4 GB 1067 MHz DDR3)

Id Generation without encoding or network latency

This is the benchmark done by snowflake and reported in their README.

	BenchmarkIdGeneration	675 ns/op	# 1.481 million ids/s

Id Generation with encoding and without network latency

I find these benchmarks more realistic. The ids must be encoded so we want to know how fast an id can be generated and encoded in order to hit the wire. Benchmarks including a network are left as an exercise for the reader because all networks vary.

These show that when a client can safely ask for more one than one id at a time, they can reduce time to wire and the expensive read/write operations.

	BenchmarkServe01	 1677 ns/op	# 596303 ids/sec
	BenchmarkServe02	 2352 ns/op	# 850340 ids/sec
	BenchmarkServe03	 3067 ns/op	# 978155 ids/sec
	BenchmarkServe05	 4436 ns/op	# 1.127 million ids/sec
	BenchmarkServe08	 6436 ns/op	# 1.243 million ids/sec
	BenchmarkServe13	10169 ns/op	# 1.278 million ids/sec
	BenchmarkServe21	16257 ns/op	# 1.292 million ids/sec
	BenchmarkServe34	25603 ns/op	# 1.328 million ids/sec
	BenchmarkServe55	39693 ns/op	# 1.386 million ids/sec

Contributing

This is Github. You know the drill. Please make sure you keep your changes in a branch other than master and in nice, clean, atomic commits. If you modify a .go file, please use gofmt with no parameters to format it; then hit the pull-request button.

Issues

These are tracked in this repos Github issues tracker.

See Also

Noeq command line util: http://github.com/bmizerany/noeq

Noeq.go for Go: http://github.com/bmizerany/noeq.go

Thank you

I want to make sure I give the Snowflake team at Twitter as much credit as possible. The heart of this program is their doing.

LICENSE

Copyright (C) 2011 by Blake Mizerany (@bmizerany)

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

More Repositories

1

pat

Go
1,413
star
2

roundup

eliminate bugs and weeds from shell scripts
Shell
427
star
3

sinatra-activerecord

Extends Sinatra with ActiveRecord helper methods and Rake tasks (now maintained by @holman).
Ruby
194
star
4

perks

Effective Computation of Things
Go
186
star
5

pq

Go
181
star
6

assert

Asserts to Go testing
Go
142
star
7

sinatra-dj

a simple piglatin app to show off sinatra + DJ
Ruby
81
star
8

aws4

A Go package for AWS Signature version 4
Go
73
star
9

sinatra-redis

A light extension for using redis with Sinatra
Ruby
60
star
10

lizzy

An DSL for creating AMQP agents; quickly.
Ruby
53
star
11

vendor

Vendor copies go dependencies to ./vendor
Go
52
star
12

onis

runtime object introspection for ruby process
Ruby
44
star
13

pq.go

Go
39
star
14

sinatra-captcha

Quick, simple, fast captcha for Sinatra
Ruby
37
star
15

mc

Binary Memcached client for Go with SASL support
Go
34
star
16

heroku-sinatra-app

Ruby
32
star
17

aws

An old simple AWS client in Go (use bmizerany/aws4 for more up to date aws usage).
Go
26
star
18

sinatra-any

Granular before filters for sinatra
Ruby
24
star
19

redis-erl

Minimilast Redis Client for Erlang
Erlang
21
star
20

swirl

API version agnostic EC2 client
Ruby
20
star
21

beldam

clean, easy, ec2 wrapper
Ruby
14
star
22

logplex

Go
14
star
23

swirl-node

JavaScript
13
star
24

em-swirl

An evented Ruby EC2 driver for EventMachine
Ruby
12
star
25

recho

echo(1) in ruby
Ruby
11
star
26

nodefinder

Erlang node finders
11
star
27

pqx

Go
10
star
28

noeq.go

Go client for noeqd
Go
10
star
29

sinatra-slaushed

redis powered authentication for Sinatra
Ruby
9
star
30

frylock

Ruby + CLI + DSL = Command line apps made easy! (Not ready yet)
Ruby
8
star
31

supermemo

SM2 application
Ruby
7
star
32

inotify-ffi

Ruby
7
star
33

guns

It has two pipes; got tickets to the show?
Ruby
7
star
34

noeq

A command line client for noeqd
Go
6
star
35

lpx

Logplex framing parser
Go
6
star
36

lively-docs

Escape the hell of `rake rdoc` .. edit .. `rake rdoc` .. edit .. etc.
5
star
37

pat.go

Now at bmizerany/pat
Go
5
star
38

WAck

Ack! for the web.
Ruby
5
star
39

heroku-docs

Documentation for Heroku, in the form of a Sinatra app serving markdown text files.
Ruby
5
star
40

jabberlang

Erlang jabber client
5
star
41

sinatra-template-vendor

Ruby
4
star
42

domainy

for getting the base of a domain
Ruby
4
star
43

RedisPl

Redis Perl Bindings (from svn://svn.rot13.org/Redis/)
Perl
3
star
44

lameetup

Ruby
3
star
45

sinatra-sassacache

Extends Sinatra with a route to compile and cache Sass templates
3
star
46

test

a test for work
3
star
47

attrubates

Arrtubates sets meta on meta for objects and adds additional rendering helpers for ActionController::Base in Rails.
Ruby
3
star
48

lorem-me

Hacker's Lorem Ipsum Genereator with Gists.
Ruby
3
star
49

sinatra-ext-pres

Sinatra Extensions Presentation
Ruby
2
star
50

schemafinder

2
star
51

fido

he fetches
Shell
2
star
52

hk

Fast Heroku client
Go
2
star
53

Elephant

An Adium Plugin for keeping realtime backup/full-text search of all chats
Objective-C
2
star
54

pages

Go
2
star
55

aws.go

Go
2
star
56

muni

Quick muni notifications for us SF dwellers
1
star
57

doozer-bench

bechmark the doozerz
Go
1
star
58

bang

banger
Ruby
1
star
59

mc.go

Go
1
star
60

onis-web

1
star
61

vips

Go
1
star
62

fsh

Fuck syntax highlighting. The correct default for the Atom (atom.io) editor.
CSS
1
star
63

theherms

example - don't look
1
star