Interesting papers I'd like to implement (or at least have implementations of)
This is a list of papers I would like to implement, or would like to have an
implementation of.  This list is likely to change as my interests change,
including deletions.  Do not expect this list to remain static.

These are in no particular order:

SONIK:  Efficient In-situ All Item Rank Generation using Bit Operations
    - https://arxiv.org/abs/1605.06992

CAMP: A Cost Adaptive Multi-Queue Eviction Policy for Key-Value Stores
    - http://dblab.usc.edu/users/papers/CAMPTR.pdf

SimString: A fast and simple algorithm for approximate string matching/retrieval
    - http://www.chokkan.org/software/simstring/

Simpira: cryptographic permutations designed to be fast on modern 64-bit
processors, yet provide a comfortable security margin against all
currently-known attacks.
    - http://mouha.be/simpira/

Autoscaling Bloom Filter: Controlling Trade-off Between True and False Positives
    - https://arxiv.org/abs/1705.03934

Adaptive Cuckoo-Filters
    - https://arxiv.org/abs/1704.06818

Continuous Top-k Queries over Real-Time Web Streams
    - https://arxiv.org/abs/1610.06500

A practical index for approximate dictionary matching with few mismatches
    - https://arxiv.org/abs/1501.04948

Robust benchmarking in noisy environments
    - https://arxiv.org/abs/1608.04295

Fast intersection of sorted lists with SSE:
    - https://highlyscalable.wordpress.com/2012/06/05/fast-intersection-sorted-lists-sse/
    - Also, https://arxiv.org/abs/1401.6399
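    For reference, the scalar baseline that SIMD intersection routines are
    usually compared against is a plain two-pointer merge. The sketch below is
    only that baseline, not the SSE algorithm from either link. The name
    `intersectSorted` is made up for illustration, and it assumes both inputs
    are sorted ascending with no duplicates.

```go
package main

import "fmt"

// intersectSorted returns the values present in both a and b.
// Assumes a and b are sorted ascending and contain no duplicates.
// (Hypothetical helper: scalar baseline, not the SSE version.)
func intersectSorted(a, b []uint32) []uint32 {
	var out []uint32
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] < b[j]:
			i++ // a is behind, advance it
		case a[i] > b[j]:
			j++ // b is behind, advance it
		default:
			out = append(out, a[i]) // match: emit and advance both
			i++
			j++
		}
	}
	return out
}

func main() {
	a := []uint32{1, 3, 5, 7, 9}
	b := []uint32{3, 4, 5, 6, 7}
	fmt.Println(intersectSorted(a, b)) // [3 5 7]
}
```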

PAD: Performance Anomaly Detection in Multi-Server Distributed Systems
    - https://www.microsoft.com/en-us/research/wp-content/uploads/2014/06/PAD-Performance-Anomaly-Detection-in-Multi-Server-Distributed-Systems.pdf

Detecting Abnormal Machine Characteristics in Cloud Infrastructures
    - https://ti.arc.nasa.gov/publications/4268/download/

PerfAugur: Robust Diagnostics for Performance Anomalies in Cloud Services
    - https://www.microsoft.com/en-us/research/publication/perfaugur-robust-diagnostics-for-performance-anomalies-in-cloud-services/

Statistical Techniques for Online Anomaly Detection in Data Centers
    - http://www.hpl.hp.com/techreports/2011/HPL-2011-8.pdf

Fast table-driven base64 encoding/decoding:
    - https://github.com/powturbo/TurboBase64/blob/master/turbob64d.c
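    As a starting point, the basic table-driven idea can be sketched with a
    single 256-entry lookup table that maps each input byte to its 6-bit value.
    This is a minimal sketch and does not attempt the optimisations in the
    linked code. The names `decodeTable` and `decodeBase64` are hypothetical,
    and it assumes padded input in the standard alphabet.

```go
package main

import (
	"errors"
	"fmt"
)

const alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

// decodeTable maps an input byte to its 6-bit value, or 0xFF if invalid.
var decodeTable [256]byte

func init() {
	for i := range decodeTable {
		decodeTable[i] = 0xFF
	}
	for i := 0; i < len(alphabet); i++ {
		decodeTable[alphabet[i]] = byte(i)
	}
}

// decodeBase64 decodes padded standard base64 (hypothetical helper).
func decodeBase64(src string) ([]byte, error) {
	if len(src)%4 != 0 {
		return nil, errors.New("input length must be a multiple of 4")
	}
	out := make([]byte, 0, len(src)/4*3)
	for i := 0; i < len(src); i += 4 {
		// Count trailing '=' padding, allowed only in the final group.
		pad := 0
		if i+4 == len(src) && src[i+3] == '=' {
			pad++
			if src[i+2] == '=' {
				pad++
			}
		}
		// Pack four 6-bit values into 24 bits; padding counts as zero bits.
		var v uint32
		for k := 0; k < 4; k++ {
			var d byte
			if k < 4-pad {
				d = decodeTable[src[i+k]]
				if d == 0xFF {
					return nil, errors.New("invalid base64 character")
				}
			}
			v = v<<6 | uint32(d)
		}
		out = append(out, byte(v>>16))
		if pad < 2 {
			out = append(out, byte(v>>8))
		}
		if pad < 1 {
			out = append(out, byte(v))
		}
	}
	return out, nil
}

func main() {
	b, err := decodeBase64("aGVsbG8=")
	fmt.Println(string(b), err) // hello <nil>
}
```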

Assembly versions of hash functions / cryptographic algorithms:
    - t1ha (Go version: https://github.com/dgryski/go-t1ha )
    - rc5 / rc6 (Go version: https://github.com/dgryski/go-rc5 / https://github.com/dgryski/go-rc6 )

In-memory data layout for Netflix's Hollow:
    - http://hollow.how/advanced-topics/#in-memory-data-layout

Omnisearch Index Formats
    - https://blog.twitter.com/2016/omnisearch-index-formats

NORX8 and NORX16: Authenticated Encryption for Low-End Systems
    - https://eprint.iacr.org/2015/1154

LightMAC: A MAC Mode for Lightweight Block Ciphers:
    - https://eprint.iacr.org/2016/190.pdf

Fast Deterministic Selection (adaptive QuickSelect)
    - https://arxiv.org/abs/1606.00484
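    For orientation, here is ordinary QuickSelect with a fixed middle pivot.
    It is not the paper's method: the pivot choice here is deliberately naive,
    and the name `quickSelect` is only illustrative. Any implementation of the
    paper would presumably replace the pivot-selection step.

```go
package main

import "fmt"

// quickSelect returns the k-th smallest element (0-based) of xs.
// It reorders xs in place and panics if k is out of range.
// Naive middle pivot: expected linear time, quadratic worst case.
func quickSelect(xs []int, k int) int {
	if k < 0 || k >= len(xs) {
		panic("k out of range")
	}
	lo, hi := 0, len(xs)-1
	for lo < hi {
		// Move the middle element to the end, then Lomuto-partition [lo, hi].
		mid := lo + (hi-lo)/2
		xs[mid], xs[hi] = xs[hi], xs[mid]
		pivot := xs[hi]
		store := lo
		for i := lo; i < hi; i++ {
			if xs[i] < pivot {
				xs[i], xs[store] = xs[store], xs[i]
				store++
			}
		}
		xs[store], xs[hi] = xs[hi], xs[store] // pivot lands at its final index
		switch {
		case k == store:
			return xs[k]
		case k < store:
			hi = store - 1
		default:
			lo = store + 1
		}
	}
	return xs[lo]
}

func main() {
	xs := []int{9, 1, 8, 2, 7, 3}
	fmt.Println(quickSelect(xs, 2)) // 3
}
```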

A Bloom filter based semi-index on q-grams
    - https://arxiv.org/abs/1507.02989

Faster Population Counts using AVX2 Instructions
    - https://arxiv.org/abs/1611.07612

Quasi-Succinct Indices (compressed inverted indexes):
    - http://vigna.di.unimi.it/ftp/papers/QuasiSuccinctIndices.pdf

Efficient Summing over Sliding Windows (stream statistics)
    - http://arxiv.org/pdf/1604.02450v1.pdf
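    A useful reference point when testing any implementation is the exact
    sliding-window sum kept in a ring buffer. The sketch below is that exact
    baseline, not the paper's algorithm, and it uses memory proportional to the
    window size. The type `windowSum` and its methods are hypothetical names,
    and the window size is assumed to be greater than zero.

```go
package main

import "fmt"

// windowSum keeps the exact sum of the last len(buf) values added.
type windowSum struct {
	buf  []int64 // ring buffer holding the current window
	next int     // index of the slot to overwrite next
	sum  int64   // running sum of everything in buf
}

// newWindowSum creates a window of the given size (assumed > 0).
func newWindowSum(size int) *windowSum {
	return &windowSum{buf: make([]int64, size)}
}

// Add pushes v into the window, evicts the oldest value,
// and returns the updated sum.
func (w *windowSum) Add(v int64) int64 {
	w.sum += v - w.buf[w.next]
	w.buf[w.next] = v
	w.next = (w.next + 1) % len(w.buf)
	return w.sum
}

func main() {
	w := newWindowSum(3)
	for _, v := range []int64{1, 2, 3, 4, 5} {
		fmt.Println(w.Add(v)) // prints 1, 3, 6, 9, 12 on separate lines
	}
}
```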

A Novel Technique for Long-Term Anomaly Detection in the Cloud
    - https://www.usenix.org/system/files/conference/hotcloud14/hotcloud14-vallis.pdf
    - Twitter's anomaly detection algorithm
    - related, http://www.ebaytechblog.com/2015/08/19/statistical-anomaly-detection/
    - related, http://nerds.airbnb.com/anomaly-detection/

TinySet - An Access Efficient Self Adjusting Bloom Filter Construction
    - http://www.cs.technion.ac.il/users/wwwb/cgi-bin/tr-get.cgi/2015/CS/CS-2015-03.pdf

Detecting Change in Data Streams:
    - https://cs.uwaterloo.ca/~shai/vldb04.pdf

Hierarchical Delta Debugging:
    - https://blog.acolyer.org/2015/11/17/hierarchical-delta-debugging/
    - (to go with https://github.com/dgryski/go-ddmin )

FastDTW: Toward Accurate Dynamic Time Warping in Linear Time and Space
    - http://cs.fit.edu/~pkc/papers/tdm04.pdf
    - many implementations to use as base, for example https://github.com/slaypni/fastdtw/blob/master/fastdtw.py
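    Before attempting FastDTW it helps to have the plain quadratic DTW it
    approximates, if only to check results against. The sketch below is that
    classic dynamic program, not FastDTW itself. It assumes |x-y| as the local
    cost and no window constraint, and `dtwDistance` is a hypothetical name.

```go
package main

import (
	"fmt"
	"math"
)

// dtwDistance returns the classic dynamic-time-warping distance between
// a and b with |x-y| as the local cost. It takes O(len(a)*len(b)) time and
// O(len(b)) space, and returns +Inf if either input is empty.
func dtwDistance(a, b []float64) float64 {
	if len(a) == 0 || len(b) == 0 {
		return math.Inf(1)
	}
	prev := make([]float64, len(b)+1) // previous row of the cost matrix
	cur := make([]float64, len(b)+1)  // row being filled
	for j := 1; j <= len(b); j++ {
		prev[j] = math.Inf(1)
	}
	// prev[0] stays 0: two empty prefixes align at zero cost.
	for i := 1; i <= len(a); i++ {
		cur[0] = math.Inf(1)
		for j := 1; j <= len(b); j++ {
			cost := math.Abs(a[i-1] - b[j-1])
			// Extend the cheapest of: diagonal match, step in a, step in b.
			cur[j] = cost + math.Min(prev[j-1], math.Min(prev[j], cur[j-1]))
		}
		prev, cur = cur, prev
	}
	return prev[len(b)]
}

func main() {
	a := []float64{1, 2, 3}
	b := []float64{1, 2, 2, 3}
	fmt.Println(dtwDistance(a, b)) // 0
}
```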

Mining frequent items in the time fading model
    - http://arxiv.org/pdf/1601.03892v1.pdf

Hierarchical Agglomerative Clustering:
    - http://nlp.stanford.edu/IR-book/html/htmledition/hierarchical-agglomerative-clustering-1.html
    - needed for https://www.microsoft.com/en-us/research/wp-content/uploads/2016/07/rebucket-icse2012.pdf
    - preliminary implementation of rebucket:  https://github.com/dgryski/go-rebucket

Balanced Allocation: Patience is not a Virtue (FirstDiff load balancing):
    - http://arxiv.org/abs/1602.08298

Continuously Maintaining Quantile Summaries of the Most Recent N Elements over a Data Stream
    - http://www.cs.ubc.ca/~xujian/paper/quant.pdf

The Eternal Sunshine of the Sketch Data Structure
    - http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.146.2889&rep=rep1&type=pdf

Copysets and Chainsets: A Better Way to Replicate
    - http://hackingdistributed.com/2014/02/14/chainsets/

A Fast Algorithm for Approximate Quantiles in High Speed Data Streams
    - http://web.cs.ucla.edu/~weiwang/paper/SSDBM07_2.pdf
    - this algorithm has haunted me for ages; I could never get my code working
    - unresponsive authors, details missing from the papers, etc.
    - there now appear to be more implementations that could be used as a base
