• Stars
    star
    937
  • Rank 48,766 (Top 1.0 %)
  • Language
    C++
  • License
    Other
  • Created over 11 years ago
  • Updated about 3 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Cuckoo Filter

Overview

Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. While Bloom filters are well-known space-efficient data structures to serve queries like "if item x is in a set?", they do not support deletion. Their variances to enable deletion (like counting Bloom filters) usually require much more space.

Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing (and therefore named as cuckoo filter). It is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact, thus a cuckoo filter could use less space than conventional Bloom filters, for applications that require low false positive rates (< 3%).

For details about the algorithm and citations please use:

"Cuckoo Filter: Practically Better Than Bloom" in proceedings of ACM CoNEXT 2014 by Bin Fan, Dave Andersen and Michael Kaminsky

API

A cuckoo filter supports following operations:

  • Add(item): insert an item to the filter
  • Contain(item): return if item is already in the filter. Note that this method may return false positive results like Bloom filters
  • Delete(item): delete the given item from the filter. Note that to use this method, it must be ensured that this item is in the filter (e.g., based on records on external storage); otherwise, a false item may be deleted.
  • Size(): return the total number of items currently in the filter
  • SizeInBytes(): return the filter size in bytes

Here is a simple example in C++ for the basic usage of cuckoo filter. More examples can be found in example/ directory.

// Create a cuckoo filter where each item is of type size_t and
// use 12 bits for each item, with capacity of total_items
CuckooFilter<size_t, 12> filter(total_items);
// Insert item 12 to this cuckoo filter
filter.Add(12);
// Check if previously inserted items are in the filter
assert(filter.Contain(12) == cuckoofilter::Ok);

Repository structure

  • src/: the C++ header and implementation of cuckoo filter
  • example/test.cc: an example of using cuckoo filter
  • benchmarks/: Some benchmarks of speed, space used, and false positive rate

Build

This libray depends on openssl library. Note that on MacOS 10.12, the header files of openssl are not available by default. It may require to install openssl and pass the path to lib and include directories to gcc, for example:

$ brew install openssl
# Replace 1.0.2j with the actual version of the openssl installed
$ export LDFLAGS="-L/usr/local/Cellar/openssl/1.0.2j/lib"
$ export CFLAGS="-I/usr/local/Cellar/openssl/1.0.2j/include"

To build the example (example/test.cc):

$ make test

To build the benchmarks:

$ cd benchmarks
$ make

Install

To install the cuckoofilter library:

$ make install

By default, the header files will be placed in /usr/local/include/cuckoofilter and the static library at /usr/local/lib/cuckoofilter.a.

Contributing

Contributions via GitHub pull requests are welcome. Please keep the code style guided by Google C++ style. One can use clang-format with our provided .clang-format in this repository to enforce the style.

Authors

More Repositories

1

libcuckoo

A high-performance, concurrent hash table
C++
1,540
star
2

epaxos

Go
610
star
3

SuRF

First Practical and General-purpose Range Filter
C++
533
star
4

rdma_bench

A framework to understand RDMA
C
367
star
5

mica

MICA: A Fast In-memory Key-Value Store (see isca2015 branch for the ISCA2015 version)
C
203
star
6

memc3

MemC3
C
150
star
7

HOPE

Order-preserving key encoder
C++
122
star
8

fasst

Source code for our OSDI 2016 paper
C++
106
star
9

HERD

Java
62
star
10

cicada-engine

The Cicada engine
C++
56
star
11

paper_skel

A LaTeX paper skeleton for CS systems conference formats
Python
55
star
12

ffbf

Feed-forward Bloom filters
C
52
star
13

rankselect

Space-Efficient, High-Performance Rank & Select Structures on Uncompressed Bit Sequences
C++
51
star
14

mica2

A fast in-memory key-value store
C++
49
star
15

libcuckoo-c

High-performance Concurrent Cuckoo Hashing Library
47
star
16

gopt

Fast packet processing using CPUs
C
38
star
17

go-cuckoo

A high-performance, memory-efficient concurrent hash table
Go
36
star
18

cuckooswitch

A software-based Ethernet switch design built around a memory-efficient, high-performance, and highly-concurrent hash table for compact and fast FIB lookup
C
34
star
19

fast-succinct-trie

C++
25
star
20

nvram

Tools for safe management of persistent main memory.
C
25
star
21

libinger

https://www.usenix.org/conference/atc20/presentation/boucher
Rust
24
star
22

qlease

Go
22
star
23

cicada

Dependably fast multi-core in-memory transactions
19
star
24

gobin-codegen

Automatic codegen for encoding/binary marshaling
Go
17
star
25

cicada-exp-sigmod2017

Cicada SIGMOD 2017 evaluation
Python
16
star
26

microservices_microbenchmarks

Code for the benchmarks presented in https://www.usenix.org/conference/atc18/presentation/boucher
Rust
12
star
27

ARF

C++
11
star
28

msls-eval

Accurate and Fast Evaluation of Multi-Stage Log-Structured Designs
C++
10
star
29

catbench

CATBench, the Intel Cache Allocation Technology benchmarking suite described in our tech report, "Simple Cache Partitioning for Networked Workloads"
C
10
star
30

cicada-exp-sigmod2017-DBx1000

A fork of DBx1000 for Cicada SIGMOD 2017 evaluation
C++
9
star
31

SuRF-demo

SuRF demo
C++
5
star
32

biblio

TeX
5
star
33

libas-safe

POSIX async-signal safety without thinking about it
C
3
star
34

mica2-catbench

MICA 2 and gRPC for CATBench
C++
1
star
35

cicada-exp-sigmod2017-silo

A fork of Silo for Cicada SIGMOD 2017 evaluation
C++
1
star
36

tokio

Forked from tokio-rs/tokio
Rust
1
star
37

cicada-exp-sigmod2017-silo-masstree-beta

A fork of masstree-beta (referenced by Silo) for Cicada SIGMOD 2017 evaluation
C++
1
star
38

cicada-exp-sigmod2017-ermia

A fork of ermia for Cicada SIGMOD 2017 evaluation
C++
1
star