• Stars
    star
    136
  • Rank 267,670 (Top 6 %)
  • Language
    Assembly
  • License
    BSD 3-Clause "New...
  • Created over 1 year ago
  • Updated 6 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

May be fastest bloom filter in C++/Go/Java/C#/Python/Rust

PageBloomFilter

Bloom filter with page, designed for storage density and query speed.

Benchmark

We got average latency per operation under 25ns in a benchmark with 500k elements on a Xeon-8374C machine. SIMD brings significant speed-up.

It runs slower on EPYC-7K83 machine.

Running test with SIMD on Xeon-8475B machine, we found aesni-hash helps a lot (amazing fast test operation under 7ns).

Running test with SIMD on EPYC-9T24 machine, we found aesni-hash helps a little.

API

auto bf = NEW_BLOOM_FILTER(500, 0.01);
if (bf.set("Hello")) {
    std::cout << "set new Hello" << std::endl;
}
if (bf.test("Hello")) {
    std::cout << "find Hello" << std::endl;
}

Go Version

// BloomFilter with 0.01 false positive rate for 500 items
bf := NewBloomFilter(500, 0.01)
if bf.Set("Hello") {
    fmt.Println("set new Hello")
}
if bf.Test("Hello") {
    fmt.Println("find Hello")
}

Function injection techique is avaliable for AMD64. It uses code compiled by clang in Go without CGO. A benchmark with 500k elements on a Xeon-8374C machine shows new implement runs much fast than the pure go implement.

name   old time/op  new time/op  delta
Set4   53.6ns ± 6%  26.5ns ± 6%  -50.52%  (p=0.000 n=20+20)
Test4  40.5ns ± 5%  21.2ns ± 5%  -47.63%  (p=0.000 n=20+18)
Set5   56.4ns ± 5%  28.0ns ± 5%  -50.34%  (p=0.000 n=20+19)
Test5  41.5ns ± 3%  18.8ns ± 7%  -54.72%  (p=0.000 n=20+19)
Set6   57.6ns ± 5%  29.1ns ± 5%  -49.44%  (p=0.000 n=20+20)
Test6  42.2ns ± 4%  18.5ns ± 7%  -56.22%  (p=0.000 n=20+18)
Set7   58.8ns ± 4%  30.8ns ± 9%  -47.68%  (p=0.000 n=20+20)
Test7  43.9ns ± 6%  18.9ns ± 8%  -56.98%  (p=0.000 n=20+19)
Set8   58.4ns ± 9%  32.4ns ± 5%  -44.53%  (p=0.000 n=20+19)
Test8  44.8ns ± 2%  18.4ns ± 7%  -58.86%  (p=0.000 n=19+20)

We suggest that user should execute go-inject.sh to gnerate new injecting code before build. Clang, binutils and python are needed.

Benchmark shows it runs 3x time faster than another famous bloom filter implement bits-and-blooms:

cpu: Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz
BenchmarkBitsAndBloomSet-6               1000000               140.0 ns/op
BenchmarkBitsAndBloomTest-6              1000000                81.68 ns/op
BenchmarkPageBloomFilterSet-6            1000000                32.12 ns/op
BenchmarkPageBloomFilterTest-6           1000000                20.58 ns/op

Java Version

PageBloomFilter bf = PageBloomFilter.New(500, 0.01);
byte[] hello = "Hello".getBytes("UTF-8");
if (bf.set(hello)) {
    System.out.println("set new Hello");
}
if (bf.test(hello)) {
    System.out.println("find Hello");
}

Benchmark shows it runs much faster than Google's Guava and Alexandr Nikitin's bloom-filter-scala. We see Java version without dedicated optimization is inferior to the Go version.

// i7-10710U & OpenJDK-17
pbf-set:      50.962245 ns/op
pbf-test:     40.465323 ns/op
guava-set:   133.513980 ns/op
guava-test:  112.318321 ns/op
nikitin-set:  86.930928 ns/op
nikitin-test: 62.133052 ns/op

C# Version

var bf = PageBloomFilter.New(500, 0.01);
var hello = Encoding.ASCII.GetBytes("Hello")
if (bf.Set(hello)) {
    Console.WriteLine("set new Hello");
}
if (bf.Test(hello)) {
    Console.WriteLine("find Hello");
}

C# code is very similar to Java code, but runs slower.

// i7-10710U & .NET-7
pbf-set:  83.461274 ns/op
pbf-test: 74.953785 ns/op

Python Version

bf = pbf.create(500, 0.01)
if bf.set("Hello"):
    print("set new Hello")
if bf.test("Hello"):
    print("find Hello")

Python with c extension is still slow, but 8x time faster than pybloom.

// i7-10710U & Python-3.11.3
pbf-set:       307.835638 ns/op
pbf-test:      289.679349 ns/op
pybloom-set:  2770.372372 ns/op
pybloom-test: 2417.377588 ns/op

Rust Version

let mut bf = pbf::new_bloom_filter(500, 0.01);
let hello = "Hello".as_bytes();
if (bf.set(hello)) {
    println!("set new Hello");
}
if (bf.test(hello)) {
    println!("find Hello");
}

Rust verison is also lack of dedicated optimiztion, but faster than Java version a lot. I believe it sould be faster, because I am new at this language.

// i7-10710U & Rust-1.65.0
pbf-set:  45.99ns/op
pbf-test: 27.81ns/op

Ranking

With test data on i7-10710U machine, we got performance rank: C++, Go, Rust, Java, C#, Python.

Theoretical Analysis

Bytes per element - False positive rate

Occupied ratio - False positive rate


【Chinese】 【English】