• Stars
    star
    293
  • Rank 141,748 (Top 3 %)
  • Language
    Assembly
  • License
    MIT License
  • Created over 3 years ago
  • Updated about 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Simple dense bitmap index in Go with binary operators

kelindar/bitmap
Go Version PkgGoDev Go Report Card License Coverage

SIMD-Vectorized Bitmap (Bitset) in Go

This package contains a bitmap implementation, backed by a slice of []uint64 and designed for dense small or medium collections. This implementation focuses on high performance by avoiding heap allocations, unrolling loops and implementing SIMD vectorization in assembly.

Features

  • Optimized for zero heap allocation for all of the important methods of the bitmap.
  • Optimized by vectorized instructions (SIMD) used for certain operations such as boolean algebra.
  • Support for boolean algebra that makes it perfect to implement bitmap indexes.
  • Support for bit counting with operations such Min(), Max(), Count() and more.
  • Support for fast iteration over bits set to one by using an unrolled loop.
  • Support for in-place filtering based on a user-defined predicate.
  • Support for binary encoding and can be read/written and has a no-copy slice conversion.
  • Support for reusability by providing Clone() and Clear() operations.

Documentation

The general idea of this package is to have a dead simple way of creating bitmaps (bitsets) that provide maximum performance on the modern hardware by using vectorized single-instruction multiple data (SIMD) operations. As opposed to something as roaring bitmaps which are excellent for sparse data, this implementation is designed to be used for small or medium dense bit sets. I've used this package to build a columnar in-memory store, so if you want to see how it can be used for indexing, have a look at kelindar/column. I'd like to specifically point out the indexing part and how bitmaps can be used as a good alternative to B*Trees and Hash Maps.

First, here's what you need to do in order to import this package.

import "github.com/kelindar/bitmap"

Boolean Algebra

Perhaps one of the most useful features of this package is the vectorized implementation of boolean operations allowing us to perform boolean algebra on multiple bitmaps. For example, let's imagine that we have a dataset containing books, and four bitmaps defining one of the four properties of each book. In the figure below, you can imagine that our books can be on "columns" and each bit in a bitmap defines whether this attribute exists on a book or not.

kelindar/bitmap

Now, if we want to find all books that were recently published and have an ebook available, we can use an And() method on our two bitmaps in order to combine them. In the example below we retrieve 3 hypothetical bitmaps and combine them to answer our query by calling and And() method to mutate the books bitmap twice.

books  := bitmapFor("books")           // bitmap.Bitmap
recent := bitmapFor("books_recent")    // bitmap.Bitmap
ebooks := bitmapFor("books_has_ebook") // bitmap.Bitmap

// And operation actually mutates our "books" bitmap
books.And(recent)
books.And(ebooks)

kelindar/bitmap

Now, what if we want to find recently published books which has e-book available but are not best-sellers? In that case, we could use binary AndNot() operation that hardware exposes. In the example below we combine

books.And(recent)
books.And(ebooks)
books.AndNot(bestsellers) 

kelindar/bitmap

Single Bit Operations

When dealing with single elements, this package supports simple single-bit operations. They include Set() and Remove() to set a bit to one and to zero respectively, as well as Contans() to check for a presence (value set to one) of a certain bit. These methods are simple to use and setting a bit which is out of range would automatically resize the bitmap.

In the example below we're creating a bitmap, setting one bit to one, checking its presence and setting it back to zero after.

var books bitmap.Bitmap

books.Set(3)                 // Set the 3rd bit to '1'
hasBook := books.Contains(3) // Returns 'true'
books.Remove(3)              // Set the 3rd bit to '0'

Bit Count and Search

When using a bitmap for indexing or free-list purposes, you will often find yourself in need of counting how many bits are set in a bitmap. This operation actually has a specialized hardware instruction POPCNT and an efficient implementation is included in this library. The example below shows how you can simply count the number of bits in a bitmap by calling the Count() method.

// Counts number of bits set to '1'
numBooks := books.Count()

On the other hand, you might want to find a specific bit either set to one or to zero, the methods Min(), Max() allow you to find first or last bit set to one while MinZero() and MaxZero() allow you to find first or last bit set to zero. The figure below demonstrates an example of that.

kelindar/bitmap

Iterate and Filter

The bits in the bitmap can also be iterated over using the Range method. It is a simple loop which iterates over and calls a callback. If the callback returns false, then the iteration is halted (similar to sync.Map).

// Iterate over the bits in the bitmap
bitmap.Range(func(x uint32) bool {
    println(x)
    return true
})

Another way of iterating is using the Filter method. It iterates similarly to Range but the callback returns a boolean value, and if it returns false then the current bit will be cleared in the underlying bitmap. You could accomplish the same using Range and Remove but Filter is significantly faster.

// Filter iterates over the bits and applies a callback
bitmap.Filter(func(x uint32) bool {
    return x % 2 == 0
})

Example Usage

In its simplest form, you can use the bitmap as a bitset, set and remove bits. This is quite useful as an index (free/fill-list) for an array of data.

import "github.com/kelindar/bitmap"
var books := bitmap.Bitmap
books.Set(300)      // sets 300-th bit
books.Set(400)      // sets 400-th bit
books.Set(600)      // sets 600-th bit (auto-resized)
books.Contains(300) // returns true
books.Contains(301) // returns false
books.Remove(400)   // clears 400-th bit

// Min, Max, Count
min, ok := books.Min() // returns 300
max, ok := books.Max() // returns 600
count := books.Count() // returns 2

// Boolean algebra
var other bitmap.Bitmap
other.Set(300)
books.And(other)      // Intersection
count = books.Count() // Now returns 1

Benchmarks

Benchmarks below were run on a pre-allocated bitmap of 100,000 elements containing with around 50% bits set to one.

cpu: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
BenchmarkBitmap/set-8         552331321    4.319 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/remove-8     1000000000    1.621 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/contains-8   1000000000    1.309 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/clear-8        26083383    90.45 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/ones-8          6751939    347.9 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/min-8         757831477    3.137 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/max-8        1000000000    1.960 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/min-zero-8    776620110    3.081 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/max-zero-8   1000000000    1.536 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/count-8         6071037    382.5 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/count-to-8     82777459    28.85 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/clone-8        20654008    111.5 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/and-8          16813963    143.6 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/andnot-8       16961106    141.9 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/or-8           16999562    141.7 ns/op    0 B/op    0 allocs/op
BenchmarkBitmap/xor-8          16954036    144.7 ns/op    0 B/op    0 allocs/op
BenchmarkRange/range-8            18225   131908 ns/op    0 B/op    0 allocs/op
BenchmarkRange/filter-8           25636    93630 ns/op    0 B/op    0 allocs/op

Contributing

We are open to contributions, feel free to submit a pull request and we'll review it as quickly as we can. This library is maintained by Roman Atachiants

License

Tile is licensed under the MIT License.

More Repositories

1

column

High-performance, columnar, in-memory store with bitmap indexing in Go
Go
1,405
star
2

binary

Generic and fast binary serializer for Go
Go
181
star
3

talaria

TalariaDB is a distributed, highly available, and low latency time-series database for Presto
Go
124
star
4

tile

Tile is a 2D grid engine, built with data and cache friendly ways, includes pathfinding and observers.
Go
107
star
5

ecs

Example of Entity Component System in Go
Go
51
star
6

misakai-baker

Baker is a static site generator for C# / .Net people.
C#
41
star
7

demo-chat

A simple chat application built using emitter.io
JavaScript
31
star
8

intmap

Fast integer map for uint32-to-uint32
Go
21
star
9

gocc

Generate Go Plan9 assembly from C code
Go
20
star
10

event

Simple internal event bus for Go applications
Go
18
star
11

lua

Wrapper around LUA script executor for long-running scripts
Go
17
star
12

circular-buffer

An efficient implementation of a resizeable circular byte buffer in C#
C#
15
star
13

spike-build

Spike.Build is the set of libraries that perform parsing and code-generation for creating various client SDKs for Spike Engine.
C#
15
star
14

simd

Package with auto-vectorized math functions for Go
Assembly
13
star
15

stock-explorer

Stock trading tools experiment
Go
13
star
16

xxrand

Fast, scalable pseudo random number generator based on xxh3
Go
10
star
17

smutex

Simple implementation of a sharded mutex in Go
Go
9
star
18

analytics-dashboard

Realtime analytics dashboard of Twitter streaming data using Emitter.
C#
9
star
19

iostream

Simple binary reader and writer
Go
8
star
20

hashbench

Benchmarking hash functions in Go
Go
7
star
21

loader

Golang library that allows for downloading things from a URL (http, s3, ...)
Go
6
star
22

misakai-kafka

Misakai.Kafka is a high performance Apache Kafka client for C#.
C#
6
star
23

twitter-stream

HTML5/C# visualization of a Twitter sample stream in real-time
C#
6
star
24

evolve

Golang implementation of a binary genetic algorithm with random binary crossover & mutation
Go
6
star
25

etop

Command-line utility for emitter.io cluster status.
Go
6
star
26

demo-presence

This demo showcases presence feature of emitter.io
HTML
4
star
27

emitter-cluster

Sample configurations for emitter cluster
4
star
28

emitter-actor

Example of an Actor Model built with Emitter
Go
3
star
29

tcp

Simple tcp server in Go
Go
3
star
30

spike-box

A modern experimental web server with browser DOM to server-side javascript data binding.
JavaScript
3
star
31

docker

Various docker repositories for automated build.
Perl
3
star
32

markov

Simple Markov chain implementation
Go
3
star
33

buddy

Buddy memory allocator for hashed strings in Go
Go
3
star
34

simplex

Simplex noise in Go
Go
3
star
35

demo-paint

Collaborative paint demo for emitter.io
JavaScript
2
star
36

spiral

Generating spiral galaxy shape
Go
2
star
37

demo-microservice

Demo of emitter.io enabling a web application using docker, rancher and a set of microservices
JavaScript
1
star
38

emitter-client-server

Demo of a client-server app with Emitter & Go
Go
1
star
39

process

Golang library for process usage retrieval
Go
1
star
40

http-echo

This is a simple HTTP echo server which fails from time to time. Perfect to learn resiliency.
Go
1
star
41

pmml2lua

POC for converting PMML to LUA
Go
1
star
42

zmachine

Converting a Z-Machine in Go to an online system
Go
1
star
43

blog

1
star
44

metering

This metering plugin for emitter.io broker persists usage in Google Datastore.
Go
1
star