  • Language: Java
  • License: Apache License 2.0

Repository Details

An efficient sparse bit set implementation for Java

SparseBitSet

TL;DR

Basically, if you need to set a large number of bits, or bits at extremely high offsets, you probably want to use this SparseBitSet. All other alternatives are essentially off the table; the Java BitSet class is a non-starter. Performance is superior to the standard Java BitSet in almost all cases.

Preface

You know how the internets are; a link that exists today is gone tomorrow. A while ago I had a need for an efficient Sparse BitSet in Java, and found a presentation and code by Dr. Bruce K. Haddon. Going back later, I found the links I had used to find it were dead. Some internet sleuthing later, I found and contacted Dr. Haddon and he was kind enough to send me the presentation again. I have created this project to capture the code for others, as well as the presentation. I can take credit for neither.

Maven

<dependency>
   <groupId>com.zaxxer</groupId>
   <artifactId>SparseBitSet</artifactId>
   <version>1.2</version>
   <scope>compile</scope>
</dependency>

The Problem and Alternatives

The standard Java BitSet is terribly memory inefficient. To store a single bit using BitSet at bit 2^32-1 takes 2^27 32-bit words (2^26 64-bit "words"), not counting any Java object overhead.
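The arithmetic behind those figures can be checked with a short sketch. The class and method names here (`DenseBitSetCost`, `wordsNeeded`) are hypothetical and exist only for illustration. Because `java.util.BitSet` takes `int` indexes, the live demonstration uses a smaller offset than 2^32-1; the larger figure is computed rather than allocated.

```java
import java.util.BitSet;

final class DenseBitSetCost {
    /** Number of 64-bit words a dense bit array needs to reach bitIndex (0-based). */
    static long wordsNeeded(long bitIndex) {
        return (bitIndex >>> 6) + 1; // one word per 64 bits, rounded up
    }

    public static void main(String[] args) {
        long highBit = (1L << 32) - 1;
        long words64 = wordsNeeded(highBit);
        // Each 64-bit word is two 32-bit words.
        System.out.println(words64 + " 64-bit words = " + (2 * words64) + " 32-bit words");

        // A single set bit forces java.util.BitSet to allocate every word below it.
        BitSet bits = new BitSet();
        bits.set(1 << 20);
        System.out.println("bits set: " + bits.cardinality()
                + ", bits of backing storage: " + bits.size());
    }
}
```

The point of the second half is that `size()` grows with the highest set index, not with the number of bits actually set.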

Using a HashSet of Integers results in 7 32-bit words of overhead for each bit, or ~448 32-bit words of overhead for 64 bits.

Using a HashMap, where the key = bitvalue / 64, and the value is a Long of packed bits, results in (for 64 bits) ~8 32-bit words overhead.
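As a rough illustration of the HashMap approach, the following sketch keys each 64-bit word by `bit / 64` and packs the bits into a boxed `Long`. The class name `HashMapBitSet` and its methods are assumptions made for this example, and it only handles non-negative bit indexes.

```java
import java.util.HashMap;
import java.util.Map;

final class HashMapBitSet {
    // key = bit / 64, value = the 64 packed bits of that word
    private final Map<Integer, Long> words = new HashMap<>();

    void set(int bit) {
        long mask = 1L << (bit & 63);
        // OR the new bit into the existing word, or insert the word if absent.
        words.merge(bit >>> 6, mask, (oldWord, m) -> oldWord | m);
    }

    boolean get(int bit) {
        Long word = words.get(bit >>> 6);
        return word != null && (word & (1L << (bit & 63))) != 0;
    }

    void clear(int bit) {
        // Returning null from the remapping function removes an all-zero word.
        words.computeIfPresent(bit >>> 6, (key, word) -> {
            long updated = word & ~(1L << (bit & 63));
            return updated == 0 ? null : updated;
        });
    }

    int cardinality() {
        int count = 0;
        for (long word : words.values()) {
            count += Long.bitCount(word);
        }
        return count;
    }

    public static void main(String[] args) {
        HashMapBitSet bits = new HashMapBitSet();
        bits.set(5);
        bits.set(Integer.MAX_VALUE);
        System.out.println(bits.get(5) + " " + bits.get(6) + " " + bits.cardinality());
    }
}
```

Every entry here pays for a boxed key, a boxed value, and a map node, which is where the per-word overhead comes from.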

Using a custom hash table, where the key is an int = bitvalue / 64, and the value is a packed long, results in (for 64 bits) ~4 32-bit words overhead.
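One way such a custom table might look is an open-addressing table over parallel primitive arrays, which avoids boxing and per-entry node objects. This is a minimal sketch under stated assumptions: the name `PackedLongTable`, the hash mixing constant, the 0.75 load factor, and the lack of a clear operation (which lets a zero value mark an empty slot) are all choices made for illustration, not details from the analysis above.

```java
final class PackedLongTable {
    private int[] keys = new int[16];      // word index = bit / 64
    private long[] values = new long[16];  // packed bits; 0 marks an empty slot
    private int used;                      // number of occupied slots

    void set(int bit) {
        // Keep the table at most 75% full so probing always finds an empty slot.
        if ((used + 1) * 4 > keys.length * 3) {
            grow();
        }
        int key = bit >>> 6;
        int slot = find(keys, values, key);
        if (values[slot] == 0) {
            keys[slot] = key;
            used++;
        }
        values[slot] |= 1L << (bit & 63);
    }

    boolean get(int bit) {
        int slot = find(keys, values, bit >>> 6);
        return (values[slot] & (1L << (bit & 63))) != 0;
    }

    /** Linear probing: returns the slot holding key, or the empty slot where it belongs. */
    private static int find(int[] keys, long[] values, int key) {
        int mask = keys.length - 1;        // table length is always a power of two
        int h = key * 0x9E3779B9;          // cheap multiplicative mixing
        int slot = (h ^ (h >>> 16)) & mask;
        while (values[slot] != 0 && keys[slot] != key) {
            slot = (slot + 1) & mask;
        }
        return slot;
    }

    private void grow() {
        int[] newKeys = new int[keys.length * 2];
        long[] newValues = new long[values.length * 2];
        for (int i = 0; i < keys.length; i++) {
            if (values[i] != 0) {
                int slot = find(newKeys, newValues, keys[i]);
                newKeys[slot] = keys[i];
                newValues[slot] = values[i];
            }
        }
        keys = newKeys;
        values = newValues;
    }

    public static void main(String[] args) {
        PackedLongTable bits = new PackedLongTable();
        bits.set(1_000_000);
        System.out.println(bits.get(1_000_000) + " " + bits.get(1_000_001));
    }
}
```

Each occupied slot costs one `int` key and one `long` value, plus the slack from unused slots.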

The Solution: SparseBitSet

Using a virtual-memory-like structure, SparseBitSet has an overhead of ~0.03 32-bit words per 64 bits.
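To give a feel for what "virtual-memory-like" can mean, here is a simplified page-table-style sketch: the bit index is split into fields that select successive levels of arrays, and lower levels are allocated only when a bit in their range is first set. The class name `PagedBitSet` and the 11/7/7/6-bit split are assumptions chosen for illustration; they are not taken from the SparseBitSet source, which also includes optimizations this sketch omits.

```java
final class PagedBitSet {
    // Assumed split of a non-negative 31-bit index:
    // 11 bits (level 1) | 7 bits (level 2) | 7 bits (word in block) | 6 bits (bit in word)
    private static final int LEVEL_BITS = 7;
    private static final int LEVEL_SIZE = 1 << LEVEL_BITS;
    private static final int LEVEL_MASK = LEVEL_SIZE - 1;

    // Only the top-level array is allocated up front; the rest is created on demand.
    private final long[][][] level1 = new long[1 << 11][][];

    void set(int bit) {
        checkIndex(bit);
        int word = bit >>> 6;
        int i1 = word >>> (2 * LEVEL_BITS);
        int i2 = (word >>> LEVEL_BITS) & LEVEL_MASK;
        int i3 = word & LEVEL_MASK;

        long[][] area = level1[i1];
        if (area == null) {
            area = level1[i1] = new long[LEVEL_SIZE][];
        }
        long[] block = area[i2];
        if (block == null) {
            block = area[i2] = new long[LEVEL_SIZE];
        }
        block[i3] |= 1L << (bit & 63);
    }

    boolean get(int bit) {
        checkIndex(bit);
        int word = bit >>> 6;
        long[][] area = level1[word >>> (2 * LEVEL_BITS)];
        if (area == null) {
            return false; // nothing in this region was ever set
        }
        long[] block = area[(word >>> LEVEL_BITS) & LEVEL_MASK];
        return block != null && (block[word & LEVEL_MASK] & (1L << (bit & 63))) != 0;
    }

    private static void checkIndex(int bit) {
        if (bit < 0) {
            throw new IndexOutOfBoundsException("bit < 0: " + bit);
        }
    }

    public static void main(String[] args) {
        PagedBitSet bits = new PagedBitSet();
        bits.set(Integer.MAX_VALUE); // allocates one area and one block, not 2^25 words
        System.out.println(bits.get(Integer.MAX_VALUE) + " " + bits.get(0));
    }
}
```

Lookups cost a fixed number of array dereferences regardless of how many bits are set, and untouched regions of the index space cost nothing beyond a null reference.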

For a full analysis, read Dr. Haddon's slide stack.