btree4j: Disk-based Prefix B+-tree written in Pure Java
What's Btree4j
Btree4j is a disk-based Prefix B+-tree written in Pure Java.
It's pretty fast and 100k ops/sec is expected even on laptop.
Using btree4j
<dependency>
<groupId>io.github.myui</groupId>
<artifactId>btree4j</artifactId>
<version>0.9.1</version>
</dependency>
Find usage in unit tests.
Features and Strength
Applied many improvements over the original Xindice's implementation as follows:
- Implementes Prefix B+-tree in which prefixes are selected carefully to minimize their length. In prefix B+-tree, key prefixes are managed by a TRIE-like smart algorithm.
Rudolf Bayer and Karl Unterauer. "Prefix B-trees", Proc. ACM Trans. Database Syst. 2, 1, pp.11-26), March 1977. [DOI]
-
Pointers are compressed using Variable Byte Codes so that more keys/values are fit in memory.
-
Support both unique and non-unique indexing. Storing duplicate keys is allowed for non-unique indexing.
-
Index file based on B+-tree is supported. Multiple values per a key is also supported. BTreeIndex stores
<bytes[] KEY, byte[] VALUE>
with VALUE stored on distinct data pages and pointers to them are managed by B+-Tree to avoid consuming many disk pages for large values in B+-tree. -
Prefix and range search is supported in addition to exact match using a flexible callback handler. Support LIKE match using wildcards.
-
Paging (virtual memory) support using LRU cache replacement policy and Freespace management.
-
Deletion and updates are, of course, supported.
-
Support efficient Bulk-loading.
-
Minimum dependencies to external libraries. Runs on Java 8 or later.
Sponsors
No sponsors yet. Will you be the first?
It will be my motivation to continue working on this project.
Credits
Copyright 2006 and onwards Makoto Yui
Copyright 1999-2007 The Apache Software Foundation
This software is originally developed for XBird based on Apache Xindice.