Awesome HBase
A curated list of awesome HBase projects and resources.
HBase is a distributed, scalable, big data store.
Contents
Projects
Clients
- asynchbase - Fully asynchronous, non-blocking HBase client.
- gohbase - Pure Go client for HBase.
- happybase - Python client for HBase.
Cloud
- Amazon EMR - Amazon's Hadoop/HBase offering on AWS.
- Azure HDInsight - Microsoft's Hadoop/HBase offering on Azure.
- Cloudera Director - Run Hadoop/HBase clusters on AWS, Azure or Google Cloud.
- Google Cloud Bigtable - High-performance NoSQL database service accessible via the HBase client API.
- Hortonworks Cloudbreak - Provision Hadoop/HBase clusters on AWS, Azure, Google Cloud, or OpenStack.
Frameworks
Datasets
- Kite - High-level data layer for Hadoop/HBase.
Document
- HDocDB - HBase as a JSON document database.
Entity/JPA
- DataNucleus - JPA persistence layer with support for HBase.
- Gora - Persistence library for big data with support for HBase.
- HBase ORM - A production-grade HBase ORM library.
- HEntityDB - HBase as an entity database.
- Kundera - JPA client with support for HBase.
Geospatial
- GeoMesa - Spatial-temporal database with support for Accumulo, HBase, Cassandra, and Kafka.
Graph
- Gradoop - Research framework for scalable graph analytics built on Flink and HBase.
- HGraphDB - HBase as a TinkerPop graph database.
- HugeGraph - High-performance, scalable graph database that supports more than 10 billion vertices and edges.
- JanusGraph - Scalable graph database with support for Cassandra, HBase, Google Cloud Bigtable, and BerkeleyDB.
- NebulaGraph - High-performance distributed graph database.
- S2Graph - High-performance distributed graph database built on HBase.
SQL/OLAP
- AntsDB - Low-latency, high-concurrency, MySQL-compliant SQL layer for HBase.
- EsgynDB - Commercial SQL engine providing ACID transactions and BI analytics on top of Hadoop, based on Trafodion.
- Kylin - Extreme OLAP engine for big data that stores data in HBase.
- LeanXScale - Commercial product offering full ACID transactions and full SQL, built on Hadoop/HBase.
- Phoenix - SQL layer on top of HBase.
- Splice Machine - Commercial RDBMS built on top of HBase.
- Trafodion - Transactional SQL-on-Hadoop/HBase.
Time Series
- Axibase - Distributed time series database built on HBase.
- OpenTSDB - Scalable time series database built on HBase.
- Warp 10 - Time series database for sensor data.
Infrastructure
Secondary Indices
- hindex - Secondary index for HBase.
- Lily HBase Indexer - Quickly and easily search for content stored in HBase.
Transactions
- Haeinsa - Multi-row/multi-table transaction library for HBase.
- HBase-QoD - Vector-field consistency for HBase fine-grained transactional inter-DC replication.
- Omid - Transactional support for HBase.
- Tephra - Globally consistent transactions on top of HBase.
- Themis - Cross-row/cross-table transactions on HBase based on Google's Percolator.
Integrations
- Apex - Apex-HBase connector.
- Beam - Beam HBase integration.
- Camel - Camel HBase component.
- Cascading - HBase adapters for Cascading.
- Cascalog - Wrapper around Cascading.HBase for use in Cascalog.
- Crunch - HBase adapters for Crunch.
- Drill - HBase storage plugin for Drill.
- Elasticsearch - Elasticsearch import river for HBase.
- Flink - Flink-HBase connector.
- Gearpump - Gearpump integration for HBase.
- Giraph - Giraph input and output formats for HBase.
- HAWQ - HAWQ PXF external tables on HBase.
- Hive - Hive HBase integration.
- Impala - Impala support for querying HBase tables.
- Kafka - HBase Kafka proxy.
- Pig - Pig HBase integration.
- Presto - Presto-HBase connector.
- Pulsar - HBase connector for Pulsar.
- Ranger - HBase plugin for Apache Ranger.
- Spark - Spark-HBase connector.
- Spring for Apache Hadoop - Spring-Hadoop integration, including HBase support.
- Storm - Storm/Trident integration for HBase.
- Tajo - Tajo integration with HBase.
- Zeppelin - HBase shell interpreter for Apache Zeppelin.
Tools
- Ambari - Software for provisioning, managing, and monitoring Hadoop/HBase clusters.
- Cloudera Manager - Tool for managing Hadoop/HBase in production.
- DbSchema - Diagram-oriented database designer with support for HBase.
- Hannibal - Tool to monitor and maintain HBase clusters.
- h-rider - GUI for viewing and manipulating data in HBase.
- Hue - Smart analytics workbench that includes an HBase browser.
- Sematext SPM - Tool for monitoring HBase, HDFS, etc.
Miscellaneous
- HubSpot HBase support - Configs and tools for HBase at HubSpot, including Hystrix integration and coprocessors.
Resources
Books
- HBase in Action - Experience-driven guide that shows you how to use HBase.
- HBase: The Definitive Guide - Comprehensive guide to HBase.
- Architecting HBase Applications - Includes HBase principles, cluster guidelines, and in-depth case studies.
- HBase Administration Cookbook - How to master HBase configuration and administration.
- HBase Essentials - A practical guide to using HBase.
- HBase Design Patterns - Successful patterns to develop scalable applications with HBase.
- Learning HBase - Learn the fundamentals of HBase administration and development.
- HBase High Performance Cookbook - Exciting projects that teach you how to use HBase.
- Apache HBase Primer - A compact guide to HBase essentials.
- Pro Apache Phoenix - Basics and best practices for using Phoenix.
Papers
- Bigtable: A Distributed Storage System for Structured Data - The inspiration for HBase.
- Apache Hadoop Goes Realtime at Facebook - How Facebook deployed HBase to production.