  • Stars: 392
  • Rank 109,735 (Top 3 %)
  • Language: Java
  • License: Apache License 2.0
  • Created over 5 years ago
  • Updated 5 months ago

KarelDB - A Relational Database Backed by Apache Kafka

KarelDB is a fully-functional relational database backed by Apache Kafka.

Maven

Releases of KarelDB are deployed to Maven Central.

<dependency>
    <groupId>io.kareldb</groupId>
    <artifactId>kareldb-core</artifactId>
    <version>1.0.0</version>
</dependency>

Server Mode

To run KarelDB, download a release, unpack it, and then modify config/kareldb.properties to point to an existing Kafka broker. Then run the following:

$ bin/kareldb-start config/kareldb.properties

In a separate terminal, enter the following command to start sqlline, a command-line utility for accessing JDBC databases.

$ bin/sqlline
sqlline version 1.9.0

sqlline> !connect jdbc:avatica:remote:url=http://localhost:8765 admin admin

sqlline> create table books (id int, name varchar, author varchar);
No rows affected (0.114 seconds)

sqlline> insert into books values (1, 'The Trial', 'Franz Kafka');
1 row affected (0.576 seconds)

sqlline> select * from books;
+----+-----------+-------------+
| ID |   NAME    |   AUTHOR    |
+----+-----------+-------------+
| 1  | The Trial | Franz Kafka |
+----+-----------+-------------+
1 row selected (0.133 seconds)

To access a KarelDB server from a remote application, use an Avatica JDBC client. A list of Avatica JDBC clients can be found here.
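
As a rough sketch of what such a client might look like in Java, the following uses plain JDBC with the same URL and admin/admin credentials as the sqlline session above. It assumes the Avatica client JAR is on the classpath and that a KarelDB server is listening on localhost:8765; the class and method names are illustrative only.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class RemoteBooksClient {
    // Builds the Avatica remote URL in the form used by the sqlline session.
    static String remoteUrl(String host, int port) {
        return "jdbc:avatica:remote:url=http://" + host + ":" + port;
    }

    public static void main(String[] args) {
        String url = remoteUrl("localhost", 8765);
        // Assumes the Avatica JDBC driver is available and the server is running.
        try (Connection conn = DriverManager.getConnection(url, "admin", "admin");
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select id, name, author from books")) {
            while (rs.next()) {
                // Columns are read by position: id, name, author.
                System.out.printf("%d | %s | %s%n",
                        rs.getInt(1), rs.getString(2), rs.getString(3));
            }
        } catch (SQLException e) {
            System.err.println("Could not query KarelDB: " + e.getMessage());
        }
    }
}
```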

If multiple KarelDB servers are configured with the same cluster group ID (see Configuration), then they will form a cluster and one of them will be elected as leader, while the others will become followers (replicas). If a follower receives a request, it will be forwarded to the leader. If the leader fails, one of the followers will be elected as the new leader.
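
The forwarding and failover behavior can be pictured with a small in-memory model. This is only a toy illustration: the class name, the "lowest live id wins" election rule, and the string-based requests are all assumptions made for the example, not KarelDB's actual election protocol.

```java
import java.util.Optional;
import java.util.TreeSet;

class ToyCluster {
    private final TreeSet<Integer> liveNodes = new TreeSet<>();

    void join(int nodeId) { liveNodes.add(nodeId); }
    void fail(int nodeId) { liveNodes.remove(nodeId); }

    // Assumed election rule for this sketch: the live node with the lowest id leads.
    Optional<Integer> leader() {
        return liveNodes.isEmpty() ? Optional.empty() : Optional.of(liveNodes.first());
    }

    // Any live node may receive a request; followers forward it to the leader.
    String handle(int receivingNode, String sql) {
        if (!liveNodes.contains(receivingNode)) {
            throw new IllegalStateException("node " + receivingNode + " is down");
        }
        int leaderId = leader().orElseThrow(() -> new IllegalStateException("no leader"));
        String route = receivingNode == leaderId
                ? "node " + leaderId + " (leader)"
                : "node " + receivingNode + " -> forwarded to leader " + leaderId;
        return route + " executes: " + sql;
    }

    public static void main(String[] args) {
        ToyCluster cluster = new ToyCluster();
        cluster.join(1);
        cluster.join(2);
        cluster.join(3);
        System.out.println(cluster.handle(3, "select * from books"));
        cluster.fail(1); // the leader fails, so a follower takes over
        System.out.println(cluster.handle(3, "select * from books"));
    }
}
```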

Embedded Mode

KarelDB can also be used in embedded mode. Here is an example:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

// Connection properties that select the KarelDB schema and SQL parser.
Properties properties = new Properties();
properties.put("schemaFactory", "io.kareldb.schema.SchemaFactory");
properties.put("parserFactory", "org.apache.calcite.sql.parser.parserextension.ExtensionSqlParserImpl#FACTORY");
properties.put("schema.kind", "io.kareldb.kafka.KafkaSchema");
// bootstrapServers is the Kafka broker list, supplied by the caller.
properties.put("schema.kafkacache.bootstrap.servers", bootstrapServers);
properties.put("schema.kafkacache.data.dir", "/tmp");

// try-with-resources closes the statement and connection automatically.
try (Connection conn = DriverManager.getConnection("jdbc:kareldb:", properties);
     Statement s = conn.createStatement()) {
    s.execute("create table books (id int, name varchar, author varchar)");
    s.executeUpdate("insert into books values(1, 'The Trial', 'Franz Kafka')");
    ResultSet rs = s.executeQuery("select * from books");
    ...
}

ANSI SQL Support

KarelDB supports ANSI SQL, using Calcite.

When creating a table, the primary key constraint should be specified after the columns, like so:

CREATE TABLE customers 
    (id int, name varchar, constraint pk primary key (id));

If no primary key constraint is specified, the first column in the table will be designated as the primary key.

KarelDB extends Calcite's SQL grammar by adding support for ALTER TABLE commands.

alterTableStatement:
    ALTER TABLE tableName columnAction [ , columnAction ]*
    
columnAction:
    ( ADD tableElement ) | ( DROP columnName )
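
To make the grammar concrete, here is a hedged sketch that applies two column actions in a single statement over JDBC. The customers table is the one from the CREATE TABLE example above; the email column and the helper names are hypothetical, and the exact spelling KarelDB's parser accepts should be checked against a running server.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

class AlterTableExample {
    // One ALTER TABLE statement with two comma-separated column actions:
    // ADD <tableElement> followed by DROP <columnName>.
    static String alterCustomers() {
        return "ALTER TABLE customers ADD email varchar, DROP name";
    }

    // Runs the statement on an already-open connection (remote or embedded).
    static void apply(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.execute(alterCustomers());
        }
    }

    public static void main(String[] args) {
        // Prints the statement; call apply(conn) with a live connection to run it.
        System.out.println(alterCustomers());
    }
}
```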

KarelDB supports the following SQL types:

  • boolean
  • integer
  • bigint
  • real
  • double
  • varbinary
  • varchar
  • decimal
  • date
  • time
  • timestamp

Basic Configuration

KarelDB is configured through the properties listed below. When using KarelDB as an embedded database, prefix each property name with schema. before passing it to the JDBC driver.

  • listeners - List of listener URLs that include the scheme, host, and port. Defaults to http://0.0.0.0:8765.
  • cluster.group.id - The group ID to be used for leader election. Defaults to kareldb.
  • leader.eligibility - Whether this node can participate in leader election. Defaults to true.
  • kafkacache.backing.cache - The backing cache for KCache, one of memory (default), bdbje, lmdb, mapdb, or rocksdb.
  • kafkacache.data.dir - The root directory for backing cache storage. Defaults to /tmp.
  • kafkacache.bootstrap.servers - A list of host and port pairs to use for establishing the initial connection to Kafka.
  • kafkacache.group.id - The group ID to use for the internal consumers, which needs to be unique for each node. Defaults to kareldb-1.
  • kafkacache.topic.replication.factor - The replication factor for the internal topics created by KarelDB. Defaults to 3.
  • kafkacache.init.timeout.ms - The timeout for initialization of the Kafka cache, including creation of internal topics. Defaults to 300 seconds.
  • kafkacache.timeout.ms - The timeout for an operation on the Kafka cache. Defaults to 60 seconds.
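
For embedded use, the schema. prefix can be applied mechanically. The helper below is a minimal sketch of that idea; the class name, method name, and the localhost:9092 broker address are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

class EmbeddedConfig {
    // Prefixes each KarelDB property name with "schema." for the embedded JDBC driver.
    static Properties withSchemaPrefix(Map<String, String> settings) {
        Properties props = new Properties();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            props.setProperty("schema." + e.getKey(), e.getValue());
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("kafkacache.bootstrap.servers", "localhost:9092"); // assumed broker address
        settings.put("kafkacache.backing.cache", "rocksdb");
        settings.put("kafkacache.data.dir", "/tmp");
        Properties props = withSchemaPrefix(settings);
        System.out.println(props.getProperty("schema.kafkacache.backing.cache"));
    }
}
```

The schemaFactory, parserFactory, and schema.kind entries shown in the Embedded Mode example would still need to be added to the resulting Properties object.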

Security

HTTPS

To use HTTPS, first configure the listeners with an https prefix, then specify the following properties with the appropriate values.

ssl.keystore.location=/var/private/ssl/custom.keystore
ssl.keystore.password=changeme
ssl.key.password=changeme

When using the Avatica JDBC client, the truststore and truststore_password can be passed in the JDBC URL as specified here.

HTTP Authentication

KarelDB supports both HTTP Basic Authentication and HTTP Digest Authentication, as shown below:

# authentication.method may be BASIC or DIGEST
authentication.method=BASIC
authentication.roles=admin,developer,user
# The realm must match the entry name in the JAAS file
authentication.realm=KarelDb-Props

In the above example, the JAAS file might look like this:

KarelDb-Props {
  org.eclipse.jetty.jaas.spi.PropertyFileLoginModule required
  file="/path/to/password-file"
  debug="false";
};

The PropertyFileLoginModule can be replaced with other implementations, such as LdapLoginModule or JDBCLoginModule.

When starting KarelDB, the path to the JAAS file must be set as a system property.

$ export KARELDB_OPTS=-Djava.security.auth.login.config=/path/to/the/jaas_config.file
$ bin/kareldb-start config/kareldb-secure.properties

When using the Avatica JDBC client, the avatica_user and avatica_password can be passed in the JDBC URL as specified here.
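
As an illustration, Avatica client properties are appended to the JDBC URL as semicolon-separated key=value pairs. The sketch below assembles such a URL with the authentication and truststore properties mentioned in this section; the helper class, the truststore path, and the credentials are placeholders, and the values are not escaped.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class SecureAvaticaUrl {
    // Joins Avatica client properties onto the remote URL with ';' separators.
    static String build(String serverUrl, Map<String, String> clientProps) {
        StringJoiner url = new StringJoiner(";");
        url.add("jdbc:avatica:remote:url=" + serverUrl);
        clientProps.forEach((key, value) -> url.add(key + "=" + value));
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("authentication", "BASIC");
        props.put("avatica_user", "admin");
        props.put("avatica_password", "admin");
        props.put("truststore", "/path/to/client.truststore"); // placeholder path
        props.put("truststore_password", "changeme");
        System.out.println(build("https://localhost:8765", props));
    }
}
```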

Kafka Authentication

Authentication to a secure Kafka cluster is described here.

Implementation Notes

KarelDB stores table data in topics of the form {tableName}_{generation}. A different generation ID is used whenever a table is dropped and re-created.

KarelDB uses three topics to hold metadata:

  • _tables - A topic that holds the schemas for tables.
  • _commits - A topic that holds the list of committed transactions.
  • _timestamps - A topic that stores the maximum timestamp that the transaction manager is allowed to return to clients.
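
One common reason to persist a maximum timestamp is so that a transaction manager never hands out the same timestamp twice after a restart: it reserves a range, durably records the upper bound, and only then issues timestamps up to that bound. The sketch below illustrates that general technique with an in-memory stand-in for the _timestamps topic. The class names and batch size are hypothetical, and this is not KarelDB's actual transaction manager code.

```java
import java.util.concurrent.atomic.AtomicLong;

class ToyTimestampOracle {
    // Stand-in for durable storage such as the _timestamps topic.
    static final class MaxTimestampStore {
        private final AtomicLong max = new AtomicLong(0);
        long read() { return max.get(); }
        void write(long value) { max.set(value); }
    }

    private static final long BATCH = 1000; // assumed reservation size
    private final MaxTimestampStore store;
    private long next;     // last timestamp handed out
    private long reserved; // highest timestamp this instance may return

    ToyTimestampOracle(MaxTimestampStore store) {
        this.store = store;
        // After a restart, resume past everything that might have been issued.
        this.next = store.read();
        this.reserved = this.next;
    }

    synchronized long nextTimestamp() {
        if (next >= reserved) {
            // Persist the new upper bound before issuing anything from the range.
            reserved = next + BATCH;
            store.write(reserved);
        }
        return ++next;
    }

    public static void main(String[] args) {
        MaxTimestampStore store = new MaxTimestampStore();
        ToyTimestampOracle oracle = new ToyTimestampOracle(store);
        System.out.println(oracle.nextTimestamp());
        // A "restarted" oracle skips the whole reserved range.
        System.out.println(new ToyTimestampOracle(store).nextTimestamp());
    }
}
```

The trade-off in this design is that a restart skips any unissued timestamps in the reserved range, in exchange for writing to durable storage only once per batch.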

Database by Components

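KarelDB is an example of a database built mostly by assembling pre-existing components. The ones named elsewhere in this document are Apache Calcite for SQL, Avatica for JDBC access, KCache as the Kafka-backed cache, and Apache Kafka itself for storage.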

See this blog for more on the design of KarelDB.

Future Enhancements

Possible future enhancements include support for secondary indices.
