  • Stars: 558
  • Rank 79,819 (Top 2 %)
  • Language: Java
  • License: BSD 3-Clause "New...
  • Created almost 14 years ago
  • Updated over 10 years ago

Repository Details

Distributed database specialized in exporting key/value data from Hadoop

ElephantDB 0.5.1 (cascalog 2.x)

ElephantDB 0.4.5 (cascalog 1.x)

About

ElephantDB is a database that specializes in exporting key/value data from Hadoop. ElephantDB is composed of two components. The first is a library that is used in MapReduce jobs for creating an indexed key/value dataset that is stored on a distributed filesystem. The second component is a daemon that can download a subset of a dataset and serve it in a read-only, random-access fashion. A group of machines working together to serve a full dataset is called a ring.

Since the ElephantDB server doesn't support random writes, it is almost laughably simple. Once the server loads its subset of the data, it does very little. This makes ElephantDB rock-solid in production, since there are almost no moving parts.

ElephantDB server has a Thrift interface, so any language can make reads from it. The database itself is implemented in Clojure.

An ElephantDB datastore contains a fixed number of shards, each of which is an instance of a "Local Persistence". ElephantDB's local persistence engine is pluggable, and ElephantDB comes bundled with local persistence implementations for Berkeley DB Java Edition and LevelDB. On the MapReduce side, each reducer creates or updates a single shard in the DFS, and on the server side, each server serves a subset of the shards.
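
To make the shard layout above concrete, here is a minimal sketch in Java. It is not ElephantDB's actual code: the hash-modulo key placement, the round-robin assignment of shards to servers, and the names `ShardingSketch`, `shardFor`, `shardsForServer`, and `serverFor` are all assumptions made for illustration. The source only states that the number of shards is fixed and that each server serves a subset of them.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not ElephantDB's actual sharding code. */
public class ShardingSketch {

    /** Maps a key to one of numShards shards (assumed scheme: hash modulo shard count). */
    public static int shardFor(byte[] key, int numShards) {
        // floorMod keeps the result in [0, numShards) even for negative hash codes.
        return Math.floorMod(Arrays.hashCode(key), numShards);
    }

    /** Lists the shards one server would host under an assumed round-robin assignment. */
    public static List<Integer> shardsForServer(int serverIndex, int numServers, int numShards) {
        List<Integer> shards = new ArrayList<>();
        for (int shard = serverIndex; shard < numShards; shard += numServers) {
            shards.add(shard);
        }
        return shards;
    }

    /** Finds the server that hosts a key's shard, i.e. where a read would be routed. */
    public static int serverFor(byte[] key, int numServers, int numShards) {
        return shardFor(key, numShards) % numServers;
    }

    public static void main(String[] args) {
        int numShards = 8;   // fixed when the datastore is created
        int numServers = 3;  // size of the ring
        byte[] key = "user:42".getBytes(StandardCharsets.UTF_8);
        System.out.println("key maps to shard " + shardFor(key, numShards)
                + " on server " + serverFor(key, numServers, numShards));
        for (int s = 0; s < numServers; s++) {
            System.out.println("server " + s + " hosts shards "
                    + shardsForServer(s, numServers, numShards));
        }
    }
}
```

Because the shard count is fixed, the same key always lands in the same shard, which is what would let a reducer rebuild one shard independently of the others. In this sketch, the servers together cover every shard exactly once.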

ElephantDB supports hot-swapping, so a live server can be updated with a new set of shards without downtime.
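
One common way to get this kind of zero-downtime update in a read-only store is to keep the live data behind a single atomic reference and replace it in one step. The sketch below shows that general technique only. It is an assumption about how hot-swapping could work, not a description of ElephantDB's implementation, and the class `HotSwapSketch` and its methods are hypothetical. It needs Java 10 or later for `Map.copyOf`.

```java
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative only: hypothetical class, not ElephantDB's actual hot-swap mechanism. */
public class HotSwapSketch {

    // Readers always see one complete, immutable snapshot of the data.
    private final AtomicReference<Map<String, String>> current;

    public HotSwapSketch(Map<String, String> initial) {
        this.current = new AtomicReference<>(Map.copyOf(initial));
    }

    /** Reads from whichever snapshot is live at the time of the call. */
    public String get(String key) {
        return current.get().get(key);
    }

    /** Replaces the live snapshot in one atomic step; reads are never blocked. */
    public void swap(Map<String, String> newVersion) {
        current.set(Map.copyOf(newVersion));
    }

    public static void main(String[] args) {
        HotSwapSketch store = new HotSwapSketch(Map.of("a", "1"));
        System.out.println("before swap: a=" + store.get("a"));
        store.swap(Map.of("a", "2", "b", "3"));
        System.out.println("after swap: a=" + store.get("a") + ", b=" + store.get("b"));
    }
}
```

The design relies on the data being read-only: since a snapshot never changes after it is published, a reader that started on the old version finishes consistently, and later reads pick up the new one.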

Questions

Google group: elephantdb-user

Introduction

Introduction to ElephantDB

Tutorials

TODO: Write an updated tutorial for ElephantDB 0.4.x

Using ElephantDB in MapReduce Jobs

ElephantDB is hosted at Clojars. Clojars is a Maven repository that is easy to use with Maven or Leiningen. Use this dependency when using ElephantDB within your MapReduce jobs to create ElephantDB datastores. ElephantDB contains a module, elephantdb-cascading, which lets you create datastores from your Cascading workflows. elephantdb-cascalog is available for use with Cascalog >= 1.10.1.

Deploying ElephantDB server

TODO: Documentation on how to deploy ElephantDB.

Running the EDB Jar

TODO: Documentation on how to run ElephantDB

More Repositories

1. storm (Java, 8,834 stars): Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more
2. cascalog (Clojure, 1,376 stars): Data processing on Hadoop without the hassle.
3. storm-starter (Java, 942 stars): Learn to use Storm!
4. storm-contrib (Java, 580 stars): A collection of spouts, bolts, serializers, DSLs, and other goodies to use with Storm
5. storm-deploy (Clojure, 517 stars): One click deploy for Storm clusters on AWS
6. dfs-datastores (Java, 215 stars): Dead-simple vertical partitioning, compression, appends, and consolidation of data on a distributed filesystem.
7. storm-kestrel (Java, 134 stars): Library to use Kestrel as a spout within Storm
8. kafka-deploy (Clojure, 124 stars): Automated deploy for Kafka on AWS
9. storm-mesos (Java, 68 stars): Run Storm on top of the Mesos cluster resource manager
10. nanny (Python, 46 stars): A simple dependency management system for your projects.
11. cascalog-contrib (Java, 45 stars)
12. trident-memcached (Java, 41 stars): Trident state implementation for Memcached
13. cascalog-demo (Clojure, 26 stars): A short Cascalog program that produces a simplified version of a Facebook-like news feed.
14. basic-specter (Clojure, 23 stars): Implementation of core of Specter without any optimizations: a reference to understand the basics of how Specter works
15. cascading-batch-query (Java, 21 stars): Optimized joins using bloom filters on Hadoop via Cascading.
16. cascalog-workshop (Clojure, 18 stars): Materials for Cascalog workshop
17. elephantdb-cascalog (Clojure, 18 stars): Seamless integration of ElephantDB with Cascalog
18. trident-kafka (Java, 15 stars): NOTE: This project has been moved into storm-kafka in storm-contrib
19. elephantdb-cascading (Java, 13 stars): Adapters to write to ElephantDB using Cascading
20. specter-demo (Clojure, 13 stars): Code for Strange Loop talk on Specter
21. cascalog-conj (Clojure, 10 stars): Code from my presentation of Cascalog at Clojure/conj 2011
22. storm-website (CSS, 7 stars): Source for storm-project.net
23. thrift-dev (C++, 6 stars): Apache Thrift + additional patches that I need
24. specter-clojure-west (Clojure, 6 stars)
25. swarm (C++, 5 stars): Intense Space Invaders-like game with "terminal graphics"
26. warzone (Java, 4 stars): Turn based strategy game
27. formula-inverse (C, 4 stars): A high-speed 3D racing game where the track can curve any which way and your car is bound to the track
28. specter-wiki (4 stars): Repository for wiki of https://github.com/redplanetlabs/specter
29. cascalog-workshop-starter (Clojure, 2 stars): Starter code for Cascalog workshop
30. specter-presentation (Clojure, 2 stars)