  • Stars: 108
  • Rank 321,259 (Top 7 %)
  • Language: Go
  • License: MIT License
  • Created about 5 years ago
  • Updated over 1 year ago


Repository Details

TalariaDB is a distributed, highly available, and low latency time-series database for Presto

Talaria

TalariaDB is a distributed, highly available, and low latency time-series database that stores real-time data. It's built on top of Badger DB.
Blog: https://engineering.grab.com/big-data-real-time-presto-talariadb

Announcement

We've moved! This repository is no longer actively maintained. Feel free to contribute to Kelindar/Talaria, which is actively maintained.

Grab has migrated to the latest build of that repository and will continue to contribute there.

About TalariaDB

At Grab, millions of transactions and connections take place on our platform every day, which calls for data-driven decision making, and those decisions need to be based on real-time data. For example, an experiment might inadvertently cause a significant increase in waiting time for riders.

To overcome the challenge of retrieving information from large amounts of data, we designed and built TalariaDB. It addresses our need to query at least 2-3 terabytes of data per hour with predictably low query latency and low cost. Most importantly, it integrates well with the ecosystems of different tools and lets us query data using SQL.

Architecture

[Figure: TalariaDB architecture diagram]

The diagram above shows how TalariaDB ingests and serves data.

  • The upstream ETL pipeline prepares ORC files and stores them in an S3 bucket as input.
  • AWS SQS is set up as the event notification service for the S3 bucket and notifies TalariaDB of any new object uploaded to S3.
  • Presto connects to TalariaDB through Thrift, since TalariaDB implements PrestoThriftService (https://prestodb.github.io/docs/current/connector/thrift.html).
  • AWS Route53 is used as the DNS service for TalariaDB, whose domain is registered on the Presto cluster. The IP addresses in the DNS record are registered by TalariaDB using a gossip protocol during the bootstrap phase.
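
The notification step above can be sketched in Go. This is a minimal illustration and not TalariaDB's actual ingestion code. It assumes the SQS message body carries a standard S3 event notification in JSON, and it only extracts the bucket and key of each newly created object. The names `parseS3Event` and `ObjectRef`, along with the sample bucket and key, are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"strings"
)

// s3Event mirrors only the fields of an S3 event notification used here.
type s3Event struct {
	Records []struct {
		EventName string `json:"eventName"`
		S3        struct {
			Bucket struct {
				Name string `json:"name"`
			} `json:"bucket"`
			Object struct {
				Key  string `json:"key"`
				Size int64  `json:"size"`
			} `json:"object"`
		} `json:"s3"`
	} `json:"Records"`
}

// ObjectRef (hypothetical name) identifies one newly uploaded file.
type ObjectRef struct {
	Bucket string
	Key    string
	Size   int64
}

// parseS3Event (hypothetical helper) returns the objects created by the
// events in one notification body, skipping non-creation events.
func parseS3Event(body []byte) ([]ObjectRef, error) {
	var ev s3Event
	if err := json.Unmarshal(body, &ev); err != nil {
		return nil, fmt.Errorf("decode S3 event: %w", err)
	}
	var refs []ObjectRef
	for _, r := range ev.Records {
		if !strings.HasPrefix(r.EventName, "ObjectCreated:") {
			continue
		}
		// Object keys arrive URL-encoded in S3 notifications.
		key, err := url.QueryUnescape(r.S3.Object.Key)
		if err != nil {
			return nil, fmt.Errorf("unescape key %q: %w", r.S3.Object.Key, err)
		}
		refs = append(refs, ObjectRef{Bucket: r.S3.Bucket.Name, Key: key, Size: r.S3.Object.Size})
	}
	return refs, nil
}

func main() {
	// Sample message with made-up bucket and key names.
	msg := []byte(`{"Records":[{"eventName":"ObjectCreated:Put","s3":{"bucket":{"name":"example-bucket"},"object":{"key":"events/my+file.orc","size":1024}}}]}`)
	refs, err := parseS3Event(msg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, ref := range refs {
		fmt.Printf("would ingest s3://%s/%s (%d bytes)\n", ref.Bucket, ref.Key, ref.Size)
	}
}
```

In a real deployment the message body would come from an SQS receive call rather than a literal, and the ORC file would then be downloaded and ingested. Both steps are left out of this sketch.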

Currently, this project is tightly coupled with AWS services such as SQS, S3 and Route53. We will make these components (storage, DNS) pluggable so that TalariaDB is useful in more generic cases.

Quick Start

Preconditions

  • Set up an AWS profile and make sure your machine can reach AWS and has sufficient permissions to read from S3 and SQS and to manipulate Route53 records.

Steps

  1. Set env vars
     export X_TALARIA_CONF=(path-to-this-repo)/config-ci.json
  2. Edit config-ci.json with your own configurations (including AWS Route53 and SQS configs)
  3. Start application
     go run (path-to-this-repo)/main.go
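
To illustrate how the `X_TALARIA_CONF` variable from the steps above might be consumed, here is a minimal Go sketch that reads the path from the environment and decodes the JSON file it points to. It is not TalariaDB's real configuration loader. The schema of config-ci.json is not described here, so the file is decoded into a generic map, and the helper names `parseConfig` and `loadConfigFromEnv` are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// parseConfig (hypothetical helper) decodes a JSON object into a generic
// map, because the real configuration schema is not assumed here.
func parseConfig(data []byte) (map[string]any, error) {
	var cfg map[string]any
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, fmt.Errorf("decode config: %w", err)
	}
	if cfg == nil {
		return nil, errors.New("config must be a JSON object")
	}
	return cfg, nil
}

// loadConfigFromEnv (hypothetical helper) reads the file named by the
// X_TALARIA_CONF environment variable and parses it.
func loadConfigFromEnv() (map[string]any, error) {
	path := os.Getenv("X_TALARIA_CONF")
	if path == "" {
		return nil, errors.New("X_TALARIA_CONF is not set")
	}
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("read %s: %w", path, err)
	}
	return parseConfig(data)
}

func main() {
	cfg, err := loadConfigFromEnv()
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("loaded %d top-level settings\n", len(cfg))
}
```

Failing fast with a clear message when the variable is unset or the file is not valid JSON makes a misconfigured quick start easier to diagnose.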

About Us

TalariaDB is maintained by:

License

TalariaDB is licensed under the MIT License (LICENSE.md).
