• Stars
    189
  • Rank 197,618 (Top 5 %)
  • Language
    Java
  • License
    MIT License
  • Created almost 4 years ago
  • Updated about 1 month ago

Repository Details

Astra is a cloud-native search and analytics engine for log, trace, and audit data

KalDB

KalDB is a cloud-native search and analytics engine for log, trace, and audit data. It is designed to be easy to operate, cost-effective, and able to scale to petabytes of data.

Goals

  • Native support for log, trace, audit use cases.
  • Aggressively prioritize ingest of recent data over older data.
  • Full-text search capability.
  • First-class Kubernetes support for all components.
  • Autoscaling of ingest and query capacity.
  • Coordination-free ingestion, so failure of a single node does not impact ingestion.
  • Works out of the box with sensible defaults.
  • Designed for zero data loss.
  • First-class Grafana support with accompanying plugin.
  • Built-in multi-tenancy, supporting several small use-cases on a single cluster.
  • Supports the majority of Apache Lucene features.
  • Drop-in replacement for most OpenSearch log use cases.

Non-Goals

  • General-purpose search cases, such as for an ecommerce site.
  • Document mutability: records are expected to be append-only.
  • Additional storage engines other than Lucene.
  • Support for JVM versions other than the current LTS.
  • Supporting multiple Lucene versions.

Quick Start

IntelliJ: Import the project as a Maven project.

IntelliJ run configs are provided for all node types, and execute using the provided config/config.yaml. These configurations are stored in the .run folder and should automatically be detected by IntelliJ upon importing the project.

To start KalDB and its dependencies (ZooKeeper, Kafka, S3), you can use the provided docker compose file:

docker-compose up

Index Data

  1. Data from the "test-topic-in" (preprocessorConfig/kafkaStreamConfig/upstreamTopics in config.yaml) Kafka topic is read as input by the preprocessor.
  2. The input data transformer "json" (preprocessorConfig/dataTransformer in config.yaml) is how the preprocessor will parse the data.
  3. Each document must contain two mandatory fields: "service_name" and "timestamp" (formatted as DateTimeFormatter.ISO_INSTANT).
  4. There must be a dataset entry whose serviceNamePattern matches the service name of the incoming data.
  5. To create a dataset entry, go to the manager node (default http://localhost:8083/docs) and call CreateDatasetMetadata with name/owner as "test" and serviceNamePattern = "_all"
  6. Next, update the partition assignment: on the manager node (default http://localhost:8083/docs), call UpdatePartitionAssignment with name="test", throughputBytes=1000000 (1 MB/s, after which messages will be dropped) and partitionIds=["0"] (partition IDs are strings; this value tells KalDB to read only from partition 0 of test-topic-in).
  7. Now we can start producing data to Kafka topic="test-topic-in", partition=0.
  8. The preprocessor writes data to the Kafka topic "test-topic" (preprocessorConfig/downstreamTopic in config.yaml), applying rate limits and other processing along the way.
  9. The indexer service is configured to read from "test-topic" (indexerConfig/kafkaConfig/kafkaTopic in config.yaml) and creates Lucene indexes locally.
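
As a rough illustration of the document shape described in the steps above, the following sketch builds a JSON string carrying the two mandatory fields, "service_name" and "timestamp", with the timestamp formatted by DateTimeFormatter.ISO_INSTANT. It uses only the JDK. The class name, the helper methods, and the extra "message" field are hypothetical and are not taken from the KalDB code base; treat this as a minimal sketch, not as the project's own producer.

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

// Hypothetical helper class; not part of KalDB.
public class KalDbDocumentSketch {

    // Escapes the characters that would otherwise break a JSON string literal.
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Builds a document with the two mandatory fields plus an assumed "message" field.
    static String buildDocument(String serviceName, Instant timestamp, String message) {
        // ISO_INSTANT renders an Instant in UTC, e.g. 2024-01-01T00:00:00Z.
        String ts = DateTimeFormatter.ISO_INSTANT.format(timestamp);
        return "{\"service_name\":\"" + escapeJson(serviceName) + "\","
                + "\"timestamp\":\"" + ts + "\","
                + "\"message\":\"" + escapeJson(message) + "\"}";
    }

    public static void main(String[] args) {
        // "test" is the service name used here for illustration only.
        System.out.println(buildDocument("test", Instant.now(), "hello from the quick start"));
    }
}
```

The resulting string could then be sent as a record value to partition 0 of "test-topic-in" with whichever Kafka producer you prefer; the producer itself is left out so that the sketch stays dependency-free.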

Query via Grafana

http://localhost:3000/explore

Contributing

If you are interested in reporting/fixing issues and contributing directly to the code base, please see CONTRIBUTING for more information on what we're looking for and how to get started.

Community

Join our Slack community

Presentations

KalDB: A k8s native log search platform

Licensing

Licensed under MIT. Copyright (c) 2021 Slack.

More Repositories

  1. nebula (Go, 13,646 stars): A scalable overlay networking tool with a focus on performance, simplicity and security
  2. SlackTextViewController (Objective-C, 8,332 stars): ⛔️ DEPRECATED ⛔️ A drop-in UIViewController subclass with a growing text input view and other useful messaging features
  3. PanModal (Swift, 3,595 stars): An elegant and highly customizable presentation API for constructing bottom sheet modals on iOS.
  4. go-audit (Go, 1,541 stars): go-audit is an alternative to the auditd daemon that ships with many distros
  5. circuit (Kotlin, 1,250 stars): ⚡️ A Compose-driven architecture for Kotlin and Android applications.
  6. EitherNet (Kotlin, 730 stars): A pluggable sealed API result type for modeling Retrofit responses.
  7. goSDL (PHP, 516 stars): goSDL
  8. slack-api-docs (427 stars): API Docs for Slack.com
  9. slack-gradle-plugin (Kotlin, 418 stars): Gradle and IntelliJ build tooling used in Slack's Android repo
  10. compose-lints (Kotlin, 349 stars): Lint checks to aid with a healthy adoption of Compose
  11. keeper (Kotlin, 248 stars): A Gradle plugin that infers Proguard/R8 keep rules for androidTest sources.
  12. slack-lints (Kotlin, 207 stars): A collection of custom Android/Kotlin lint checks we use in our Android and Kotlin code bases at Slack.
  13. magic-cli (Ruby, 196 stars)
  14. simple-kubernetes-webhook (Go, 170 stars): This project is aimed at illustrating how to build a fully functioning kubernetes admission webhook in the simplest way possible.
  15. csp-html-webpack-plugin (JavaScript, 158 stars): A plugin which, when combined with HTMLWebpackPlugin, adds CSP tags to the HTML output.
  16. hack-sql-fake (Hack, 75 stars): A library for testing database driven code in Hack
  17. hakana (Rust, 70 stars): Another typechecker for Hack, built by Slack
  18. vscode-hack (TypeScript, 70 stars): Hack language & HHVM debugger support for Visual Studio Code
  19. gsuite-oauth-third-party-app-report (JavaScript, 54 stars): Start enforcing G Suite third-party apps via OAuth
  20. backend-interview-prep-questions (PLpgSQL, 45 stars): A few questions & data to help you prepare for the Slack HQ backend interview
  21. moshi-gson-interop (Kotlin, 43 stars): An interop tool for safely mixing Moshi and Gson models in JSON serialization.
  22. kotlin-cli-util (Kotlin, 33 stars): Kotlin CLI utilities, mostly intended for use with Clikt
  23. tree-sitter-hack (JavaScript, 28 stars): Hack grammar for tree-sitter
  24. hack-json-schema (Hack, 27 stars): Generate Hack JSON Schema validators based on a JSON Schema.
  25. deanimator (Go, 24 stars): Go package that can detect animated images and "deanimate" them by rendering just the first frame as a static image.
  26. es-query-simple (Python, 23 stars): A tiny command line utility to query elasticsearch.
  27. auto-value-kotlin (Kotlin, 23 stars): An AutoValue extension that generates binary and source compatible equivalent Kotlin data classes of AutoValue models.
  28. go-rsyslog-pstats (Go, 21 stars): Parses and forwards rsyslog process stats to a local statsite, statsd, or wire protocol compatible service.
  29. tiny-thumb (Go, 14 stars): Novel, efficient, and practical image compression with visually appealing results. 🤏 ✨
  30. backend-interview-prerequisites (Go, 11 stars): A project to ensure that your backend onsite interview at Slack runs smoothly.
  31. sqlite-go-connect (Go, 11 stars): A simple go app that connects to a sqlite3 database
  32. sqlite-python-connect (Python, 10 stars): Short bit of code to connect to a sqlite db and run a query in python
  33. hack-graphql (Hack, 8 stars): Playground for a hack graphql server
  34. protoc-gen-ts (TypeScript, 8 stars): A Typescript Protocol Buffer Implementation from the Future ✨
  35. htmlsanitizer-hack (Hack, 7 stars): A port of the PHP HTML Purifier originally developed by Edward Z. Yang into Hacklang
  36. sqlite-java-connect (Java, 6 stars): This is a minimal repo project that connects to a sqlite3 database and returns a single row.
  37. grpc-hack (C++, 4 stars): A gRPC extension for HHVM
  38. slack-astra-app (TypeScript, 4 stars): Grafana plugin that adds support for Astra
  39. sqlite-ruby-connect (PLpgSQL, 3 stars): Just a tiny lil something to connect to SQLite using Ruby
  40. proto-hack (Hack, 3 stars): hacklang generator for protobuf
  41. snow (Python, 2 stars)
  42. .github (1 star)
  43. go-metrics-prometheus (Go, 1 star)
  44. quota (1 star)