• This repository has been archived on 18/Feb/2021
  • Stars
    star
    640
  • Rank 70,324 (Top 2 %)
  • Language
    Java
  • License
    Apache License 2.0
  • Created about 8 years ago
  • Updated almost 4 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A Kafka audit system

Chaperone

(This project is deprecated and not maintained.)

As Kafka audit system, Chaperone monitors the completeness and latency of data stream. The audit metrics are persisted in database for Kafka users to quantify the loss of their topics if any.

Basically, Chaperone cuts timeline into 10min buckets and assigns message to corresponding bucket according to its event time. The stats of the bucket are updated accordingly, like the total message count. Periodically, the stats are sent out to a dedicated Kafka topic, say 'chaperone-audit'. ChaperoneCollector consumes those stats from this topic and persists them into database.

============ Chaperone is made of several components:

  1. ChaperoneClient is a library that can be put in like Kafka Producer or Consumer to audit messages as they flow through. The audit stats are sent to a dedicated Kafka topic, say 'chaperone-audit'.
  2. ChaperoneCollector consumes audit stats from 'chaperone-audit' and persists them into database.
  3. ChaperoneService audits messages kept in Kafka. Since it's built upon uReplicator, it consists of two subsystems: ChaperoneServiceController to auto-detect topics in Kafka and assign the topic-partitions to workers to audit; ChaperoneServiceWorker to audit messages from assigned topic-partitions. In particular, ChaperoneService and ChaperoneCollector together ensure each message is audited exactly once.

Chaperone Quick Start

Get the Code

Check out the Chaperone project:

git clone [email protected]:uber/chaperone.git
cd chaperone

This project contains everything you’ll need to run Chaperone.

Build Chaperone

Before you can run Chaperone, you need to build a package for it.

mvn clean package

Or command below to skip tests

mvn clean package -DskipTests

Set Up Local Test Environment

To test Chaperone locally, you need two systems: Kafka, and ZooKeeper. The script β€œgrid” is to help you set up these systems.

  • The command below will download, install, and start ZooKeeper and Kafka (named cluster1)
bin/grid bootstrap

Start ChaperoneService

  • Start ChaperoneService Controller
./ChaperoneDistribution/target/ChaperoneDistribution-pkg/bin/start-chaperone-controller.sh
  • Start ChaperoneService Worker
./ChaperoneDistribution/target/ChaperoneDistribution-pkg/bin/start-chaperone-worker.sh

Generate Load

  • Create a dummyTopic in Kafka and produce some dummy data:
./bin/produce-data-to-kafka-topic-dummyTopic.sh
  • Check if the data is successfully produced to Kafka by console-consumer as below:
./deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181/cluster1 --topic dummyTopic

You should get this data:

Kafka topic dummy topic data 1
Kafka topic dummy topic data 2
Kafka topic dummy topic data 3
Kafka topic dummy topic data 4
…

Check Audit Stats

In this example, the topic dummyTopic will be auto-detected and assigned to worker to audit. Periodically, the audit stats are sent to a topic called 'chaperone-audit'.

./deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181/cluster1 --topic chaperone-audit 

One can also manually add topic to audit by command below:

curl -X POST -d '{"topic":"dummyTopic", "numPartitions":"1"}' http://localhost:9000/topics

Start ChaperoneCollector

To start ChaperoneCollector, MySQL is required and Redis is optional. MySQL is used to persist audit stats and Redis is used to deduplicate. Deduplication can be turned off. The configuration file for ChaperoneCollector is ./config/chaperonecollector.properties, which might be updated to connect to MySQL and Redis.

  • Start ChaperoneCollector
./ChaperoneDistribution/target/ChaperoneDistribution-pkg/bin/start-chaperone-collector.sh

More Repositories

1

go-torch

Stochastic flame graph profiler for Go programs
Go
3,958
star
2

pyflame

πŸ”₯ Pyflame: A Ptracing Profiler For Python. This project is deprecated and not maintained.
C++
2,974
star
3

image-diff

Create image differential between two images
JavaScript
2,453
star
4

makisu

Fast and flexible Docker image building tool, works in unprivileged containerized environments like Mesos and Kubernetes.
Go
2,409
star
5

cpustat

high frequency performance measurements for Linux. This project is deprecated and not maintained.
Go
1,659
star
6

cherami-server

Distributed, scalable, durable, and highly available message queue system. This project is deprecated and not maintained.
Go
1,416
star
7

AthenaX

SQL-based streaming analytics platform at scale
Java
1,224
star
8

plato-research-dialogue-system

This is the Plato Research Dialogue System, a flexible platform for developing conversational AI agents.
Python
977
star
9

npm-shrinkwrap

A consistent shrinkwrap tool
JavaScript
775
star
10

coding-challenge-tools

Uber's tools team coding challenge
562
star
11

hyperbahn

Service discovery and routing for large scale microservice operations
JavaScript
394
star
12

sql-differential-privacy

Dataflow analysis & differential privacy for SQL queries. This project is deprecated and not maintained.
Scala
391
star
13

phabricator-jenkins-plugin

Jenkins plugin to integrate with Phabricator, Harbormaster, and Uberalls
Java
367
star
14

ohana-ios

Contacts simplified. This project is deprecated and not maintained.
Objective-C
362
star
15

rave

A data model validation framework that uses java annotation processing.
Java
355
star
16

jetstream-ios

An elegant model framework written in Swift
Swift
333
star
17

node-stap

Tools for analyzing Node.js programs with SystemTap. This project is deprecated and not maintained.
JavaScript
291
star
18

r-dom

React DOM wrapper
JavaScript
263
star
19

focuson

A tool to surface security issues in python code
Python
226
star
20

cherami-client-go

Go Client Implementation of Cherami - A distributed, scalable, durable, and highly available message queue system. This project is deprecated and not maintained.
Go
207
star
21

viewport-mercator-project

NOTE: The viewport-mercator-project repo is archived and code has moved to
JavaScript
137
star
22

infer-plugin

Gradle plugin that allows easy integration with the infer static analyzer.
Groovy
126
star
23

express-statsd

Statsd route monitoring middleware for connect/express
JavaScript
126
star
24

android-build-environment

Docker repository for android build environment
122
star
25

in-n-out

A library to perform point-in-geofence searches.
JavaScript
106
star
26

buck-http-cache

An Implementation of Buck's HTTP Cache API as a distributed cache service. This project is deprecated and not maintained.
Shell
101
star
27

statsrelay

A consistent-hashing relay for statsd and carbon metrics
C
101
star
28

hacheck

HAproxy healthcheck proxying service
Python
86
star
29

potter

a CLI to create node.js services
JavaScript
83
star
30

opentracing-go

A general-purpose instrumentation API for distributed tracing systems
Go
82
star
31

idl

A CLI for managing Thrift IDL files
JavaScript
78
star
32

jetstream

Jetstream Sync server framework
JavaScript
73
star
33

canduit

Node.js Phabricator Conduit API client. This project is deprecated and not maintained.
JavaScript
65
star
34

kafka-spraynozzle

A nozzle to spray a kafka topic at an HTTP endpoint. This project is deprecated and not maintained.
Java
49
star
35

usb2fac

Enabling 2fac confirmation for newly connected USB devices
Python
44
star
36

nanny

Cluster management for Node processes
JavaScript
40
star
37

auto-value-bundle

Extends Autovalue to extract data from a bundle into a value object.
Java
36
star
38

node-flame

Tools for analyzing Node.js programs with ptrace. This project is deprecated and not maintained.
JavaScript
29
star
39

Bug-Bounty-Page

A repo to make our changes more transparent to bug bounty researchers in our program (so they can see commits, etc).
29
star
40

paranoid-request

An SSRF-preventing wrapper around Node's request module
JavaScript
26
star
41

lint-trap

JavaScript linter module for Uber projects
JavaScript
26
star
42

thriftify

JavaScript implementation of Thrift encoding and decoding
JavaScript
25
star
43

HackerOneAlchemy

A tool to generate statistics and help manage bug bounty reports in HackerOne.
Python
23
star
44

express-translate

Add simple translation support to Express
JavaScript
21
star
45

cherami-thrift

Thrift APIs for Cherami - A distributed, scalable, durable, and highly available message queue system. This project is deprecated and not maintained.
Go
20
star
46

h1-python

A HackerOne API client for Python
Python
19
star
47

cidrtrie

Trie implementation of a CIDR lookup table
Python
19
star
48

ios-template

This template provides a starting point for open source iOS projects at Uber.
Ruby
18
star
49

tcheck

TChannel health check utility
Go
17
star
50

job_progress

Store the progress of a job
Python
16
star
51

java-code-styles

IntelliJ IDEA code style settings for Uber's Java and Android projects.
15
star
52

fixed-server

Server for HTTP fixtures
JavaScript
14
star
53

vis-academy

A set of tutorials on how our frameworks make effective data visualization applications.
JavaScript
13
star
54

shared-docs

Shared Markdown Documents from Uber Engineering
12
star
55

typed-request-stack

Middleware stack runner for typed HTTP requests
JavaScript
11
star
56

cherami-client-python

Python Client for Cherami - A distributed, scalable, durable, and highly available message queue system. This project is deprecated and not maintained.
Python
11
star
57

failpointsjs

JavaScript
10
star
58

instafork

JavaScript
8
star
59

py-find-unicode

Find incorrect unicode() invocations
Python
8
star
60

shallow-settings

Shallow inheritance-based settings for your application
JavaScript
7
star
61

clusto-query

Silly CLI for querying clusto more quickly
Python
7
star
62

gg

Go dependency debugger
Go
7
star
63

connect-csrf-lite

CSRF validation middleware for Connect/Express
JavaScript
7
star
64

javax-extras

(DEPRECATED) Extra utilities for javax
Java
6
star
65

fixtures-fs

Create a temporary fs with JSON fixtures
JavaScript
6
star
66

redis-delete-pattern

Delete a set of keys from a pattern in Redis
6
star
67

opentracing-python

NOTE: This repository has been retired. The latest OpenTracing APIs can be found in the official repository.
Python
5
star
68

tchannel-gen

Scaffolding for new TChannel w/ Hyperbahn applications
JavaScript
5
star
69

node-dot-arcanist

Uber's .arcanist folder as an npm module
PHP
5
star
70

cherami-client-java

Java Client for Cherami. This project is deprecated and not maintained.
Java
5
star
71

pyrehol

Python wrapper for Firehol
Python
4
star
72

dubstep

This repo is DEPRECATED. See https://github.com/dubstepjs/core
JavaScript
4
star
73

ottr

Easy, robust end-to-end UI tests for web apps
JavaScript
3
star
74

clouseau

A Node.js performance profiler by Uber
JavaScript
3
star
75

vertica-aesgcm-udx

C++
2
star
76

stacked

Go
2
star
77

request-redis-cache

Make requests and cache them in Redis
JavaScript
2
star
78

nodesol-write

Kafka producer.
JavaScript
2
star
79

request-mocha

Request utilities for Mocha
JavaScript
2
star
80

UberBuilder

Make building flexible, immutable objects a simple task
Objective-C
2
star
81

uLeak

DEPRECATED: This is continued in https://github.com/behroozkhorashadi/uLeak
Java
2
star
82

fusion-orchestrate

Tools and scripts for working across multiple fusion repos at once
JavaScript
2
star
83

deck.gl-data-osm

OSM data for the data visualization library deck.gl examples (https://uber.github.io/deck.gl/#/)
1
star
84

uberclass-clouseau

A subclass of uberclass that adds profiling support
JavaScript
1
star
85

backbone-api-client

Backbone mixin built for interacting with API clients
JavaScript
1
star
86

fusion-release

Releases and verifies FusionJS packages
JavaScript
1
star
87

cache-redis

An ES6 Map-like cache with redis backing
JavaScript
1
star
88

redis-broadcast

Write redis commands to a set of redises efficiently
JavaScript
1
star