• Stars
    star
    1,442
  • Rank 32,643 (Top 0.7 %)
  • Language
    C++
  • License
    Apache License 2.0
  • Created almost 9 years ago
  • Updated over 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Real-time Monitoring and Analysis of Data Streams

Confluo

Build Status License

Confluo is a system for real-time monitoring and analysis of data, that supports:

  • high-throughput concurrent writes of millions of data points from multiple data streams;
  • online queries at millisecond timescale; and
  • ad-hoc queries using minimal CPU resources.

Please find detailed documentation here.

Installation

Required dependencies:

  • MacOS X or Unix-based OS; Windows is not yet supported.
  • C++ compiler that supports C++11 standard (e.g., GCC 5.3 or later)
  • CMake 3.2 or later
  • Boost 1.58 or later

For python client, you will additionally require:

  • Python 2.7 or later
  • Python Packages: setuptools, six 1.7.2 or later

For java client, you will additionally require:

  • Java JDK 1.7 or later
  • ant 1.6.2 or later

Source Build

To download and install Confluo, use the following commands:

git clone https://github.com/ucbrise/confluo.git
cd confluo
mkdir build
cd build
cmake ..
make -j && make test && make install

Using Confluo

While Confluo supports multiple execution modes, the simplest way to get started is to start Confluo as a server daemon and query it using one of its client APIs.

To start the server daemon, run:

confluod --address=127.0.0.1 --port=9090

Here's some sample usage of the Python API:

import sys
from confluo.rpc.client import RpcClient
from confluo.rpc.storage import StorageMode

# Connect to the server
client = RpcClient("127.0.0.1", 9090)

# Create an Atomic MultiLog with given schema for a performance log
schema = """{
  timestamp: ULONG,
  op_latency_ms: DOUBLE,
  cpu_util: DOUBLE,
  mem_avail: DOUBLE,
  log_msg: STRING(100)
}"""
storage_mode = StorageMode.IN_MEMORY
client.create_atomic_multilog("perf_log", schema, storage_mode)

# Add an index
client.add_index("op_latency_ms")

# Add a filter
client.add_filter("low_resources", "cpu_util>0.8 || mem_avail<0.1")

# Add an aggregate
client.add_aggregate("max_latency_ms", "low_resources", "MAX(op_latency_ms)")

# Install a trigger
client.install_trigger("high_latency_trigger", "max_latency_ms > 1000")

# Load some data
off1 = client.append([100.0, 0.5, 0.9,  "INFO: Launched 1 tasks"])
off2 = client.append([500.0, 0.9, 0.05, "WARN: Server {2} down"])
off3 = client.append([1001.0, 0.9, 0.03, "WARN: Server {2, 4, 5} down"])

# Read the written data
record1 = client.read(off1)
record2 = client.read(off2)
record3 = client.read(off3)

# Query using indexes
record_stream = client.execute_filter("cpu_util>0.5 || mem_avail<0.5")
for r in record_stream:
  print r

# Query using filters
record_stream = client.query_filter("low_resources", 0, sys.maxsize)
for r in record_stream:
  print r

# Query an aggregate
print client.get_aggregate("max_latency_ms", 0, sys.maxsize)

# Query alerts generated by a trigger
alert_stream = client.get_alerts(0, sys.maxsize, "high_latency_trigger")
for a in alert_stream:
  print a

Contributing

Please create a GitHub issue to file a bug or request a feature. We welcome pull-requests, but request that you review the pull-request process before submitting one.

Please subscribe to the mailing list ([email protected]) for project announcements, and discussion regarding use-cases and development.

More Repositories

1

clipper

A low-latency prediction-serving system
C++
1,399
star
2

anna

Java
448
star
3

cs294-ai-sys-sp19

CS294; AI For Systems and Systems For AI
HTML
220
star
4

actnn

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training
Python
201
star
5

flor

🌻 Flow with FlorDB
Jupyter Notebook
151
star
6

graphtrans

Representing Long-Range Context for Graph Neural Networks with Global Attention
Python
122
star
7

cirrus

Serverless ML Framework
C++
107
star
8

piranha

Piranha: A GPU Platform for Secure Computation
C++
88
star
9

cpp_project_template

Minimal C++ Project Template
C++
38
star
10

risecamp

RISECamp Tutorials
Jupyter Notebook
38
star
11

jedi-pairing

Go, C++, and C implementation of a bilinear group and pairing-based cryptography for both embedded and non-embedded systems
C++
34
star
12

dory

Go
32
star
13

cs294-rise-fa16

CS294 RISE Course Material
32
star
14

cs294-ai-sys-fa19

CS294-162; Machine Learning Systems Seminar
HTML
31
star
15

hypersched

Deadline-based hyperparameter tuning on RayTune.
Python
31
star
16

LatticeFlow

C++
28
star
17

mage

MAGE: Memory-Aware Garbling Engine
C++
25
star
18

aws-audit

aws consolidated billing audit/reporting tool
Python
24
star
19

waldo

C++
23
star
20

snoopy

A high-throughput oblivious storage system
C
23
star
21

MerkleSquare

A Go library for MerkleSquare: A Low-Latency Transparency Log System
Go
20
star
22

clipper-tutorials

Jupyter Notebook
17
star
23

fluent-old

Bloom + C++
JavaScript
17
star
24

cs294-ai-sys-sp22

CS294 AI Systems Class Website
SCSS
15
star
25

jarvis

Build, configure, and track workflows with Jarvis.
Python
13
star
26

caravel

Studying GPU Multi-tenancy
Jupyter Notebook
12
star
27

jedi-protocol-go

Golang implementation of JEDI: Many-to-Many End-to-End Encryption and Key Delegation for IoT
Go
12
star
28

costco

Python
10
star
29

cs262a-fall2020

CS 262a Fall 2020
HTML
6
star
30

tcplp

Performant TCP for Low-Power Wireless Networks
5
star
31

mage-scripts

Benchmarking scripts for MAGE
Python
5
star
32

clipper-website

Hugo sources for clipper.ai website
CSS
4
star
33

kaggle-nlp-disasters

https://www.kaggle.com/c/nlp-getting-started
Jupyter Notebook
3
star
34

clipper-serving-testbed

Python
2
star
35

fluent_scala

Scala
2
star
36

netplay

C++
1
star
37

buddy

An Python/SQL DSL for CALM cloud programming
Python
1
star
38

flor-camp2018

Jupyter Notebook
1
star
39

maceta

Delta Pruning: pruning spurious/deleterious changes that arise organically from iterative model development
1
star