  • Stars: 255
  • Rank: 153,943 (Top 4 %)
  • Language: Go
  • License: Apache License 2.0
  • Created over 6 years ago
  • Updated about 1 year ago

Repository Details

PromHouse is a long-term remote storage with built-in clustering and downsampling for Prometheus 2.x on top of ClickHouse.

PromHouse

PromHouse is a long-term remote storage with built-in clustering and downsampling for Prometheus 2.x on top of ClickHouse. Or, rather, it will be someday. Feel free to like, share, retweet, star and watch it, but do not use it in production yet.

Database Schema

CREATE TABLE time_series (
    date Date CODEC(Delta),
    fingerprint UInt64,
    labels String
)
ENGINE = ReplacingMergeTree
    PARTITION BY date
    ORDER BY fingerprint;

CREATE TABLE samples (
    fingerprint UInt64,
    timestamp_ms Int64 CODEC(Delta),
    value Float64 CODEC(Delta)
)
ENGINE = MergeTree
    PARTITION BY toDate(timestamp_ms / 1000)
    ORDER BY (fingerprint, timestamp_ms);

SELECT * FROM time_series WHERE fingerprint = 7975981685167825999;
┌───────date─┬─────────fingerprint─┬─labels─────────────────────────────────────────────────────────────────────────────────┐
│ 2017-12-31 │ 7975981685167825999 │ {"__name__":"up","instance":"promhouse_clickhouse_exporter_1:9116","job":"clickhouse"} │
└────────────┴─────────────────────┴────────────────────────────────────────────────────────────────────────────────────────┘

SELECT * FROM samples WHERE fingerprint = 7975981685167825999 LIMIT 3;
┌─────────fingerprint─┬──timestamp_ms─┬─value─┐
│ 7975981685167825999 │ 1514730532900 │     0 │
│ 7975981685167825999 │ 1514730533901 │     1 │
│ 7975981685167825999 │ 1514730534901 │     1 │
└─────────────────────┴───────────────┴───────┘
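
To make the shape of the two tables more concrete, here is a small Python sketch that decodes the sample rows shown above. It is an illustration only, not part of PromHouse: the helper names `parse_labels` and `partition_date` are made up for this example, and the date conversion assumes the ClickHouse server evaluates `toDate(timestamp_ms / 1000)` in UTC.

```python
import json
from datetime import date, datetime, timezone
from typing import Dict


def parse_labels(labels_json: str) -> Dict[str, str]:
    """Decode the JSON object stored in the time_series.labels column."""
    return json.loads(labels_json)


def partition_date(timestamp_ms: int) -> date:
    """Mimic toDate(timestamp_ms / 1000), assuming a UTC server timezone."""
    seconds = timestamp_ms // 1000  # drop the millisecond part
    return datetime.fromtimestamp(seconds, tz=timezone.utc).date()


if __name__ == "__main__":
    row_labels = (
        '{"__name__":"up",'
        '"instance":"promhouse_clickhouse_exporter_1:9116",'
        '"job":"clickhouse"}'
    )
    labels = parse_labels(row_labels)
    print(labels["__name__"], labels["job"])
    print(partition_date(1514730532900))
```

Under the UTC assumption, the first sample falls into the same day as the `date` value of its time series row.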

Time series in Prometheus are identified by label name/value pairs, including the __name__ label, which stores the metric name. The hash of those pairs is called a fingerprint. PromHouse uses the same hash algorithm as Prometheus to simplify data migration. During operation, all fingerprints and label name/value pairs are kept in memory for fast access. New time series are written to ClickHouse for persistence. They are also periodically read back from it to discover new time series written by other ClickHouse instances. ReplacingMergeTree ensures there are no duplicates if several ClickHouses wrote the same time series at the same time.
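
As a rough illustration of how such a fingerprint can be computed, the Python sketch below hashes sorted label pairs with 64-bit FNV-1a and a 0xFF separator byte. This is an assumption based on how the Prometheus `common/model` package fingerprints label sets; it has not been checked against the PromHouse source, and the function names `fnv1a_64` and `fingerprint` are hypothetical.

```python
from typing import Mapping

# Standard 64-bit FNV-1a parameters.
FNV_OFFSET_64 = 14695981039346656037
FNV_PRIME_64 = 1099511628211
MASK_64 = 0xFFFFFFFFFFFFFFFF

# Assumed separator between names and values (0xFF never occurs in valid UTF-8).
SEPARATOR = b"\xff"


def fnv1a_64(data: bytes, h: int = FNV_OFFSET_64) -> int:
    """Feed data into a 64-bit FNV-1a hash, starting from state h."""
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64
    return h


def fingerprint(labels: Mapping[str, str]) -> int:
    """Hash label pairs in sorted name order, so insertion order is irrelevant."""
    h = FNV_OFFSET_64
    for name in sorted(labels, key=lambda n: n.encode("utf-8")):
        h = fnv1a_64(name.encode("utf-8"), h)
        h = fnv1a_64(SEPARATOR, h)
        h = fnv1a_64(labels[name].encode("utf-8"), h)
        h = fnv1a_64(SEPARATOR, h)
    return h


if __name__ == "__main__":
    print(fingerprint({"__name__": "up", "job": "clickhouse"}))
```

Sorting the label names before hashing is what makes the result independent of the order in which labels arrive, and the result always fits the UInt64 fingerprint column.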

PromHouse currently stores 24 bytes per sample: 8 bytes for the UInt64 fingerprint, 8 bytes for the Int64 timestamp, and 8 bytes for the Float64 value. The actual compression ratio is about 4.5:1, which gives about 24/4.5 = 5.3 bytes per sample on disk. Prometheus local storage compresses 16 bytes (timestamp and value) per sample to 1.37 bytes, a ratio of about 12:1.
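
The arithmetic above can be reproduced with a few lines of Python. The figures are the ones quoted in this README; the function and constant names are only illustrative.

```python
# Raw sizes per sample, in bytes.
PROMHOUSE_RAW_BYTES = 8 + 8 + 8   # fingerprint + timestamp + value
PROMETHEUS_RAW_BYTES = 8 + 8      # timestamp + value

# Figures quoted in the text.
PROMHOUSE_RATIO = 4.5                # observed compression ratio
PROMETHEUS_COMPRESSED_BYTES = 1.37   # bytes per sample after compression


def compressed_bytes(raw_bytes: float, ratio: float) -> float:
    """Bytes per sample on disk for a given compression ratio."""
    return raw_bytes / ratio


def compression_ratio(raw_bytes: float, stored_bytes: float) -> float:
    """Compression ratio for a given on-disk size per sample."""
    return raw_bytes / stored_bytes


if __name__ == "__main__":
    promhouse = compressed_bytes(PROMHOUSE_RAW_BYTES, PROMHOUSE_RATIO)
    prometheus = compression_ratio(PROMETHEUS_RAW_BYTES, PROMETHEUS_COMPRESSED_BYTES)
    print(f"PromHouse: {promhouse:.1f} bytes per sample")
    print(f"Prometheus: {prometheus:.1f}:1")
    print(f"Size per sample, PromHouse vs Prometheus: {promhouse / PROMETHEUS_COMPRESSED_BYTES:.1f}x")
```

With these numbers, a sample in PromHouse takes a bit under four times the space it takes in Prometheus local storage.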

Since ClickHouse v19.3.3, delta and double-delta encoding can be used for compression, which should make storage almost as efficient as the TSDB's.
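
To show why delta and double-delta encoding help with regularly scraped samples, here is a conceptual Python sketch applied to the three timestamps from the example above. It only demonstrates the idea of storing differences; it does not reproduce the bit-level format of the ClickHouse codecs, and all function names are made up for this example.

```python
from itertools import accumulate
from typing import List


def delta_encode(values: List[int]) -> List[int]:
    """Keep the first value, then store each difference from the previous value."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(encoded: List[int]) -> List[int]:
    """Invert delta_encode with a running sum."""
    return list(accumulate(encoded))


def double_delta_encode(values: List[int]) -> List[int]:
    """Keep the first value and first delta, then store differences of deltas."""
    deltas = delta_encode(values)
    if len(deltas) < 2:
        return deltas
    return [deltas[0]] + delta_encode(deltas[1:])


def double_delta_decode(encoded: List[int]) -> List[int]:
    """Invert double_delta_encode: rebuild the deltas, then the values."""
    if len(encoded) < 2:
        return list(encoded)
    deltas = [encoded[0]] + list(accumulate(encoded[1:]))
    return list(accumulate(deltas))


if __name__ == "__main__":
    timestamps = [1514730532900, 1514730533901, 1514730534901]
    print(delta_encode(timestamps))
    print(double_delta_encode(timestamps))
```

For a steady scrape interval, the deltas are nearly constant and the deltas of deltas are close to zero, so a general-purpose compressor applied afterwards has far less entropy to deal with.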

Outstanding features in the roadmap

  • Downsampling (possible since ClickHouse v18.12.14)
  • Query hints (possible since Prometheus PR 4122; see help-wanted issue #24)

SQL queries

The largest jobs and instances by time series count:

SELECT
    job,
    instance,
    COUNT(*) AS value
FROM time_series
GROUP BY
    visitParamExtractString(labels, 'job') AS job,
    visitParamExtractString(labels, 'instance') AS instance
ORDER BY value DESC LIMIT 10

The largest metrics by time series count (cardinality):

SELECT
    name,
    COUNT(*) AS value
FROM time_series
GROUP BY
    visitParamExtractString(labels, '__name__') AS name
ORDER BY value DESC LIMIT 10

The largest time series by samples count:

SELECT
    labels,
    value
FROM time_series
ANY INNER JOIN
(
    SELECT
        fingerprint,
        COUNT(*) AS value
    FROM samples
    GROUP BY fingerprint
    ORDER BY value DESC
    LIMIT 10
) USING (fingerprint)
