• Stars
    star
    162
  • Rank 232,250 (Top 5 %)
  • Language
    Go
  • License
    MIT License
  • Created about 6 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Optimal Quantile Approximation in Streams

quantiles - Optimal Quantile Approximation in Streams

GoDoc CircleCI

This is a translation of TensorFlow's quantile helper class, it aims to compute approximate quantiles with error bound guarantees for weighted data sets. This implementation is an adaptation of techniques from the following papers:

The key ideas at play are the following:

  • Maintain an in-memory multi-level quantile summary in a way to guarantee a maximum approximation error of eps * W per bucket where W is the total weight across all points in the input dataset.
  • Two base operations are defined: MERGE and COMPRESS. MERGE combines two summaries guaranteeing a epsNew = max(eps1, eps2). COMPRESS compresses a summary to b + 1 elements guaranteeing epsNew = epsOld + 1/b.
  • b * sizeof(summary entry) must ideally be small enough to fit in an average CPU L2 cache.
  • To distribute this algorithm with maintaining error bounds, we need the worker-computed summaries to have no more than eps / h error where h is the height of the distributed computation graph which is 2 for an MR with no combiner.

We mainly want to max out IO bw by ensuring we're not compute-bound and using a reasonable amount of RAM.

Complexity:

  • Compute: O(n * log(1/eps * log(eps * n))).
  • Memory: O(1/eps * log^2(eps * n)) <- for one worker streaming through the entire dataset.

An epsilon value of zero would make the algorithm extremely inefficent and therefore, is disallowed.

Example Usage

package quantiles_test

import (
	"fmt"

	"github.com/axiomhq/quantiles"
)

func Example() {
	sketch := quantiles.NewDefault()
	for i := 0.0; i < 1e6; i++ {
		if err := sketch.Push(i, 1.0); err != nil {
			panic(err)
		}
	}
	fmt.Print("ApproximationError:") 	
	fmt.Println(sketch.ApproximationError(1))  // 0 <nil>

	fmt.Print("Finalize:") 
	fmt.Println(sketch.Finalize())            // <nil>

 
	fmt.Print("GenerateQuantiles(4):")         
	fmt.Println(sketch.GenerateQuantiles(4))  // [0 251865 503730 746595 999999] <nil>


	fmt.Print("GenerateQuantiles(10):")
	fmt.Println(sketch.GenerateQuantiles(10)) // [0 98946 197892 296838 395789 503730 602676 701622 800568 899514 999999] <nil>

	sum, err := sketch.FinalSummary()
	if err != nil {
		panic(err)
	}
	fmt.Print("GenerateQuantiles(4):")
	fmt.Println(sum.GenerateQuantiles(4))     // [0 251865 503730 746595 999999]
}

TODO

  • Implement an online estimator without the need of finalizing the stream
  • Add proper documentation
  • Benchmark
  • Add serialization

More Repositories

1

hyperloglog

HyperLogLog with lots of sugar (Sparse, LogLog-Beta bias correction and TailCut space reduction) brought to you by Axiom
Go
920
star
2

next-axiom

The official Next.js library for Axiom.
TypeScript
353
star
3

hyperminhash

HyperMinHash: Bringing intersections to HyperLogLog
Go
302
star
4

rust-cuckoofilter

Cuckoo Filter: Practically Better Than Bloom (In Rust)
Rust
270
star
5

axiom-js

Official language bindings and library extensions for Axiom
TypeScript
85
star
6

zig-hyperloglog

Zig library for HyperLogLog estimation
Zig
82
star
7

axiom-node

[DEPRECATED] Use axiomhq/axiom-js instead.
TypeScript
78
star
8

axiom-go

Official Go bindings for the Axiom API
Go
47
star
9

cli

The power of Axiom on the command line.
Go
41
star
10

flipcounter

A counter data structure that knows when to start estimating to save space
Go
36
star
11

axiom-rs

Official Rust bindings for the Axiom API
Rust
31
star
12

axiom-py

The official Python bindings for the Axiom API
Python
23
star
13

axiom-cloudflare-workers

Send logs from Cloudflare Workers to Axiom
JavaScript
22
star
14

tracing-axiom

The official Rust tracing layer for Axiom
Rust
18
star
15

puppeteer-request-intercepter

Intercept API Requests and return Mocked Data
TypeScript
16
star
16

prisma-axiom

Axiom observability for Prisma
TypeScript
16
star
17

variance

Go implementation of variance's method for one-pass variance computation with D. H. D. West improved methods which features merging of several multiple sets of statistics and adding weighted values.
Go
16
star
18

hypertwobits

HyperTwoBits implementation
Rust
12
star
19

awesome-axiom

A curated list of awesome Axiom Platform, libraries, open source repos, guides, blogs, Documentation and other resources.
12
star
20

axiom-demo

Take a look at Axiom on your local machine.
Shell
11
star
21

ngbuild

Dream builders ๐Ÿ˜ด๐Ÿ’ญ
Go
9
star
22

axiom-lambda-extension

Ingest logs and platform events from your lambda functions
Go
9
star
23

axiom-syslog-proxy

A syslog push interface to Axiom.
Go
8
star
24

axiom-honeycomb-proxy

A log forwarder/multiplexer for Axiom and Honeycomb.
Go
7
star
25

deno-client

Minimal deno library for sending events and logs to Axiom
TypeScript
6
star
26

axiom-ai

The official package to send events from AI libraries to Axiom
TypeScript
6
star
27

axiom-grafana

The official Axiom datasource plugin for Grafana.
TypeScript
5
star
28

splitblockbloom

Go
5
star
29

pkg

Commonly used Go packages for Axiom projects.
Go
4
star
30

axiom-loki-multiplexer

A push interface to Axiom via Loki endpoint.
Go
4
star
31

axiom-elements

TypeScript
4
star
32

topkapi

Topkapi: Parallel and Fast Sketches for Finding Top-K Frequent Elements
Go
3
star
33

axiom-cloudwatch-forwarder

Forward CloudWatch Logs to Axiom.
Python
3
star
34

randhn

Random new or top 500 HN story
TypeScript
3
star
35

setup-axiom

Set up a local Axiom stack for testing your integration.
JavaScript
2
star
36

annotation-action

This action allows you to create an annotation in Axiom.
TypeScript
2
star
37

terraform-provider-axiom

Axiom Terraform Provider
Go
2
star
38

homebrew-tap

Collection of Homebrew formulas for Axiom, Inc. open source projects.
Ruby
1
star
39

golang-sync-bench

Benchmarking Synchronization Primitives in Go
Go
1
star
40

axiom-nomad

Axiom ๐Ÿค HashiCorp
HCL
1
star
41

axiom-helm-charts

Axiom Helm Charts
Smarty
1
star
42

logmanager

Yet another Go logging library.
Go
1
star
43

terraform-aws-axiom

Setup required infrastructure and install Axiom Enterprise on AWS using Terraform.
1
star
44

monaco-kusto

JavaScript
1
star
45

terraform-aws-axiom-cloudwatch-forwarder

Official Terraform modules for Axiom Cloudwatch Forwarder
HCL
1
star