
influxdb-comparisons

This repo contains code for benchmarking InfluxDB against other databases and time series solutions. You can access the detailed technical writeups for each here.

Current databases supported:

  • InfluxDB
  • Elasticsearch
  • Cassandra
  • MongoDB
  • OpenTSDB

Testing Methodology

In an attempt to make our performance comparison both realistic and relatable, we decided to build our benchmark suite according to real-world use cases. Micro-benchmarks are useful for database engineers, but using realistic data helps us better understand how our software performs under practical workloads.

Currently, the benchmarking tools focus on the DevOps use case. We create data and queries that mimic what a system administrator would see when operating a fleet of hundreds or thousands of virtual machines. We create and query values like CPU load; RAM usage; number of active, sleeping, or stalled processes; and disk used. Future benchmarks will expand to include the IoT and application monitoring use cases.

We benchmark bulk load performance and synchronous query execution performance. The benchmark suite is written in Go, and attempts to be as fair to each database as possible by removing test-related computational overhead (by pre-generating our datasets and queries, and using database-specific drivers where possible).

Although the data is randomly generated, our data and queries are entirely deterministic. By supplying the same PRNG (pseudo-random number generator) seed to the test generation code, each database is loaded with identical data and queried using identical queries.

(Note: The use of more than one worker thread does lead to a non-deterministic ordering of events when writing and/or querying the databases.)
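The determinism guarantee can be sketched in a few lines of Go. `sameSequence` is an illustrative helper, not part of the benchmark suite; it simply shows why seeding the PRNG identically yields identical datasets:

```go
package main

import (
	"fmt"
	"math/rand"
)

// sameSequence reports whether two generators seeded with the same
// value produce identical sequences. This property is what lets the
// test generation code load every database with identical data and
// identical queries.
func sameSequence(seed int64, n int) bool {
	a := rand.New(rand.NewSource(seed))
	b := rand.New(rand.NewSource(seed))
	for i := 0; i < n; i++ {
		if a.Float64() != b.Float64() {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(sameSequence(42, 1000)) // true: same seed, same data
}
```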

There are five phases when using the benchmark suite: data generation, data loading, query generation, query execution, and query validation.

Phase 1: Data generation

Each benchmark begins with data generation.

The DevOps data generator creates time series points that correspond to server telemetry, similar to what a server fleet would send at regular intervals to a metrics collection service (like Telegraf or collectd). Our DevOps data generator runs a simulation for a pre-specified number of hosts, and emits serialized points to stdout. For each simulated machine, nine different measurements are written in 10-second intervals.

The intended usage of the DevOps data generator is to create distinct datasets that simulate larger and larger server fleets over increasing amounts of time. As the host count or the time interval go up, the point count increases. This approach lets us examine how the databases scale on a real-world workload in the dimensions our DevOps users care about.

Each simulated host is initialized with a RAM size and a set of stateful probability distributions (Gaussian random walks with clamping), corresponding to nine statistics as reported by Telegraf. Here are the Telegraf collectors for CPU and memory:

https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/cpu.go
https://github.com/influxdata/telegraf/blob/master/plugins/inputs/system/memory.go

For example, here is a graph of the simulated CPU usage through time for 10 hosts, when using the data generator:

(TODO screenshot of graph from Chronograf)

And, here is a graph of the simulated memory from the same simulation:

(TODO screenshot of graph from Chronograf)

Note that the generator shares its simulation logic between databases. This is not just for code quality; we did this to ensure that the generated data is, within floating point tolerances, exactly the same for each database.

A DevOps dataset is fully specified by the following parameters:

  • Number of hosts to simulate (default 1)
  • Start time (default January 1st 2016 at midnight, inclusive)
  • End time (default January 2nd 2016 at midnight, exclusive)
  • PRNG seed (default uses the current time)

The ‘scaling variable’ for the DevOps generator is the number of hosts to simulate. By default, the data is generated over a simulated period of one day. Each simulated host produces nine measurements per 10-second epoch, one each of:

  • cpu
  • diskio
  • disk
  • kernel
  • mem
  • net
  • nginx
  • postgresl
  • redis

Each measurement holds different values that are being stored. In total, all nine measurements store 100 field values.

The following equations describe how many points are generated for a 24 hour period:

seconds_in_day = (24 hours in a day) * (60 minutes in an hour) * (60 seconds in a minute) = 86,400 seconds
epochs = seconds_in_day / 10 = 8,640
point_count = epochs * host_count * 9

So, for one host we get 8,640 * 1 * 9 = 77,760 points, and for 1,000 hosts we get 8,640 * 1000 * 9 = 77,760,000 points.
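The point-count equations above translate directly into a few lines of Go (`pointCount` is our own illustrative helper, not a function in the suite):

```go
package main

import "fmt"

// pointCount reproduces the equations above for a 24-hour period:
// 86,400 seconds per day, 10-second epochs, and nine measurements
// per host per epoch.
func pointCount(hosts int) int {
	secondsInDay := 24 * 60 * 60 // 86,400
	epochs := secondsInDay / 10  // 8,640
	return epochs * hosts * 9
}

func main() {
	fmt.Println(pointCount(1))    // 77760
	fmt.Println(pointCount(1000)) // 77760000
}
```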

For these benchmarks, we generated a dataset we call DevOps-100: 100 simulated hosts over various time periods (1-4 days).

Generated data is written in a database-specific format that directly equates to the bulk write protocol of each database. This helps make the following benchmark, bulk loading, as straightforward as possible.

For InfluxDB, the bulk load protocol is described at: https://docs.influxdata.com/influxdb/v0.12/guides/writing_data/#writing-multiple-points

For Elasticsearch, the bulk load protocol is described at: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html

For Cassandra, we use the native protocol version 4, described at: https://github.com/apache/cassandra/blob/trunk/doc/native_protocol_v4.spec

For MongoDB, we use standard BSON with the mgo client: http://labix.org/mgo

For OpenTSDB, we use the standard HTTP query interface (not the batch input tool) described at: http://opentsdb.net/docs/build/html/api_http/put.html

Phase 2: Data loading

After data generation comes data loading.

The data loading programs stream data from stdin; typically, this is from a file created by the data generator. As data is read, the loader performs a minimum of deserialization and queues up writes into a batch. As batches become ready, the points are loaded into the destination database as fast as possible.

(Each database currently has its own bulk loader program. In the future, we want to merge the programs together to minimize the amount of special-case code.)

Configuration

Each bulk loader takes a handful of parameters that affect performance:

Elasticsearch:

  • Number of workers to use to make bulk load writes in parallel
  • Which index template to use (more on this later)
  • Whether to force an index refresh after each write
  • How many items to include in each write batch

InfluxDB:

  • Number of workers to use to make bulk load writes in parallel
  • How many points to include in each write batch

Loader programs for the other databases take similar parameters.

(For calibration, there is also an option to disable writing to the database; this mode is used to check the speed of data deserialization.)

Note that the bulk loaders will not start writing data if there is already data in the destination database at the beginning of a test. This helps ensure that the database is empty, as if it were newly-installed. It also prevents users from clobbering existing data.

Elasticsearch-specific configuration

Both Elasticsearch and InfluxDB are ready out-of-the-tarball for storing time series data. However, after meeting with Elasticsearch experts, we decided to make some reasonable configuration tweaks to Elasticsearch to try to optimize its performance.

First, the configuration for the Elasticsearch daemon was changed to set the ES_HEAP_SIZE environment variable to half of the server machine’s available RAM. For example, on a 32GB machine, ES_HEAP_SIZE is 16g. This is standard practice when administering Elasticsearch.

Second, the configuration file was also changed to increase the threadpool.bulk.queue_size parameter to 100000. When we tried bulk loading without this tweak, the server replied with errors indicating it had run out of buffer space for receiving bulk writes. This config change is standard practice for bulk write workloads.

Third, we developed two Elasticsearch index templates, each of which represents a way we think people use Elasticsearch to store time-series data:

The first template, called ‘default’, stores time-series data in a way that enables fast querying, while also storing the original document data. This is closest to Elasticsearch’s default behavior and is a reasonable starting point for most users, although its on-disk size may become large.

The second template, called ‘aggregation’, indexes time-series data in a way that saves disk space by discarding the original point data. All data is stored in a compressed form inside the Lucene indexes, therefore all queries are completely accurate. But, due to an implementation detail of Elastic, the underlying point data is no longer independently addressable. For users who only conduct aggregation queries, this saves quite a bit of disk space (and improves bulk load speed) without any downsides.

Fourth, after each bulk load in Elasticsearch, we trigger a forced compaction of all index data. This is not included in the speed measurements; we give this to Elasticsearch ‘for free’. We’ve chosen to do this because compactions occur continuously over the lifetime of a long-running Elasticsearch process, so this helps us obtain numbers that are representative of steady-state operation of Elasticsearch in production environments.

(Note that Elasticsearch does not immediately index data written with the bulk endpoint. To make written data immediately available for querying, users can set the URL query parameter ‘refresh’ to ‘true’. We didn’t do this because performance dropped considerably, and most users would not need this when performing a bulk load. InfluxDB performs an fsync after each bulk write, and makes data immediately available for querying.)

InfluxDB-specific configuration

The only change we made to a default InfluxDB install is to, like Elastic, cause a full database compaction after a bulk load benchmark is complete. This forces all eventual compaction to happen at once, simulating steady-state operation of the data store.

Measurements

For bulk loading, we care about two numerical outcomes: the total wall clock time taken to write the given dataset, and how much disk space is used by the database after all writes are complete.

When finished, the bulk load program prints out how long it took to load data, and what the average ingestion rate was.

Combining the following parameters gives a hypothetical ‘performance matrix’ for a given dataset:

Client parallelism: 1, 2, 4, 8, 16
Database: InfluxDB, Elasticsearch (with default template), Elasticsearch (with aggregation template)

This gives a possible set of 15 bulk write benchmarks. Running all these tests is excessive, but it is possible, and it allows us to confidently determine how both write throughput and disk usage scale.

Phase 3: Query generation

The third phase makes serialized queries and saves them to a file.

We pre-generate all queries before benchmarking them, so that the query benchmarker can be as lightweight as possible. This allows us to reuse code between the database drivers. It also lets us prove that the runtime overhead of query generation does not impact the benchmarks.

Many benchmark suites generate and serialize queries at the same time as running benchmarks; this is typically a mistake. For example, Elasticsearch takes queries in JSON format, yet InfluxDB has a simpler wire format. If we included query generation in the query benchmarker, then the JSON serialization overhead would negatively, and unfairly, affect the Elasticsearch benchmark.

(In the case of JSON this effect is especially acute: the JSON encoder in Go’s standard library makes many heap allocations and uses reflection.)

The DevOps use case is focused on relating to the needs of system administrators. As we saw above, the data for our benchmark is telemetry from a simulated server fleet.

The queries that administrators tend to run are focused on: 1) visualizing information on dashboards, 2) identifying trends in system utilization, and 3) drilling down into a particular server’s behavior.

To that end, we have identified the following query types as being representative of a sysadmin’s needs:

Maximum CPU usage for 1 host, over the course of an hour, in 1 minute intervals
Maximum CPU usage for 2 hosts, over the course of an hour, in 1 minute intervals
Maximum CPU usage for 4 hosts, over the course of an hour, in 1 minute intervals
Maximum CPU usage for 8 hosts, over the course of an hour, in 1 minute intervals
Maximum CPU usage for 16 hosts, over the course of an hour, in 1 minute intervals
Maximum CPU usage for 32 hosts, over the course of an hour, in 1 minute intervals

Each of these six abstract query types is parameterized to create millions of concrete queries, which are then serialized to files. (For example, the max CPU query for one host will be parameterized on 1) a random host id, and 2) a random 60-minute interval.) These requests will be read by the query benchmarker and then sent to the database.

Our query generator program uses a deterministic random number generator to fill in the parameters for each concrete query.

For example, here are two queries for InfluxDB that aggregate maximum CPU information for 2 hosts during a random 1-hour period, in 1 minute buckets. Each hostname was chosen from a set of 100 hosts, because in this example the scaling variable is 100:

SELECT max(usage_user) FROM cpu WHERE (hostname = 'host_73' OR hostname = 'host_24') AND time >= '2016-01-01T19:24:45Z' AND time < '2016-01-01T20:24:45Z' GROUP BY time(1m)
SELECT max(usage_user) FROM cpu WHERE (hostname = 'host_60' OR hostname = 'host_79') AND time >= '2016-01-01T11:14:49Z' AND time < '2016-01-01T12:14:49Z' GROUP BY time(1m)

Notice that the time range is always 60 minutes long, and that the start of the time range is randomly chosen.

The result of the query generation step is two files of serialized queries, one for each database.

Phase 4: Query execution

The final step is benchmarking query performance.

So far we have covered data generation, data loading, and query generation. Now, all of that culminates in a benchmark for each database that measures how fast they can satisfy queries.

Our query benchmarker is a small program that executes HTTP requests in parallel. It reads pre-generated requests from stdin, performs a minimum of deserialization, then executes those queries against the chosen endpoint. It supports making requests in parallel, and collects basic summary statistics during its execution.

The query benchmarker has zero knowledge of the database it is testing; it just executes HTTP requests and measures the outcome.

We use the fasthttp library for the HTTP client, because it minimizes heap allocations and can be up to 10x faster than Go’s default client.

Before every execution of the query benchmarker, we restart the given database daemon in order to flush any query caches.

Phase 5: Query validation

The final step is to validate the benchmark by sampling the query results for both databases.

The benchmark suite was engineered to be fully deterministic. However, that does not guard against possible semantic mistakes in the data or query set. For example, queries for one database could be valid, yet wrong, if they compute an undesired result.

To show the parity of both data and queries between the databases, we can compare the query responses themselves.

Our query benchmarker tool has a mode for pretty-printing the query responses it receives. By running it in this mode, we can inspect query results and compare the results for each database.

For example, here is a side-by-side comparison of the responses for the same query (a list of maximums, in 1-minute buckets):

InfluxDB query response:

{
  "results": [
    {
      "series": [
        {
          "name": "cpu",
          "columns": [
            "time",
            "max"
          ],
          "values": [
            [
              "2016-01-01T18:29:00Z",
              90.92765387779365
            ],
            [
              "2016-01-01T18:30:00Z",
              89.58087379178397
            ],
            [
              "2016-01-01T18:31:00Z",
              88.39341429374308
            ],
            [
              "2016-01-01T18:32:00Z",
              84.27665178871197
            ],
            [
              "2016-01-01T18:33:00Z",
              84.95048030509422
            ],
            ...

Elasticsearch query response:

{
  "took": 133,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1728000,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "result": {
      "doc_count": 360,
      "result2": {
        "buckets": [
          {
            "key_as_string": "2016-01-01-18",
            "key": 1451672940000,
            "doc_count": 4,
            "max_of_field": {
              "value": 90.92765387779365
            }
          },
          {
            "key_as_string": "2016-01-01-18",
            "key": 1451673000000,
            "doc_count": 6,
            "max_of_field": {
              "value": 89.58087379178397
            }
          },
          {
            "key_as_string": "2016-01-01-18",
            "key": 1451673060000,
            "doc_count": 6,
            "max_of_field": {
              "value": 88.39341429374308
            }
          },
          {
            "key_as_string": "2016-01-01-18",
            "key": 1451673120000,
            "doc_count": 6,
            "max_of_field": {
              "value": 84.27665178871197
            }
          },
          {
            "key_as_string": "2016-01-01-18",
            "key": 1451673180000,
            "doc_count": 6,
            "max_of_field": {
              "value": 84.95048030509422
            }
          },
          ...

By inspection, we can see that the results are (within floating point tolerance) identical. We have done this by hand for a representative selection of queries for each benchmark run.
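A comparison like this can also be automated with a relative-tolerance check; the following is a sketch of such a check, not the tool's actual code:

```go
package main

import (
	"fmt"
	"math"
)

// withinTolerance reports whether a and b agree within a relative
// epsilon, which is the sense in which the two responses above are
// "identical within floating point tolerance".
func withinTolerance(a, b, epsilon float64) bool {
	if a == b {
		return true
	}
	diff := math.Abs(a - b)
	return diff/math.Max(math.Abs(a), math.Abs(b)) < epsilon
}

func main() {
	// The first bucket's maximum from both responses above.
	fmt.Println(withinTolerance(90.92765387779365, 90.92765387779365, 1e-9))
}
```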

Successful query validation implies that the benchmarking suite has end-to-end reproducibility, and is correct between both databases.

Quickstart

Executing the benchmarks requires the Go compiler and tools to be installed on your system. See https://golang.org/doc/install for package downloads and installation. Once Go is configured you can proceed to installing and running the benchmark.

Install

Running benchmarks requires installing the data and query generators along with loaders and benchmarkers for the platforms you wish to test. For example, to install and run load tests for InfluxDB, execute:

go install github.com/influxdata/influxdb-comparisons/cmd/bulk_data_gen@latest github.com/influxdata/influxdb-comparisons/cmd/bulk_load_influx@latest

This will download and install the latest code from GitHub (including dependencies). Check the cmd directory for additional database implementations to download and install. For query benchmarking, install the query generator and benchmark executor for your platform. E.g. for InfluxDB:

go install github.com/influxdata/influxdb-comparisons/cmd/bulk_query_gen@latest github.com/influxdata/influxdb-comparisons/cmd/query_benchmarker_influxdb@latest

Help

For any module, you can run the executable with the -h flag and it will print a list of command line parameters. E.g.

-bash-4.1$ $GOPATH/bin/bulk_data_gen -h
Usage of /home/clarsen/go/bin/bulk_data_gen:
  -debug int
    	Debug printing (choices: 0, 1, 2) (default 0).
  -format string
    	Format to emit. (choices: influx-bulk, es-bulk, cassandra, mongo, opentsdb) (default "influx-bulk")
  -interleaved-generation-group-id uint
    	Group (0-indexed) to perform round-robin serialization within. Use this to scale up data generation to multiple processes.
  -interleaved-generation-groups uint
    	The number of round-robin serialization groups. Use this to scale up data generation to multiple processes. (default 1)
  -scale-var int
    	Scaling variable specific to the use case. (default 1)
  -seed int
    	PRNG seed (default, or 0, uses the current timestamp).
  -timestamp-end string
    	Ending timestamp (RFC3339). (default "2016-01-01T06:00:00Z")
  -timestamp-start string
    	Beginning timestamp (RFC3339). (default "2016-01-01T00:00:00Z")
  -use-case string
    	Use case to model. (choices: devops, iot) (default "devops")

Loading Data

To generate and write data to a database, execute the bulk data generator with any optional command line parameters and pipe its output to a bulk loader. For example, to load data into an InfluxDB instance, run:

$GOPATH/bin/bulk_data_gen | $GOPATH/bin/bulk_load_influx -urls http://localhost:8086

This will automatically create a database instance and load about 19,440 data points. For additional data, set the start and end times. Also note that the default generation data format is influx-bulk. If you want to test another database, use the -format parameter with the proper loader. E.g. for OpenTSDB:

$GOPATH/bin/bulk_data_gen -format opentsdb | $GOPATH/bin/bulk_load_opentsdb -urls http://localhost:4242

A successful run will print the number of items generated and stored, along with the total time and the mean rate per second.

-bash-4.1$ $GOPATH/bin/bulk_data_gen | $GOPATH/bin/bulk_load_influx  -urls http://druidzoo-1.yms.gq1.yahoo.com:8086
using random seed 329234002
daemon URLs: [http://druidzoo-1.yms.gq1.yahoo.com:8086]
[worker 0] backoffs took a total of 0.000000sec of runtime
loaded 19440 items in 0.751433sec with 1 workers (mean rate 25870.568346/sec, 8.60MB/sec from stdin)

Querying Data

Querying the database is similar to loading data. Execute the bulk query generator and pipe its output to the benchmark tool for the database under test. Each run requires a -query-type argument to determine what type of query to execute. These are meant to mimic real-world queries, such as searching for data from a single host out of many, from multiple hosts, or grouping by various tags. To find out what query types are available, execute $GOPATH/bin/bulk_query_gen -h and look for the use case matrix at the bottom of the output. An example run command looks like:

$GOPATH/bin/bulk_query_gen -query-type "1-host-1-hr" | $GOPATH/bin/query_benchmarker_influxdb -urls http://druidzoo-1.yms.gq1.yahoo.com:8086

A successful run will execute multiple queries and periodically print status information to standard out.

-bash-4.1$ $GOPATH/bin/bulk_query_gen -query-type "1-host-1-hr" | $GOPATH/bin/query_benchmarker_influxdb -urls http://druidzoo-1.yms.gq1.yahoo.com:8086
using random seed 684941023
after 100 queries with 1 workers:
Influx max cpu, rand    1 hosts, rand 1h0m0s by 1m : min:     1.50ms ( 668.55/sec), mean:     1.98ms ( 506.32/sec), max:    3.10ms (322.34/sec), count:      100, sum:   0.2sec
all queries                                        : min:     1.50ms ( 668.55/sec), mean:     1.98ms ( 506.32/sec), max:    3.10ms (322.34/sec), count:      100, sum:   0.2sec

...

run complete after 1000 queries with 1 workers:
Influx max cpu, rand    1 hosts, rand 1h0m0s by 1m : min:     1.45ms ( 689.62/sec), mean:     2.07ms ( 482.67/sec), max:   12.21ms ( 81.92/sec), count:     1000, sum:   2.1sec
all queries                                        : min:     1.45ms ( 689.62/sec), mean:     2.07ms ( 482.67/sec), max:   12.21ms ( 81.92/sec), count:     1000, sum:   2.1sec
wall clock time: 2.084896sec
