  • Language: Go
  • License: Apache License 2.0
  • Created over 11 years ago
  • Updated 3 months ago

StatsD to Prometheus metrics exporter

statsd exporter

statsd_exporter receives StatsD-style metrics and exports them as Prometheus metrics.

Overview

The StatsD exporter is a drop-in replacement for StatsD. This exporter translates StatsD metrics to Prometheus metrics via configured mapping rules.

We recommend using the exporter only as an intermediate solution, and switching to native Prometheus instrumentation in the long term. While it is common to run centralized StatsD servers, the exporter works best as a sidecar.

Transitioning from an existing StatsD setup

The relay feature allows for a gradual transition.

Introduce the exporter by adding it as a sidecar alongside the application instances. In Kubernetes, this means adding it to the pod. Use the --statsd.relay.address flag to forward metrics to your existing StatsD UDP endpoint. Relaying forwards statsd events unmodified, preserving the original metric name and tags in any format.

+-------------+    +----------+                  +------------+
| Application +--->| Exporter +----------------->|  StatsD    |
+-------------+    +----------+                  +------------+
                          ^
                          |                      +------------+
                          +----------------------+ Prometheus |
                                                 +------------+

Relaying from StatsD

To pipe metrics from an existing StatsD environment into Prometheus, configure StatsD's repeater backend to repeat all received metrics to a statsd_exporter process.

+----------+                         +-------------------+                        +--------------+
|  StatsD  |---(UDP/TCP repeater)--->|  statsd_exporter  |<---(scrape /metrics)---|  Prometheus  |
+----------+                         +-------------------+                        +--------------+

This allows trying out the exporter with minimal effort, but does not provide the per-instance metrics of the sidecar pattern.

Tagging Extensions

The exporter supports Librato, InfluxDB, DogStatsD, and SignalFX-style tags, which will be converted into Prometheus labels.

Librato-style tags must be appended to the metric name with a delimiting #, like so:

metric.name#tagName=val,tag2Name=val2:0|c

See the statsd-librato-backend README for a more complete description.

InfluxDB-style tags must be appended to the metric name with a delimiting comma, like so:

metric.name,tagName=val,tag2Name=val2:0|c

See this InfluxDB blog post for a larger overview.

DogStatsD-style tags are appended as a |#-delimited section at the end of the metric, like so:

metric.name:0|c|#tagName:val,tag2Name:val2

See Tags in the DogStatsD documentation for the concept description and Datagram Format. If you encounter problems, note that this tagging style is incompatible with the original statsd implementation.

For SignalFX dimensions, add the tags to the metric name in square brackets, like so:

metric.name[tagName=val,tag2Name=val2]:0|c

Be aware: If you mix tag styles (e.g., Librato/InfluxDB with DogStatsD), the exporter will consider this an error and the behavior is undefined. Also, tags without values (#some_tag) are not supported and will be ignored.
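The four styles differ only in where the tags sit in the line and which separator joins a tag name to its value. As a minimal sketch (the tag type and the formatLine helper are hypothetical names used for illustration, not part of the exporter), the same two tags can be rendered in each style like this:

```go
package main

import (
	"fmt"
	"strings"
)

// tag is a single name/value pair. This type and the helpers below are
// illustrative only; they are not part of statsd_exporter.
type tag struct{ name, value string }

// joinTags renders tags as name<kv>value pairs separated by commas.
func joinTags(tags []tag, kv string) string {
	parts := make([]string, 0, len(tags))
	for _, t := range tags {
		parts = append(parts, t.name+kv+t.value)
	}
	return strings.Join(parts, ",")
}

// formatLine builds one StatsD line in the requested tag style.
// An unknown style yields an untagged line.
func formatLine(style, name, value, typ string, tags []tag) string {
	switch style {
	case "librato":
		return name + "#" + joinTags(tags, "=") + ":" + value + "|" + typ
	case "influxdb":
		return name + "," + joinTags(tags, "=") + ":" + value + "|" + typ
	case "dogstatsd":
		return name + ":" + value + "|" + typ + "|#" + joinTags(tags, ":")
	case "signalfx":
		return name + "[" + joinTags(tags, "=") + "]:" + value + "|" + typ
	default:
		return name + ":" + value + "|" + typ
	}
}

func main() {
	tags := []tag{{"tagName", "val"}, {"tag2Name", "val2"}}
	for _, style := range []string{"librato", "influxdb", "dogstatsd", "signalfx"} {
		fmt.Println(formatLine(style, "metric.name", "0", "c", tags))
	}
}
```

A client should pick one style and use it consistently, since mixing styles within a line is treated as an error.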

The exporter parses all tagging formats by default, but individual tagging formats can be disabled with command line flags:

--no-statsd.parse-dogstatsd-tags
--no-statsd.parse-influxdb-tags
--no-statsd.parse-librato-tags
--no-statsd.parse-signalfx-tags

Building and Running

NOTE: Version 0.7.0 switched to the kingpin flags library. With this change, flag behaviour is POSIX-ish:

  • long flags start with two dashes (--version)

  • boolean long flags are disabled by prefixing with no- (--flag-name is true, --no-flag-name is false)

  • multiple short flags can be combined (but there currently is only one)

  • flag processing stops at the first --

    usage: statsd_exporter [<flags>]
    
    Flags:
      -h, --help                    Show context-sensitive help (also try
                                    --help-long and --help-man).
          --web.listen-address=":9102"
                                    The address on which to expose the web interface
                                    and generated Prometheus metrics.
          --web.enable-lifecycle    Enable shutdown and reload via HTTP request.
          --web.telemetry-path="/metrics"
                                    Path under which to expose metrics.
          --statsd.listen-udp=":9125"
                                    The UDP address on which to receive statsd
                                    metric lines. "" disables it.
          --statsd.listen-tcp=":9125"
                                    The TCP address on which to receive statsd
                                    metric lines. "" disables it.
          --statsd.listen-unixgram=""
                                    The Unixgram socket path to receive statsd
                                    metric lines in datagram. "" disables it.
          --statsd.unixsocket-mode="755"
                                    The permission mode of the unix socket.
          --statsd.mapping-config=STATSD.MAPPING-CONFIG
                                    Metric mapping configuration file name.
          --statsd.read-buffer=STATSD.READ-BUFFER
                                    Size (in bytes) of the operating system's
                                    transmit read buffer associated with the UDP or
                                    Unixgram connection. Please make sure the kernel
                                    parameters net.core.rmem_max is set to a value
                                    greater than the value specified.
          --statsd.cache-size=1000  Maximum size of your metric mapping cache.
                                    Relies on least recently used replacement policy
                                    if max size is reached.
          --statsd.cache-type=lru   Metric mapping cache type. Valid options are
                                    "lru" and "random"
          --statsd.event-queue-size=10000
                                    Size of internal queue for processing events
          --statsd.event-flush-threshold=1000
                                    Number of events to hold in queue before
                                    flushing
          --statsd.event-flush-interval=200ms
                                    Maximum time between event queue flushes.
          --debug.dump-fsm=""       The path to dump internal FSM generated for
                                    glob matching as Dot file.
          --check-config            Check configuration and exit.
          --statsd.parse-dogstatsd-tags  
                                    Parse DogStatsd style tags. Enabled by default.
          --statsd.parse-influxdb-tags  
                                    Parse InfluxDB style tags. Enabled by default.
          --statsd.parse-librato-tags  
                                    Parse Librato style tags. Enabled by default.
          --statsd.parse-signalfx-tags  
                                    Parse SignalFX style tags. Enabled by default.
          --statsd.relay.address=STATSD.RELAY.ADDRESS  
                                    The UDP relay target address (host:port)
          --statsd.relay.packet-length=1400  
                                    Maximum relay output packet length to avoid fragmentation
          --log.level=info          Only log messages with the given severity or
                                    above. One of: [debug, info, warn, error]
          --log.format=logfmt       Output format of log messages. One of: [logfmt,
                                    json]
          --version                 Show application version.
    

Lifecycle API

The statsd_exporter has an optional lifecycle API (disabled by default) that can be used to reload or quit the exporter by sending a PUT or POST request to the /-/reload or /-/quit endpoints.
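A reload can be triggered from any HTTP client. The following Go sketch assumes an exporter started with --web.enable-lifecycle and listening on the default web address (localhost:9102 here is an assumption about your deployment); lifecycleRequest is a hypothetical helper, not an exporter API.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// lifecycleRequest builds a POST request for one of the two lifecycle
// endpoints. action must be "reload" or "quit".
func lifecycleRequest(baseURL, action string) (*http.Request, error) {
	if action != "reload" && action != "quit" {
		return nil, fmt.Errorf("unknown lifecycle action %q", action)
	}
	url := strings.TrimRight(baseURL, "/") + "/-/" + action
	return http.NewRequest(http.MethodPost, url, nil)
}

func main() {
	// Assumed address: the default --web.listen-address on the local host.
	req, err := lifecycleRequest("http://localhost:9102", "reload")
	if err != nil {
		fmt.Println("building request:", err)
		return
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```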

Relay

The statsd_exporter has an optional mode that will buffer and relay incoming statsd lines to a remote server. This is useful to "tee" the data when migrating to using the exporter. The relay will flush the buffer at least once per second to avoid delaying delivery of metrics.

Tests

$ go test

Metric Mapping and Configuration

The statsd_exporter can be configured to translate specific dot-separated StatsD metrics into labeled Prometheus metrics via a simple mapping language. The config file is reloaded on SIGHUP.

A mapping definition starts with a line matching the StatsD metric in question, with *s acting as wildcards for each dot-separated metric component. The lines following the matching expression must contain one label="value" pair each, and at least define the metric name (label name name). The Prometheus metric is then constructed from these labels. $n-style references in the label value are replaced by the n-th wildcard match in the matching line, starting at 1. Multiple matching definitions are separated by one or more empty lines. The first mapping rule that matches a StatsD metric wins.

Metrics that don't match any mapping in the configuration file are translated into Prometheus metrics without any labels and with any non-alphanumeric characters, including periods, translated into underscores.
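The fallback rule for unmapped metrics can be sketched in a few lines of Go. This follows the rule exactly as stated above and is not the exporter's actual implementation, which may handle further edge cases; escapeName is a hypothetical name.

```go
package main

import "fmt"

// escapeName replaces every byte that is not an ASCII letter or digit
// with an underscore. Multi-byte UTF-8 characters therefore become one
// underscore per byte in this simplified version.
func escapeName(statsdName string) string {
	out := make([]byte, len(statsdName))
	for i := 0; i < len(statsdName); i++ {
		c := statsdName[i]
		switch {
		case c >= 'a' && c <= 'z', c >= 'A' && c <= 'Z', c >= '0' && c <= '9':
			out[i] = c
		default:
			out[i] = '_'
		}
	}
	return string(out)
}

func main() {
	fmt.Println(escapeName("test.web-server.foo.bar"))
}
```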

In general, the different metric types are translated as follows:

StatsD gauge   -> Prometheus gauge

StatsD counter -> Prometheus counter

StatsD timer, histogram, distribution   -> Prometheus summary or histogram

Glob matching

The default (and fastest) glob mapping style uses * to denote parts of the statsd metric name that may vary. These varying parts can then be referenced in the construction of the Prometheus metric name and labels.

An example mapping configuration:

mappings:
- match: "test.dispatcher.*.*.*"
  name: "dispatcher_events_total"
  labels:
    processor: "$1"
    action: "$2"
    outcome: "$3"
    job: "test_dispatcher"
- match: "*.signup.*.*"
  name: "signup_events_total"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server"

This would transform these example StatsD metrics into Prometheus metrics as follows:

test.dispatcher.FooProcessor.send.success
 => dispatcher_events_total{processor="FooProcessor", action="send", outcome="success", job="test_dispatcher"}

foo_product.signup.facebook.failure
 => signup_events_total{provider="facebook", outcome="failure", job="foo_product_server"}

test.web-server.foo.bar
 => test_web_server_foo_bar{}
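To make the capture and substitution steps concrete, here is a simplified Go sketch of component-wise glob matching followed by $n expansion. It is not the exporter's matcher (which builds a finite state machine); globMatch and expand are hypothetical helpers, and os.Expand is used only because it happens to understand both the $n and ${n} forms.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// globMatch compares a dot-separated metric name against a pattern whose
// components are either literals or "*". It returns the wildcard captures
// in left-to-right order.
func globMatch(pattern, metric string) ([]string, bool) {
	pp := strings.Split(pattern, ".")
	mp := strings.Split(metric, ".")
	if len(pp) != len(mp) {
		return nil, false
	}
	var captures []string
	for i, p := range pp {
		switch {
		case p == "*":
			captures = append(captures, mp[i])
		case p != mp[i]:
			return nil, false
		}
	}
	return captures, true
}

// expand substitutes $n and ${n} references (starting at 1) with captures.
// Out-of-range or non-numeric references expand to the empty string here.
// Without braces only a single digit is read, so use ${10} for n >= 10.
func expand(template string, captures []string) string {
	return os.Expand(template, func(ref string) string {
		n, err := strconv.Atoi(ref)
		if err != nil || n < 1 || n > len(captures) {
			return ""
		}
		return captures[n-1]
	})
}

func main() {
	caps, ok := globMatch("*.signup.*.*", "foo_product.signup.facebook.failure")
	fmt.Println(ok, expand("${1}_server", caps), expand("$2", caps), expand("$3", caps))
}
```

The braces in ${1}_server matter: they mark where the reference ends, so the trailing _server is kept as literal text.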

Each mapping in the configuration file must define a name for the metric. The metric's name can contain $n-style references to be replaced by the n-th wildcard match in the matching line. That allows for dynamic rewrites, such as:

mappings:
- match: "test.*.*.counter"
  name: "${2}_total"
  labels:
    provider: "$1"

Glob matching offers the best performance for common mappings.

Ordering glob rules

List more specific matches before wildcards, from left to right:

a.b.c
a.b.*
a.*.d
a.*.*

This avoids unexpected shadowing of later rules, and performance impact from backtracking.

Alternatively, you can disable mapping ordering altogether by setting glob_disable_ordering to true in the defaults. With unordered mapping, at each hierarchy level the most specific match wins. This has the same effect as using the recommended ordering.

Regular expression matching

The regex mapping style uses regular expressions to match the full statsd metric name. Use it if the glob mapping is not flexible enough to pull structured data from the available statsd metric names.

Regular expression matching is significantly slower than glob mapping, as all mappings must be tested in order. Because of this, regex mappings are only executed after all glob mappings. In other words, glob mappings take precedence over regex matches, irrespective of the order in which they are specified. Regular expression matches are always evaluated in order, and the first match wins.

The metric name can also contain references to regex matches. The mapping above could be written as:

mappings:
- match: "test\\.(\\w+)\\.(\\w+)\\.counter"
  match_type: regex
  name: "${2}_total"
  labels:
    provider: "$1"
- match: "(.*)\\.(.*)--(.*)\\.status\\.(.*)\\.count"
  match_type: regex
  name: "request_total"
  labels:
    hostname: "$1"
    exec: "$2"
    protocol: "$3"
    code: "$4"

Be aware of YAML escape rules: a mapping like the following will not work, because \w is not a valid escape sequence inside a double-quoted YAML string.

mappings:
- match: "test\\.(\w+)\\.(\w+)\\.counter"
  match_type: regex
  name: "${2}_total"
  labels:
    provider: "$1"
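For comparison, once YAML has unescaped the correctly written pattern, each \\ becomes a single backslash and the regex engine sees test\.(\w+)\.(\w+)\.counter. The sketch below assumes patterns are handled by Go's regexp package (the exporter is a Go program, but the text here does not state this); rewrite is a hypothetical helper and the metric name is made up for the example.

```go
package main

import (
	"fmt"
	"regexp"
)

// counterRE is the pattern as the regex engine sees it after YAML has
// turned each "\\" of the double-quoted string into a single backslash.
var counterRE = regexp.MustCompile(`test\.(\w+)\.(\w+)\.counter`)

// rewrite applies a template containing $n or ${n} references to the first
// match of re in metric. ok is false when the metric does not match.
func rewrite(re *regexp.Regexp, template, metric string) (result string, ok bool) {
	m := re.FindStringSubmatchIndex(metric)
	if m == nil {
		return "", false
	}
	return string(re.ExpandString(nil, template, metric, m)), true
}

func main() {
	name, _ := rewrite(counterRE, "${2}_total", "test.aws.requests.counter")
	provider, _ := rewrite(counterRE, "$1", "test.aws.requests.counter")
	fmt.Println(name, provider)
}
```

In Go templates, $2_total would be read as a reference to a group named 2_total, which is why the braces in ${2}_total are needed.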

Special match groups

When using regex, the match group 0 is the full match and can be used to attach labels to the metric. Example:

mappings:
- match: ".+"
  match_type: regex
  name: "$0"
  labels:
    statsd_metric_name: "$0"

If a metric my.statsd_counter is received, the metric name will still be mapped to my_statsd_counter (Prometheus compatible name). But the metric will also have the label statsd_metric_name with the value my.statsd_counter (unchanged value).

Note: If you use the match like the example (i.e. .+), be aware that it will be a "catch-all" block. So it should come at the very end of the mapping list.

The same is not achievable with glob matching; for more details, check this issue.

Naming, labels, and help

Please note that metrics with the same name must also have the same set of label names.

If the default metric help text is insufficient for your needs you may use the YAML configuration to specify a custom help text for each mapping:

mappings:
- match: "http.request.*"
  help: "Total number of http requests"
  name: "http_requests_total"
  labels:
    code: "$1"

StatsD timers and distributions

By default, statsd timers and distributions (collectively "observers") are represented as a Prometheus summary with quantiles. You may optionally configure the quantiles and acceptable error, as well as adjusting how the summary metric is aggregated:

mappings:
- match: "test.timing.*.*.*"
  observer_type: summary
  name: "my_timer"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server"
  summary_options:
    quantiles:
      - quantile: 0.99
        error: 0.001
      - quantile: 0.95
        error: 0.01
      - quantile: 0.9
        error: 0.05
      - quantile: 0.5
        error: 0.005
    max_age: 30s
    age_buckets: 3
    buf_cap: 1000

The default quantiles are 0.99, 0.9, and 0.5.

The default summary age is 10 minutes, the default number of buckets is 5, and the default buffer size is 500. See also the client_golang docs. In summary_options, max_age corresponds to SummaryOpts.MaxAge, age_buckets to SummaryOpts.AgeBuckets, and buf_cap to SummaryOpts.BufCap.

In the configuration, one may also set the observer type to "histogram". For example, to set the observer type for a single timer metric:

mappings:
- match: "test.timing.*.*.*"
  observer_type: histogram
  histogram_options:
    buckets: [ 0.01, 0.025, 0.05, 0.1 ]
    native_histogram_bucket_factor: 1.1
    native_histogram_max_buckets: 256
  name: "my_timer"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server"

If not set, the default Prometheus client values are used for the histogram buckets: [.005, .01, .025, .05, .1, .25, .5, 1, 2.5, 5, 10]. +Inf is added automatically. If your Prometheus server is enabled to scrape native histograms (v2.40.0+), you can set native_histogram_bucket_factor to configure the precision of the buckets in the sparse histogram. You can also cap the number of buckets with native_histogram_max_buckets, which keeps histograms from growing too large in memory. More about both options in the original client_golang docs.

observer_type is only used when the statsd metric type is a timer, histogram, or distribution. buckets is only used when the statsd metric type is one of these, and the observer_type is set to histogram.

Timers will be accepted with the ms statsd type. Statsd timer data is transmitted in milliseconds, while Prometheus expects the unit to be seconds. The exporter converts all timer observations to seconds.

Histogram and distribution events (h and d metric type) are not subject to unit conversion.

DogStatsD Client Behavior

timed() decorator

The DogStatsD client's timed decorator emits the metric in seconds but uses the ms type. Set use_ms=True to send the correct units.

Regular expression matching

Another capability when using YAML configuration is the ability to define matches using raw regular expressions as opposed to the default globbing style of match. This may allow for pulling structured data from otherwise poorly named statsd metrics and allow for more precise targeting of match rules. When no match_type parameter is specified, the default value of glob is assumed:

mappings:
- match: "(.*)\\.(.*)--(.*)\\.status\\.(.*)\\.count"
  match_type: regex
  name: "request_total"
  labels:
    hostname: "$1"
    exec: "$2"
    protocol: "$3"
    code: "$4"

Global defaults

One may also set defaults for the observer type, histogram options, summary options, and match type. These will be used by all mappings that do not define them.

An option that can only be configured in defaults is glob_disable_ordering, which is false if omitted. When it is set to true, the glob match type does not honor the order in which rules appear in the mapping file and always treats * as lower priority than a concrete string.

Setting buckets or quantiles in the defaults is deprecated in favor of histogram_options and summary_options, which will override the deprecated values.

If summary_options is present in a mapping config, it will only override the fields set in the mapping. Unset fields in the mapping will take the values from the defaults.

defaults:
  observer_type: histogram
  histogram_options:
    buckets: [.005, .01, .025, .05, .1, .25, .5, 1, 2.5 ]
    native_histogram_bucket_factor: 1.1
    native_histogram_max_buckets: 256
  summary_options:
    quantiles:
      - quantile: 0.99
        error: 0.001
      - quantile: 0.95
        error: 0.01
      - quantile: 0.9
        error: 0.05
      - quantile: 0.5
        error: 0.005
    max_age: 5m
    age_buckets: 2
    buf_cap: 1000
  match_type: glob
  glob_disable_ordering: false
  ttl: 0 # metrics do not expire
mappings:
# This will be a histogram using the buckets set in `defaults`.
- match: "test.timing.*.*.*"
  name: "my_timer"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server"
# This will be a summary using the summary_options set in `defaults`
- match: "other.distribution.*.*.*"
  observer_type: summary
  name: "other_distribution"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server_other"

drop action

You may also drop metrics by specifying a "drop" action on a match. For example:

mappings:
# This metric would match as normal.
- match: "test.timing.*.*.*"
  name: "my_timer"
  labels:
    provider: "$2"
    outcome: "$3"
    job: "${1}_server"
# Any metric not matched will be dropped because "." matches all metrics.
- match: "."
  match_type: regex
  action: drop
  name: "dropped"

You can drop any metric using the normal match syntax. The default action is "map" which does the normal metrics mapping.

Explicit metric type mapping

StatsD allows emitting of different metric types under the same metric name, but the Prometheus client library can't merge those. For this use-case the mapping definition allows you to specify which metric type to match:

mappings:
- match: "test.foo.*"
  name: "test_foo"
  match_metric_type: counter
  labels:
    provider: "$1"

Possible values for match_metric_type are gauge, counter and observer.

Mapping cache size and cache replacement policy

A cache is used to speed up metric mapping, often substantially. By default, the cache stores at most 1000 unique mappings from statsd metric names to Prometheus metrics. This maximum can be adjusted using the --statsd.cache-size flag.

If the maximum is reached, entries are by default rotated using the least recently used replacement policy. This strategy is optimal when memory is constrained as only the most recent entries are retained.

Alternatively, you can choose a random-replacement cache strategy. This is less optimal if the cache is smaller than the cacheable set, but requires less locking. Use this for very high throughput, but make sure to allow for a cache that holds all metrics.

The optimal cache size is determined by the cardinality of the incoming metrics.
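The least recently used policy can be illustrated with a small Go sketch built on container/list. This is a generic, single-goroutine LRU written for illustration under assumed names (lruCache, newLRU); it is not the exporter's cache and omits the locking a concurrent cache needs.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element stores.
type entry struct{ key, value string }

// lruCache keeps at most capacity entries and evicts the least recently
// used one when a new key would exceed that limit.
type lruCache struct {
	capacity int
	order    *list.List               // front = most recently used
	items    map[string]*list.Element // key -> element in order
}

func newLRU(capacity int) *lruCache {
	return &lruCache{capacity: capacity, order: list.New(), items: make(map[string]*list.Element)}
}

// Get returns the cached value and marks the key as recently used.
func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(entry).value, true
}

// Put inserts or updates a key, evicting the oldest entry if needed.
func (c *lruCache) Put(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value = entry{key, value}
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(entry{key, value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(entry).key)
	}
}

func main() {
	cache := newLRU(2)
	cache.Put("a.b.c", "a_b_c")
	cache.Put("d.e.f", "d_e_f")
	cache.Get("a.b.c")          // touch a.b.c so d.e.f becomes the oldest
	cache.Put("g.h.i", "g_h_i") // evicts d.e.f
	_, found := cache.Get("d.e.f")
	fmt.Println("d.e.f still cached:", found)
}
```

Every Get reorders the list, which is the bookkeeping that a random-replacement policy avoids.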

Time series expiration

The ttl parameter can be used to define the expiration time for stale metrics. The value is a time duration with valid time units: "ns", "us" (or "µs"), "ms", "s", "m", "h". For example, ttl: 1m20s. A value of 0 indicates that metrics do not expire.

TTL configuration is stored for each mapped metric name/labels combination whenever new samples are received. This means that you cannot immediately expire a metric only by changing the mapping configuration. At least one sample must be received for updated mappings to take effect.
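The listed units are the ones accepted by Go's time.ParseDuration. Assuming ttl values are parsed that way (an assumption; the text above does not name the parser), a value can be checked before it goes into the mapping file. ttlSeconds is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// ttlSeconds parses a ttl string such as "1m20s" and returns its length
// in seconds. A bare "0" is accepted; other numbers need a unit.
func ttlSeconds(ttl string) (float64, error) {
	d, err := time.ParseDuration(ttl)
	if err != nil {
		return 0, err
	}
	return d.Seconds(), nil
}

func main() {
	for _, ttl := range []string{"1m20s", "0", "10"} {
		secs, err := ttlSeconds(ttl)
		if err != nil {
			fmt.Printf("%q is not a valid duration: %v\n", ttl, err)
			continue
		}
		fmt.Printf("%q = %v seconds\n", ttl, secs)
	}
}
```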

Unit conversions

The scale parameter can be used to define unit conversions for metric values. The value is a floating point number by which metric values are multiplied. This can be useful for converting non-base units (e.g. milliseconds, kilobytes) to base units (e.g. seconds, bytes), as recommended in the Prometheus best practices.

mappings:
- match: foo.latency_ms
  name: foo_latency_seconds
  scale: 0.001
- match: bar.processed_kb
  name: bar_processed_bytes
  scale: 1024
- match: baz.latency_us
  name: baz_latency_seconds
  scale: 1e-6
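The arithmetic behind these three mappings is a plain multiplication. The sample values below are made up for illustration, and scaled is a hypothetical helper.

```go
package main

import "fmt"

// scaled applies a mapping's scale factor to a raw StatsD value.
func scaled(value, scale float64) float64 {
	return value * scale
}

func main() {
	fmt.Println("foo.latency_ms 250 ->", scaled(250, 0.001), "seconds")
	fmt.Println("bar.processed_kb 2 ->", scaled(2, 1024), "bytes")
	fmt.Println("baz.latency_us 1500 ->", scaled(1500, 1e-6), "seconds")
}
```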

Event flushing configuration

Internally, statsd_exporter runs a goroutine for each network listener (UDP, TCP, and Unix socket). Each one receives metric lines and parses them into events. For performance, these events are queued internally and flushed to the main exporter goroutine periodically in batches. The size of this queue and the flush criteria can be tuned with the --statsd.event-queue-size, --statsd.event-flush-threshold, and --statsd.event-flush-interval flags. However, the defaults should perform well even in very high-traffic environments.
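The interplay of a flush threshold and a flush interval can be sketched with a channel and a ticker. This is a generic batching loop under assumed names (batch, in, out), not the exporter's implementation: a batch is handed on as soon as it reaches the threshold, or when the interval elapses with events pending.

```go
package main

import (
	"fmt"
	"time"
)

// batch reads events from in and sends them on out in slices. A slice is
// flushed when it holds threshold events or when interval elapses with at
// least one event buffered. Closing in flushes the rest and closes out.
// interval must be positive, because time.NewTicker panics otherwise.
func batch(in <-chan string, out chan<- []string, threshold int, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	var buf []string
	flush := func() {
		if len(buf) > 0 {
			out <- buf
			buf = nil
		}
	}
	for {
		select {
		case ev, ok := <-in:
			if !ok {
				flush()
				close(out)
				return
			}
			buf = append(buf, ev)
			if len(buf) >= threshold {
				flush()
			}
		case <-ticker.C:
			flush()
		}
	}
}

func main() {
	in := make(chan string)
	out := make(chan []string)
	go batch(in, out, 3, 200*time.Millisecond)
	go func() {
		for i := 0; i < 7; i++ {
			in <- fmt.Sprintf("event-%d", i)
		}
		close(in)
	}()
	for b := range out {
		fmt.Println(len(b), b)
	}
}
```

A larger threshold means fewer hand-offs per event, while the interval bounds how long a partially filled batch can wait.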

Using Docker

You can deploy this exporter using the prom/statsd-exporter Docker image.

For example:

docker pull prom/statsd-exporter

docker run -d -p 9102:9102 -p 9125:9125 -p 9125:9125/udp \
        -v $PWD/statsd_mapping.yml:/tmp/statsd_mapping.yml \
        prom/statsd-exporter --statsd.mapping-config=/tmp/statsd_mapping.yml

Library packages

Parts of the implementation of this exporter are available as separate packages. See the documentation for details.

For the time being, there are no stability guarantees for library interfaces. We will try to call out any significant changes in the changelog. Semantic versioning of the exporter is based on the impact on users of the exporter, not users of the library.

We encourage re-use of these packages and welcome issues related to their usability as a library.
