  • Language
    Java
  • License
    Apache License 2.0

Metrics exporter for Amazon AWS CloudWatch

CloudWatch Exporter

A Prometheus exporter for Amazon CloudWatch.

Alternatives

For ECS workloads, there is also an ECS exporter.

For a different approach to CloudWatch metrics, with automatic discovery, consider Yet Another CloudWatch Exporter (YACE).

Building and running

CloudWatch Exporter requires at least Java 11.

mvn package to build.

java -jar target/cloudwatch_exporter-*-SNAPSHOT-jar-with-dependencies.jar 9106 example.yml to run.

The most recent pre-built JAR can be found at http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22cloudwatch_exporter%22

Credentials and permissions

The CloudWatch Exporter uses the AWS Java SDK, which offers a variety of ways to provide credentials. These include the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.

The cloudwatch:ListMetrics, cloudwatch:GetMetricStatistics and cloudwatch:GetMetricData IAM permissions are required. The tag:GetResources IAM permission is also required to use the aws_tag_select feature.
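
As a rough sketch, the permissions listed above could be collected into an IAM policy document like the one generated below. Only the action names come from the text above; the helper name build_policy and the unscoped "Resource": "*" are assumptions made for illustration, and you may want a narrower policy in practice.

```python
import json

# Actions named in the section above.
REQUIRED_ACTIONS = [
    "cloudwatch:ListMetrics",
    "cloudwatch:GetMetricStatistics",
    "cloudwatch:GetMetricData",
]
# Only needed when the aws_tag_select feature is used.
TAG_SELECT_ACTION = "tag:GetResources"


def build_policy(use_tag_select: bool = False) -> dict:
    """Return an IAM policy document as a dict (hypothetical helper)."""
    actions = list(REQUIRED_ACTIONS)
    if use_tag_select:
        actions.append(TAG_SELECT_ACTION)
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": actions,
                "Resource": "*",  # assumption: not scoped to specific resources
            }
        ],
    }


# Print the policy as JSON, ready to paste into an IAM policy editor.
print(json.dumps(build_policy(use_tag_select=True), indent=2))
```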

Configuration

The configuration is in YAML.

An example with common options and aws_dimension_select:

---
region: eu-west-1
metrics:
 - aws_namespace: AWS/ELB
   aws_metric_name: RequestCount
   aws_dimensions: [AvailabilityZone, LoadBalancerName]
   aws_dimension_select:
     LoadBalancerName: [myLB]
   aws_statistics: [Sum]

A similar example with common options and aws_tag_select:

---
region: eu-west-1
metrics:
 - aws_namespace: AWS/ELB
   aws_metric_name: RequestCount
   aws_dimensions: [AvailabilityZone, LoadBalancerName]
   aws_tag_select:
     tag_selections:
       Monitoring: ["enabled"]
     resource_type_selection: "elasticloadbalancing:loadbalancer"
     resource_id_dimension: LoadBalancerName
   aws_statistics: [Sum]

Note: configuration examples for different namespaces can be found in the examples directory.

Note: A configuration builder can be found here.

Name | Description
--- | ---
region | Optional. The AWS region to connect to. If none is provided, an attempt will be made to determine the region from the default region provider chain.
role_arn | Optional. The AWS role to assume. Useful for retrieving cross-account metrics.
metrics | Required. A list of CloudWatch metrics to retrieve and export.
aws_namespace | Required. Namespace of the CloudWatch metric.
aws_metric_name | Required. Metric name of the CloudWatch metric.
aws_dimensions | Required. This should contain exactly all the dimensions available for a metric. Run aws cloudwatch list-metrics to find out which dimensions you need to include for your metric.
aws_dimension_select | Optional. Which dimension values to filter on. Specify a map from the dimension name to a list of values to select from that dimension.
aws_dimension_select_regex | Optional. Which dimension values to filter on with a regular expression. Specify a map from the dimension name to a list of regexes that will be applied to select from that dimension.
aws_tag_select | Optional. A tag configuration to filter on, based on mapping from the tagged resource ID to a CloudWatch dimension.
tag_selections | Optional, under aws_tag_select. Specify a map from a tag key to a list of tag values to apply tag filtering on resources from which metrics will be gathered.
resource_type_selection | Required, under aws_tag_select. Specify the resource type to filter on. The value should have the form service:resource_type, as per the resource group tagging API. The resource_type part may be an empty string, as in the S3 case: resource_type_selection: "s3:".
resource_id_dimension | Required, under aws_tag_select. For the current metric, specify which CloudWatch dimension maps to the ARN resource ID.
aws_statistics | Optional. A list of statistics to retrieve. Values can include Sum, SampleCount, Minimum, Maximum, Average. Defaults to all statistics unless extended statistics are requested.
aws_extended_statistics | Optional. A list of extended statistics to retrieve. Extended statistics currently include percentiles in the form pN or pN.N.
delay_seconds | Optional. The newest data to request. Used to avoid collecting data that has not fully converged. Defaults to 600s. Can be set globally and per metric.
range_seconds | Optional. How far back to request data for. Useful for cases such as Billing metrics that are only set every few hours. Defaults to 600s. Can be set globally and per metric.
period_seconds | Optional. Period to request the metric for. Only the most recent data point is used. Defaults to 60s. Can be set globally and per metric.
set_timestamp | Optional. Boolean for whether to set the Prometheus metric timestamp to the original CloudWatch timestamp. For some metrics which are updated very infrequently (such as S3/BucketSize), Prometheus may refuse to scrape them if this is set to true (see #100). Defaults to true. Can be set globally and per metric.
use_get_metric_data | Optional. Boolean (experimental). Use the GetMetricData API to get metrics instead of GetMetricStatistics.
list_metrics_cache_ttl | Optional. Number of seconds to cache the result of calling the ListMetrics API. Defaults to 0 (no cache). Can be set globally and per metric.
warn_on_empty_list_dimensions | Optional. Boolean. Emit a warning if the exporter cannot determine what metrics to request.
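
To make the interplay of delay_seconds and range_seconds concrete, here is a minimal sketch of one reading of the descriptions above: the requested interval ends delay_seconds before the scrape and reaches range_seconds further back. The function name request_window is hypothetical, and the exporter's real computation may differ in detail.

```python
from datetime import datetime, timedelta, timezone


def request_window(now, delay_seconds=600, range_seconds=600):
    """Return (start, end) of the interval to request from CloudWatch.

    Assumption: "newest data to request" is now - delay_seconds, and
    "how far back" is measured from that point.
    """
    end = now - timedelta(seconds=delay_seconds)
    start = end - timedelta(seconds=range_seconds)
    return start, end


# Example with the documented defaults (600s each).
scrape_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
start, end = request_window(scrape_time)
print(start.isoformat(), end.isoformat())
```

Under this reading, a scrape at 12:00 with the defaults asks for data between 11:40 and 11:50, and only the most recent data point in that window is used.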

The above config will export time series such as:

# HELP aws_elb_request_count_sum CloudWatch metric AWS/ELB RequestCount Dimensions: ["AvailabilityZone","LoadBalancerName"] Statistic: Sum Unit: Count
# TYPE aws_elb_request_count_sum gauge
aws_elb_request_count_sum{job="aws_elb",instance="",load_balancer_name="mylb",availability_zone="eu-west-1c",} 42.0
aws_elb_request_count_sum{job="aws_elb",instance="",load_balancer_name="myotherlb",availability_zone="eu-west-1c",} 7.0
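
The naming pattern visible in this output can be approximated with a few lines of Python. This is a sketch inferred from the example above, not the exporter's own code; the function names to_snake_case, safe_name and metric_name are hypothetical.

```python
import re


def to_snake_case(name: str) -> str:
    """Insert "_" before an uppercase letter that follows a lowercase
    letter or digit, then lowercase everything."""
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def safe_name(name: str) -> str:
    """Replace characters that are not valid in a Prometheus metric name
    with "_" and collapse runs of underscores."""
    return re.sub(r"__+", "_", re.sub(r"[^a-zA-Z0-9:_]", "_", name))


def metric_name(namespace: str, metric: str, statistic: str) -> str:
    """Build a Prometheus metric name from CloudWatch identifiers."""
    parts = [to_snake_case(p) for p in (namespace, metric, statistic)]
    return safe_name("_".join(parts))


print(metric_name("AWS/ELB", "RequestCount", "Sum"))
print(to_snake_case("LoadBalancerName"))
```

For the configuration above, this yields aws_elb_request_count_sum as the metric name and load_balancer_name as the label name, matching the sample output.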

If the aws_tag_select feature is used, an additional information metric will be exported for each tagged AWS resource matched by the resource type selection and tag selection (if specified), such as:

# HELP aws_resource_info AWS information available for resource
# TYPE aws_resource_info gauge
aws_resource_info{job="aws_elb",instance="",arn="arn:aws:elasticloadbalancing:eu-west-1:121212121212:loadbalancer/mylb",load_balancer_name="mylb",tag_Monitoring="enabled",tag_MyOtherKey="MyOtherValue",} 1.0

aws_resource_info can be joined with other metrics using group_left in PromQL, for example:

  aws_elb_request_count_sum
* on(load_balancer_name) group_left(tag_MyOtherKey)
  aws_resource_info

All metrics are exported as gauges.

In addition, cloudwatch_exporter_scrape_error will be non-zero if an error occurred during the scrape, and cloudwatch_exporter_scrape_duration_seconds contains the duration of that scrape. cloudwatch_exporter_build_info contains labels referencing the current build version and build release date.

Build Info Metric

cloudwatch_exporter_build_info is a default CloudWatch Exporter metric that contains the current exporter version and release date as label values. The numeric metric value is statically set to 1. If the metric's label values are "unknown", the build information scrape failed.

CloudWatch doesn't always report data

Depending on the metric, CloudWatch either always reports data or reports it only in some cases, for example only if there is a non-zero value. The CloudWatch Exporter mirrors this behavior, so you should refer to the CloudWatch documentation to find out whether your metric is always reported.

Timestamps

CloudWatch has been observed to sometimes take minutes for reported values to converge. The default delay_seconds will result in data that is at least 10 minutes old being requested to mitigate this. The samples exposed will have the timestamps of the data from CloudWatch, so usual staleness semantics will not apply and values will persist for 5m for instant vectors.

In practice this means that if you evaluate an instant vector at the current time, you will not see data from CloudWatch. An expression such as aws_elb_request_count_sum offset 10m will allow you to access the data, and should be used in recording rules and alerts.

For certain metrics which update relatively rarely, such as those from S3, set_timestamp should be configured to false so that they are not exposed with a timestamp. This is because the true timestamp from CloudWatch could be so old that Prometheus would reject the sample.

Special handling for certain DynamoDB metrics

The DynamoDB metrics listed below break the usual CloudWatch data model.

  • ConsumedReadCapacityUnits
  • ConsumedWriteCapacityUnits
  • ProvisionedReadCapacityUnits
  • ProvisionedWriteCapacityUnits
  • ReadThrottleEvents
  • WriteThrottleEvents

When these metrics are requested in the TableName dimension CloudWatch will return data only for the table itself, not for its Global Secondary Indexes. Retrieving data for indexes requires requesting data across both the TableName and GlobalSecondaryIndexName dimensions. This behaviour is different to that of every other CloudWatch namespace and requires that the exporter handle these metrics differently to avoid generating duplicate HELP and TYPE lines.

When exporting one of the problematic metrics for an index the exporter will use a metric name in the format aws_dynamodb_METRIC_index_STATISTIC rather than the usual aws_dynamodb_METRIC_STATISTIC. The regular naming scheme will still be used when exporting these metrics for a table, and when exporting any other DynamoDB metrics not listed above.
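
The naming rule described above can be sketched as follows. The helper names are hypothetical, and deciding "for an index" by checking for the GlobalSecondaryIndexName dimension is an assumption drawn from the preceding paragraph rather than from the exporter's source.

```python
import re

# The six metrics listed above that need special handling.
INDEX_SENSITIVE_METRICS = frozenset({
    "ConsumedReadCapacityUnits",
    "ConsumedWriteCapacityUnits",
    "ProvisionedReadCapacityUnits",
    "ProvisionedWriteCapacityUnits",
    "ReadThrottleEvents",
    "WriteThrottleEvents",
})


def _snake(name: str) -> str:
    """CamelCase to snake_case, e.g. ReadThrottleEvents -> read_throttle_events."""
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name).lower()


def dynamodb_metric_name(metric: str, statistic: str, dimensions) -> str:
    """Pick the index-specific or the regular metric name."""
    for_index = "GlobalSecondaryIndexName" in dimensions
    if metric in INDEX_SENSITIVE_METRICS and for_index:
        return f"aws_dynamodb_{_snake(metric)}_index_{_snake(statistic)}"
    return f"aws_dynamodb_{_snake(metric)}_{_snake(statistic)}"


print(dynamodb_metric_name(
    "ConsumedReadCapacityUnits", "Sum",
    ["TableName", "GlobalSecondaryIndexName"],
))
```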

Reloading Configuration

There are two ways to reload configuration:

  1. Send a SIGHUP signal to the pid: kill -HUP 1234
  2. POST to the reload endpoint: curl -X POST localhost:9106/-/reload

If an error occurs during the reload, check the exporter's log output.
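
Both reload methods can also be triggered from a script. The sketch below uses only the Python standard library; the function names are hypothetical, and the host, port and pid defaults simply mirror the examples above.

```python
import os
import signal
import urllib.request


def build_reload_request(host: str = "localhost", port: int = 9106):
    """Build the POST request for the exporter's reload endpoint."""
    url = f"http://{host}:{port}/-/reload"
    return urllib.request.Request(url, data=b"", method="POST")


def reload_via_http(host: str = "localhost", port: int = 9106, timeout: float = 5.0) -> int:
    """POST to /-/reload and return the HTTP status code."""
    request = build_reload_request(host, port)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def reload_via_signal(pid: int) -> None:
    """Send SIGHUP to the exporter process (Unix only)."""
    os.kill(pid, signal.SIGHUP)


# Usage, assuming an exporter is running locally:
#   reload_via_http()          # same as: curl -X POST localhost:9106/-/reload
#   reload_via_signal(1234)    # same as: kill -HUP 1234
```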

Cost

Amazon charges either for every CloudWatch API request or for every CloudWatch metric requested; see the current charges.

  • When using GetMetricStatistics (the default), every metric retrieved requires one API request, which can include multiple statistics.
  • In addition, when aws_dimensions is provided, the exporter needs to do API requests to determine what metrics to request. This should be negligible compared to the requests for the metrics themselves.

If all aws_dimensions are provided in the aws_dimension_select list, the exporter will not perform the above API request. Instead, it will request all possible combinations of values for those dimensions. This reduces cost because the values for the dimensions no longer need to be queried, assuming that all possible value combinations are present in CloudWatch.
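
The "all possible combinations" behaviour amounts to a Cartesian product over the selected values. The following is an illustrative sketch only; the function name and the availability zone values are assumptions, and the exporter's own implementation may differ.

```python
from itertools import product


def dimension_combinations(aws_dimensions, aws_dimension_select):
    """Return every combination of selected dimension values, or None if
    some dimension has no selection (discovery would then be needed)."""
    if any(dim not in aws_dimension_select for dim in aws_dimensions):
        return None
    value_lists = [aws_dimension_select[dim] for dim in aws_dimensions]
    return [dict(zip(aws_dimensions, combo)) for combo in product(*value_lists)]


combos = dimension_combinations(
    ["AvailabilityZone", "LoadBalancerName"],
    {
        "AvailabilityZone": ["eu-west-1a", "eu-west-1b"],  # example values
        "LoadBalancerName": ["myLB"],
    },
)
for combo in combos:
    print(combo)
```

Each resulting combination corresponds to one metric request, so long value lists multiply quickly.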

If you have 100 API requests every minute, with the price of USD$10 per million requests (as of Aug 2018), that is around $45 per month. The cloudwatch_requests_total counter tracks how many requests are being made.
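
The estimate above can be reproduced with a short calculation. The function name is hypothetical, and the price is the Aug 2018 figure quoted above, so check the current charges before relying on the result.

```python
def monthly_cost_usd(requests_per_minute: float,
                     usd_per_million_requests: float = 10.0,
                     days_per_month: int = 30) -> float:
    """Estimate the monthly CloudWatch API cost for a steady request rate."""
    requests_per_month = requests_per_minute * 60 * 24 * days_per_month
    return requests_per_month / 1_000_000 * usd_per_million_requests


# 100 requests per minute at USD 10 per million requests.
print(f"{monthly_cost_usd(100):.2f} USD for a 30-day month")
print(f"{monthly_cost_usd(100, days_per_month=31):.2f} USD for a 31-day month")
```

At 100 requests per minute this comes to 4,320,000 requests in a 30-day month, or about $43, which is in line with the "around $45" figure.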

When using the aws_tag_select feature, additional requests are made to the Resource Groups Tagging API, but these are free. The tagging_api_requests_total counter tracks how many requests are being made for these.

Experimental GetMetricData

We are transitioning to using GetMetricData instead of GetMetricStatistics. The main benefit of GetMetricData is much better performance.

Please refer to this doc explaining why it is best practice to use GetMetricData

API | Performance | Costs | Stability
--- | --- | --- | ---
GetMetricStatistics | May be slow at scale | Charged per API request | Stable (default option)
GetMetricData | Can retrieve data faster at scale | Charged per metric requested | New (opt-in via configuration)

Transition plan

At first, this feature will be opt-in so that you can decide when and how to test it. In later versions, we will swap the default so everyone can enjoy the benefits.

The CloudWatch Exporter also exposes a new self metric called cloudwatch_metrics_requested_total, which allows you to track the number of requested metrics in addition to the number of API requests.

Docker Images

To run the CloudWatch exporter on Docker, you can use the image from

The available tags are

  • main: snapshot updated on every push to the main branch
  • latest: the latest released version
  • vX.Y.Z: the specific version X.Y.Z. Note that up to version 0.11.0, the format was cloudwatch-exporter_X.Y.Z.

The image exposes port 9106 and expects the config in /config/config.yml. To configure it, you can bind-mount a config from your host:

docker run -p 9106 -v /path/on/host/config.yml:/config/config.yml quay.io/prometheus/cloudwatch-exporter

Specify the config as the CMD:

docker run -p 9106 -v /path/on/host/us-west-1.yml:/config/us-west-1.yml quay.io/prometheus/cloudwatch-exporter /config/us-west-1.yml

Or create a config file named config.yml along with the following Dockerfile in the same directory and build it with docker build:

FROM prom/cloudwatch-exporter
ADD config.yml /config/
