  • Language: Go
  • License: Apache License 2.0
  • Created over 9 years ago
  • Updated over 4 years ago

Prometheus Mesos Exporter

Exporter for Mesos master and agent metrics.

Using

The Mesos Exporter can expose either cluster-wide metrics from a master or task metrics from an agent.

Usage of mesos_exporter:
  -addr string
        Address to listen on (default ":9105")
  -clientCert string
        Path to Mesos client TLS certificate (.pem file)
  -clientKey string
        Path to Mesos client TLS key file (.pem file)
  -enableMasterState
        Enable collection from the master's /state endpoint (default true)
  -exportedSlaveAttributes string
        Comma-separated list of slave attributes to include in the corresponding metric
  -exportedTaskLabels string
        Comma-separated list of task labels to include in the corresponding metric
  -logLevel string
        Log level (default "error")
  -loginURL string
        URL for strict mode authentication (default "https://leader.mesos/acs/api/v1/auth/login")
  -master string
        Expose metrics from master running on this URL
  -password string
        Password for authentication
  -privateKey string
        File path to certificate for strict mode authentication
  -skipSSLVerify
        Skip SSL certificate verification
  -slave string
        Expose metrics from slave running on this URL
  -strictMode
        Use strict mode authentication
  -timeout duration
        Master polling timeout (default 10s)
  -trustedCerts string
        Comma-separated list of certificates (.pem files) trusted for requests to Mesos endpoints
  -username string
        Username for authentication
  -version
        Show version

When using HTTP or strict mode authentication, the following values are read from the environment if they are not given as command-line flags:

  • MESOS_EXPORTER_USERNAME
  • MESOS_EXPORTER_PASSWORD
  • MESOS_EXPORTER_PRIVATE_KEY
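
As a minimal sketch of the environment fallback described above, the following shell function starts a master exporter with credentials passed through the environment rather than on the command line. The function name `start_master_exporter` and the `EXPORTER_BIN` variable are hypothetical, and the sketch assumes the variables stand in for the -username and -password flags.

```shell
#!/bin/sh
# Hypothetical wrapper; EXPORTER_BIN and the function name are assumptions,
# not part of the project.
EXPORTER_BIN="${EXPORTER_BIN:-mesos_exporter}"

# start_master_exporter USERNAME PASSWORD MASTER_URL
# Passes the credentials through the exporter's environment, so they do
# not show up in the process argument list.
start_master_exporter() {
  MESOS_EXPORTER_USERNAME="$1" \
  MESOS_EXPORTER_PASSWORD="$2" \
    "$EXPORTER_BIN" -master "$3"
}

# Example (not executed here, values are placeholders):
#   start_master_exporter monitor s3cret http://localhost:5050
```

Keeping the secrets out of the argument list is the main reason to prefer the environment variables over the flags in service definitions.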

When collecting metrics from the master, the -enableMasterState flag (enabled by default) makes the Mesos Exporter fetch the master's /state endpoint in order to publish metrics about the resources available on registered agents. In large clusters, polling this endpoint can degrade master performance. In that case, disable the flag on the master exporter (-enableMasterState=false) and collect equivalent metrics by running the Mesos Exporter on each agent.

When -enableMasterState is true, the master exporter will publish the following additional metrics labeled with the agent ID:

Metric Name
mesos_slave_cpus
mesos_slave_cpus_unreserved
mesos_slave_cpus_used
mesos_slave_disk_bytes
mesos_slave_disk_unreserved_bytes
mesos_slave_disk_used_bytes
mesos_slave_mem_bytes
mesos_slave_mem_unreserved_bytes
mesos_slave_mem_used_bytes
mesos_slave_ports
mesos_slave_ports_unreserved
mesos_slave_ports_used
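
To illustrate how these per-agent metrics might be used, the sketch below asks a Prometheus server for the fraction of each agent's CPUs, memory, or disk currently in use. It assumes that the used and total series carry identical label sets, so the division matches them one-to-one. The Prometheus address in `PROM_URL` and both function names are placeholders.

```shell
#!/bin/sh
# Assumed Prometheus address; adjust to your setup.
PROM_URL="${PROM_URL:-http://localhost:9090}"

# agent_utilization_query RESOURCE
# Prints a PromQL expression for per-agent utilization.
# RESOURCE is one of: cpus, mem, disk.
agent_utilization_query() {
  case "$1" in
    cpus)     printf 'mesos_slave_cpus_used / mesos_slave_cpus\n' ;;
    mem|disk) printf 'mesos_slave_%s_used_bytes / mesos_slave_%s_bytes\n' "$1" "$1" ;;
    *)        echo "unsupported resource: $1" >&2; return 1 ;;
  esac
}

# query_agent_utilization RESOURCE
# Sends the expression to the Prometheus HTTP query API and prints the JSON reply.
query_agent_utilization() {
  q=$(agent_utilization_query "$1") || return 1
  curl -sG --data-urlencode "query=$q" "$PROM_URL/api/v1/query"
}
```

For example, `query_agent_utilization mem` would return one sample per agent with a value between 0 and 1, provided the master exporter runs with -enableMasterState enabled.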

Prometheus Configuration

Usually you would run one exporter with -master for each master and one exporter for each slave with -slave. Monitoring each master individually ensures that the cluster can be monitored even if the underlying Mesos cluster is in a degraded state.

  • Master: mesos_exporter -master http://localhost:5050
  • Agent: mesos_exporter -slave http://localhost:5051

The necessary Prometheus configuration could look like this:

- job_name: mesos-master
  scrape_interval: 15s
  scrape_timeout: 10s
  static_configs:
  - targets:
    - master1.mesos.example.org:9105
    - master2.mesos.example.org:9105
    - master3.mesos.example.org:9105

- job_name: mesos-slave
  scrape_interval: 15s
  scrape_timeout: 10s
  static_configs:
  - targets:
    - node1.mesos.example.org:9105
    - node2.mesos.example.org:9105
    - node3.mesos.example.org:9105

A minimal set of alerts to ensure your cluster is operational could then be defined as follows:

ALERT MesosDown
  IF (up{job=~"mesos.*"} == 0) or (irate(mesos_collector_errors_total[5m]) > 0)
  FOR 5m
  LABELS { severity="warning" }
  ANNOTATIONS {
    description="Either the exporter or the associated Mesos component is down.",
    summary="The Mesos instance {{$labels.instance}} cannot be scraped."
  }

ALERT MesosMasterLeader
  IF sum(mesos_master_elected{job="mesos-master"}) != 1
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    description="Agents and frameworks require a unique leading Mesos master.",
    summary="Expected one leading Mesos master but there are {{ $value }}."
  }

ALERT MesosMasterTooManyRestarts
  IF resets(mesos_master_uptime_seconds{job="mesos-master"}[1h]) > 10
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    description="The number of seconds the process has been running is resetting regularly.",
    summary="The Mesos master {{$labels.instance}} has restarted {{ $value }} times in the last hour."
  }

ALERT MesosSlaveActive
  IF sum(mesos_master_slaves_state{state="active"}) < 0.9 * count(up{job="mesos-slave"})
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    description="Mesos agents must be registered with the master in order to receive tasks.",
    summary="More than 10% of all Mesos agents dropped out. Only {{ $value }} active agents remaining."
  }

ALERT MesosSlaveTooManyRestarts
  IF resets(mesos_slave_uptime_seconds{job="mesos-slave"}[1h]) > 10
  FOR 5m
  LABELS { severity="page" }
  ANNOTATIONS {
    description="The number of seconds the process has been running is resetting regularly.",
    summary="The Mesos agent {{$labels.instance}} has restarted {{ $value }} times in the last hour."
  }
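
The rules above use the Prometheus 1.x rule syntax. Prometheus 2.0 and later read alerting rules from YAML files instead, so as a sketch, the MesosMasterLeader alert could be translated as follows. The function name `write_mesos_rules`, the group name `mesos`, and the output path are illustrative choices; the remaining alerts would follow the same pattern.

```shell
#!/bin/sh
# write_mesos_rules FILE
# Writes a Prometheus 2.x rule file containing one translated alert.
write_mesos_rules() {
  # The quoted delimiter keeps the shell from expanding {{ $value }}.
  cat > "$1" <<'EOF'
groups:
  - name: mesos
    rules:
      - alert: MesosMasterLeader
        expr: sum(mesos_master_elected{job="mesos-master"}) != 1
        for: 5m
        labels:
          severity: page
        annotations:
          description: "Agents and frameworks require a unique leading Mesos master."
          summary: "Expected one leading Mesos master but there are {{ $value }}."
EOF
}

# Example (not executed here):
#   write_mesos_rules mesos.rules.yml
#   promtool check rules mesos.rules.yml
```

The resulting file would then be listed under `rule_files` in the Prometheus configuration.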
