spark-metrics

A library to expose more of Apache Spark's metrics system. This library allows you to use Dropwizard/Codahale Metrics-style APIs in Spark applications to publish metrics that are aggregated across all executors.

Dependencies

Spark 2.x

By default, spark-metrics targets the Spark 2.x line of releases. To use this library for a Spark 2.x application, add the following dependency:

<dependency>
    <groupId>com.groupon.dse</groupId>
    <artifactId>spark-metrics</artifactId>
    <version>2.0.0</version>
</dependency>

Note that spark-metrics targets Scala 2.11 by default, as that is the default Scala version supported by Spark 2.x. To support Scala 2.10 on the Spark 2.x releases, this library will need to be recompiled with the Spark dependencies that target Scala 2.10.
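
If you build with sbt instead of Maven, the equivalent dependency should be the following (a sketch; note the single % rather than %%, since the artifactId carries no Scala-version suffix):

libraryDependencies += "com.groupon.dse" % "spark-metrics" % "2.0.0"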

Spark 1.x

To use this library with the Spark 1.x line of releases, add a dependency on spark-metrics_spark-1.x instead:

<dependency>
    <groupId>com.groupon.dse</groupId>
    <artifactId>spark-metrics_spark-1.x</artifactId>
    <version>2.0.0</version>
</dependency>

The default Spark version targeted for spark-metrics_spark-1.x is Spark 1.6.3, but it is compatible with 1.5 and 1.4 as well. Note that spark-metrics_spark-1.x targets Scala 2.10 by default, as that is the default Scala version supported by Spark 1.x. To support Scala 2.11 on the Spark 1.x releases, this library will need to be recompiled with the Spark dependencies that target Scala 2.11.

Updates to spark-metrics will be backported to spark-metrics_spark-1.x whenever possible, but support for spark-metrics_spark-1.x will be discontinued at some point in the future.

Usage

Include this import in your main Spark application file:

import org.apache.spark.groupon.metrics.UserMetricsSystem

In the Spark driver, add the following call to UserMetricsSystem.initialize() right after the application's SparkContext is instantiated:

val sparkContext = new SparkContext()
UserMetricsSystem.initialize(sparkContext, "MyMetricNamespace")

(Technically, the call does not have to come immediately after the SparkContext is created, as long as initialize is called before any SparkMetric instances are created. Invoking it here, however, helps prevent initialization issues from occurring, so it is highly recommended.)

After this, you can create SparkMetric instances that report to Spark's metrics servlet anywhere in your application. These instances must be declared lazy for this library to work properly.

lazy val counter: SparkCounter = UserMetricsSystem.counter("MyCounter")
lazy val gauge: SparkGauge = UserMetricsSystem.gauge("MyGauge")
lazy val histogram: SparkHistogram = UserMetricsSystem.histogram("MyHistogram")
lazy val meter: SparkMeter = UserMetricsSystem.meter("MyMeter")
lazy val timer: SparkTimer = UserMetricsSystem.timer("MyTimer")
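
Because these declarations are lazy, they can safely be referenced from closures that run on executors: each JVM instantiates its metric on first use, after initialize has set up the driver's MetricsReceiver. A minimal driver sketch (CounterExample and RecordsProcessed are illustrative names; SparkCounter.inc() is assumed to mirror Dropwizard's Counter API):

import org.apache.spark.SparkContext
import org.apache.spark.groupon.metrics.{SparkCounter, UserMetricsSystem}

object CounterExample {
  // lazy: instantiated on each executor only when first used
  lazy val recordCounter: SparkCounter = UserMetricsSystem.counter("RecordsProcessed")

  def main(args: Array[String]): Unit = {
    val sc = new SparkContext()
    UserMetricsSystem.initialize(sc, "MyMetricNamespace")

    // Each call sends a value to the driver's MetricsReceiver,
    // where counts from all executors are aggregated
    sc.parallelize(1 to 1000).foreach(_ => recordCounter.inc())
  }
}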

The metricName parameter is the only identifier for metrics, so different metric types cannot share a name (e.g. a Counter and a Histogram cannot both use the same metricName).

The APIs for these are kept as close as possible to Dropwizard's APIs, though they do not extend a common interface at the language level. The only significant difference is in the SparkGauge class: whereas Dropwizard's Gauge is created by supplying a function that computes the Gauge's value, this version simply has a set() method to set its value.
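
To make the difference concrete, here is a sketch contrasting the two styles (the Dropwizard side uses com.codahale.metrics.Gauge; FreeMemory is an illustrative metric name):

import com.codahale.metrics.Gauge
import org.apache.spark.groupon.metrics.{SparkGauge, UserMetricsSystem}

// Dropwizard: the gauge wraps a function that is polled for its value
val dropwizardGauge = new Gauge[Long] {
  override def getValue: Long = Runtime.getRuntime.freeMemory
}

// spark-metrics: the application pushes the current value via set()
lazy val freeMemoryGauge: SparkGauge = UserMetricsSystem.gauge("FreeMemory")
freeMemoryGauge.set(Runtime.getRuntime.freeMemory)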

Viewing and Publishing Metrics

This library is integrated with Spark's built-in metrics servlet. This means that metrics collected using this library are visible at the /metrics/json/ endpoint. Here, any of the metrics from this library will be published with the key <appName>.<metricNamespace>.<metricName>.

These metric JSONs look something like this:

application_1454970304040_0030.driver.MyAppName.MyMetricNamespace.MyTimer: {
    count: 4748,
    max: 21683.55228,
    mean: 662.7780978119895,
    min: 434.211779,
    p50: 622.795788,
    p75: 672.358402,
    p95: 1146.214833,
    p98: 1146.214833,
    p99: 1146.214833,
    p999: 1572.286154,
    stddev: 163.37016417936547,
    m15_rate: 0.06116903443100036,
    m1_rate: 0.019056182723172856,
    m5_rate: 0.051904011476711955,
    mean_rate: 0.06656686539563786,
    duration_units: "milliseconds",
    rate_units: "calls/second"
}

Other methods of publishing these metrics are also possible by configuring Spark. Any of the sinks listed in Spark's monitoring documentation will also report the metrics collected by this library, as long as the driver instance is enabled. Spark's conf/metrics.properties.template contains a sample metrics configuration.
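
For instance, a configuration that adds a CSV sink for the driver (a sketch modeled on Spark's metrics.properties.template; the output directory is hypothetical) could look like:

# Write driver metrics, including those from this library, to CSV every 10s
driver.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
driver.sink.csv.period=10
driver.sink.csv.unit=seconds
driver.sink.csv.directory=/tmp/spark-metrics-csv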

How It Works

This library is implemented using a combination of Spark's internal RPC APIs and the Dropwizard APIs. The Dropwizard APIs are used on the driver to aggregate metrics that get collected across different executors and the driver. Spark itself uses the Dropwizard library for some of its own metrics, so this library integrates with Spark's existing metrics system to report user metrics alongside Spark's built-in metrics.

When a SparkMetric instance is created in an executor or the driver, it sets up a connection to the MetricsReceiver on the driver, which is created by the call to UserMetricsSystem.initialize. Whenever a metric is collected (e.g. by calling meter.mark(), gauge.set(x), etc.), that value is sent to the MetricsReceiver, which uses it to update the corresponding stateful Dropwizard Metric instance. A SparkMetric instance is stateless: it stores no values itself; its only job is to send values to the MetricsReceiver. A metric is uniquely identified by its name, so, for example, all values sent by instances of a SparkHistogram named MyHistogram get aggregated on the MetricsReceiver in a single Dropwizard Histogram corresponding to MyHistogram.

Metrics are sent to the MetricsReceiver using a MetricMessage, which contains the actual metric value along with metadata about that metric. The metadata determines how the corresponding Dropwizard metric will be instantiated in the MetricsReceiver. For example, a MetricMessage for a SparkHistogram contains not only the metric value and name, but also the Dropwizard Reservoir class that determines the histogram's windowing behavior. This metadata allows metrics to be created dynamically at runtime, rather than having to define them all beforehand. It can, for example, enable the creation of a Meter named after an Exception, where the concrete Exception type need not be known ahead of time:

UserMetricsSystem.meter(s"exceptionRate.${exception.getClass.getSimpleName}")
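
A sketch of how this might look in application code, marking a dynamically named meter whenever a failure is caught (processRecord and record are hypothetical):

try {
  processRecord(record)
} catch {
  case e: Exception =>
    // Creates (or reuses) a meter keyed by the exception's class name
    UserMetricsSystem.meter(s"exceptionRate.${e.getClass.getSimpleName}").mark()
    throw e
}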

Troubleshooting

  • A NotInitializedException is thrown:

    The most likely reason is that UserMetricsSystem.initialize was not called on the driver before a SparkMetric instance was created. A SparkMetric instance needs to connect to the MetricsReceiver when instantiated, so if initialize was never invoked, there is no MetricsReceiver to connect to. Another likely reason is that the SparkMetric instance was not declared lazy. Even if initialize was called on the driver, there is no guarantee that a SparkMetric instance will be instantiated on a remote JVM only after the MetricsReceiver is set up; the only way to get that guarantee is to delay instantiating the SparkMetric until the application actually uses it, which is why these declarations must be lazy. This isn't the most user-friendly API, so future work will aim to remove the need for lazy declarations.

  • A SparkContextNotFoundException is thrown:

    This can happen if UserMetricsSystem.initialize is called before a SparkContext exists. This error can also happen if a SparkMetric instance isn't declared lazy and is instantiated as a field on the driver singleton object. A broken example:

    object OffsetMigrationTool {
      // Broken: this field is initialized eagerly when the object is loaded,
      // before UserMetricsSystem.initialize has run in main()
      val myHistogram = UserMetricsSystem.histogram("MyHistogram")

      def main(args: Array[String]) {
        val sc = new SparkContext()
        UserMetricsSystem.initialize(sc)
        // Rest of driver code...
      }
    }

    Instead, myHistogram needs to be declared lazy:

    lazy val myHistogram = UserMetricsSystem.histogram("MyHistogram")
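
    Putting it together, the corrected example reads:

    object OffsetMigrationTool {
      // lazy: not instantiated until first used, by which point
      // UserMetricsSystem.initialize has already run in main()
      lazy val myHistogram = UserMetricsSystem.histogram("MyHistogram")

      def main(args: Array[String]) {
        val sc = new SparkContext()
        UserMetricsSystem.initialize(sc)
        // Rest of driver code...
      }
    }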
