Sherlock: Anomaly Detector

Introduction to Sherlock

Sherlock is an anomaly detection service built on top of Druid. It leverages EGADS (Extensible Generic Anomaly Detection System) to detect anomalies in time-series data. Users can schedule jobs on an hourly, daily, weekly, or monthly basis, view anomaly reports from Sherlock's interface, or receive them via email.

Components

  1. Timeseries Generation
  2. EGADS Anomaly Detection
  3. Redis database
  4. UI in Spark Java

Detailed Description

Timeseries Generation

Timeseries generation is the first phase of Sherlock's anomaly detection. The user inputs a full Druid JSON query with a metric name and group-by dimensions. Sherlock validates the query, adjusts the time interval and granularity based on the EGADS config, and makes a call to Druid. Druid responds with an array of time-series, which are parsed into EGADS time-series.

Sample Druid Query:

{
  "metric": "metric(metric1/metric2)", 
  "aggregations": [
    {
      "filter": {
        "fields": [
          {
            "type": "selector", 
            "dimension": "dim1", 
            "value": "value1"
          }
        ], 
        "type": "or"
      }, 
      "aggregator": {
        "fieldName": "metric2", 
        "type": "longSum", 
        "name": "metric2"
      }, 
      "type": "filtered"
    }
  ], 
  "dimension": "groupByDimension", 
  "intervals": "2017-09-10T00:00:01+00:00/2017-10-12T00:00:01+00:00", 
  "dataSource": "source1", 
  "granularity": {
    "timeZone": "UTC", 
    "type": "period", 
    "period": "P1D"
  }, 
  "threshold": 50, 
  "postAggregations": [
    {
      "fields": [
        {
          "fieldName": "metric1", 
          "type": "fieldAccess", 
          "name": "metric1"
        }
      ], 
      "type": "arithmetic", 
      "name": "metric(metric1/metric2)", 
      "fn": "/"
    }
  ], 
  "queryType": "topN"
}

Sample Druid Response:

[ {
  "timestamp" : "2017-10-11T00:00:00.000Z",
  "result" : [ {
    "groupByDimension" : "dim1",
    "metric(metric1/metric2)" : 8,
    "metric1" : 128,
    "metric2" : 16
  }, {
    "groupByDimension" : "dim2",
    "metric(metric1/metric2)" : 4.5,
    "metric1" : 42,
    "metric2" : 9.33
  } ]
}, {
  "timestamp" : "2017-10-12T00:00:00.000Z",
  "result" : [ {
    "groupByDimension" : "dim1",
    "metric(metric1/metric2)" : 9,
    "metric1" : 180,
    "metric2" : 20
  }, {
    "groupByDimension" : "dim2",
    "metric(metric1/metric2)" : 5.5,
    "metric1" : 95,
    "metric2" : 17.27
  } ]
} ]
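
The parsing step described in this section can be sketched roughly as follows. This is a minimal illustration, not Sherlock's actual parser: it assumes the JSON response has already been decoded into plain Java maps and lists, and the names TopNSeriesGrouper, Point, and groupByDimensionValue are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: regroups a decoded Druid topN response into one
// series per dimension value. Not part of Sherlock's code base.
class TopNSeriesGrouper {

    // One (timestamp, value) observation of a series.
    static final class Point {
        final String timestamp;
        final double value;

        Point(String timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    // response: list of {"timestamp": ..., "result": [{dimension: ..., metric: ...}, ...]}
    static Map<String, List<Point>> groupByDimensionValue(
            List<Map<String, Object>> response, String dimension, String metric) {
        Map<String, List<Point>> series = new LinkedHashMap<>();
        for (Map<String, Object> bucket : response) {
            String timestamp = (String) bucket.get("timestamp");
            @SuppressWarnings("unchecked")
            List<Map<String, Object>> rows = (List<Map<String, Object>>) bucket.get("result");
            for (Map<String, Object> row : rows) {
                Object raw = row.get(metric);
                if (!(raw instanceof Number)) {
                    continue; // skip rows without a numeric metric value
                }
                String key = String.valueOf(row.get(dimension));
                double value = ((Number) raw).doubleValue();
                series.computeIfAbsent(key, k -> new ArrayList<>())
                      .add(new Point(timestamp, value));
            }
        }
        return series;
    }
}
```

For the sample response above, grouping on groupByDimension and metric(metric1/metric2) would yield two series, one for dim1 and one for dim2, each with two points.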

EGADS Anomaly Detection

Sherlock calls the user-configured EGADS API for each generated time-series, generates anomaly reports from the response, and stores these reports in a database. Users may also elect to receive anomaly reports by email.

Redis Database

Sherlock uses a Redis backend to store job metadata, generated anomaly reports, and other information, and as a persistent job queue. Keys related to reports have a retention policy: hourly job reports are retained for 14 days, and daily, weekly, and monthly job reports are retained for 1 year.
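
The retention rule can be expressed as a small lookup. The sketch below is only an illustration of that rule: the class name ReportRetention and the frequency labels are hypothetical, and "1 year" is taken as 365 days.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical sketch of the report retention rule; not Sherlock's code.
class ReportRetention {

    // Maps a job frequency label to how long its reports are kept.
    static Duration retentionFor(String frequency) {
        switch (frequency.toLowerCase(Locale.ROOT)) {
            case "hourly":
                return Duration.ofDays(14);
            case "daily":
            case "weekly":
            case "monthly":
                return Duration.ofDays(365); // assumption: "1 year" taken as 365 days
            default:
                throw new IllegalArgumentException("Unknown job frequency: " + frequency);
        }
    }
}
```

The resulting duration, converted with getSeconds(), is the kind of value one would pass as a time-to-live when writing a report key, for example with the Redis EXPIRE command.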

Sherlock UI

Sherlock's user interface is built with Spark Java. The UI enables users to submit instant anomaly analyses, create and launch detection jobs, and view anomalies on a heatmap and on a graph.

Building Sherlock

A Makefile is provided with all build targets.

Building the JAR

make jar

This creates sherlock.jar in the target/ directory.

How to run

Sherlock is run from the command line with config arguments.

java -Dlog4j.configuration=file:${path_to_log4j}/log4j.properties \
      -jar ${path_to_jar}/sherlock.jar \
      --version ${VERSION} \
      --project-name ${PROJECT_NAME} \
      --port ${PORT} \
      --enable-email \
      --failure-email ${FAILURE_EMAIL} \
      --from-mail ${FROM_MAIL} \
      --reply-to ${REPLY_TO} \
      --smtp-host ${SMTP_HOST} \
      --interval-minutes ${INTERVAL_MINUTES} \
      --interval-hours ${INTERVAL_HOURS} \
      --interval-days ${INTERVAL_DAYS} \
      --interval-weeks ${INTERVAL_WEEKS} \
      --interval-months ${INTERVAL_MONTHS} \
      --egads-config-filename ${EGADS_CONFIG_FILENAME} \
      --redis-host ${REDIS_HOSTNAME} \
      --redis-port ${REDIS_PORT} \
      --execution-delay ${EXECUTION_DELAY} \
      --timeseries-completeness ${TIMESERIES_COMPLETENESS}

CLI args usage

args                                  required          default                    description
--help                                -                 false                      help
--config                              -                 null                       config
--version                             -                 v0.0.0                     version
--egads-config-filename               -                 provided                   egads-config-filename
--port                                -                 4080                       port
--interval-minutes                    -                 180                        interval-minutes
--interval-hours                      -                 672                        interval-hours
--interval-days                       -                 28                         interval-days
--interval-weeks                      -                 12                         interval-weeks
--interval-months                     -                 6                          interval-months
--enable-email                        -                 false                      enable-email
--from-mail                           if email enabled                             from-mail
--reply-to                            if email enabled                             reply-to
--smtp-host                           if email enabled                             smtp-host
--smtp-port                           -                 25                         smtp-port
--smtp-user                           -                                            smtp-user
--smtp-password                       -                                            smtp-password
--failure-email                       if email enabled                             failure-email
--execution-delay                     -                 30                         execution-delay
--valid-domains                       -                 null                       valid-domains
--redis-host                          -                 127.0.0.1                  redis-host
--redis-port                          -                 6379                       redis-port
--redis-ssl                           -                 false                      redis-ssl
--redis-timeout                       -                 5000                       redis-timeout
--redis-password                      -                 -                          redis-password
--redis-clustered                     -                 false                      redis-clustered
--project-name                        -                 -                          project-name
--external-file-path                  -                 -                          external-file-path
--debug-mode                          -                 false                      debug-mode
--timeseries-completeness             -                 60                         timeseries-completeness
--http-client-timeout                 -                 20000                      http-client-timeout
--backup-redis-db-path                -                 null                       backup-redis-db-path
--druid-brokers-list-file             -                 null                       druid-brokers-list-file
--truststore-path                     -                 null                       truststore-path
--truststore-type                     -                 jks                        truststore-type
--truststore-password                 -                 null                       truststore-password
--keystore-path                       -                 null                       keystore-path
--keystore-type                       -                 jks                        keystore-type
--keystore-password                   -                 null                       keystore-password
--key-dir                             -                 null                       key-dir
--cert-dir                            -                 null                       cert-dir
--https-hostname-verification         -                 true                       https-hostname-verification
--custom-ssl-context-provider-class   -                 DefaultSslContextProvider  custom-ssl-context-provider-class
--custom-secret-provider-class        -                 DefaultSecretProvider      custom-secret-provider-class
--prophet-url                         -                 127.0.0.1:4080             prophet-url
--prophet-timeout                     -                 120000                     prophet-timeout
--prophet-principal                   -                 prophet-principal          prophet-principal

help

Prints the command-line argument help message.

config

Path to a Sherlock configuration file, where the above configuration may be specified. Config arguments in the file override command-line arguments.

version

Version of sherlock.jar to display on the UI.

egads-config-filename

Path to a custom EGADS configuration file. If none is specified, the default configuration is used.

port

Port on which to host the Spark application.

interval-minutes

Number of historic data points to use for detection on minutely time-series.

interval-hours

Number of historic data points to use for detection on hourly time-series.

interval-days

Number of historic data points to use for detection on daily time-series.

interval-weeks

Number of historic data points to use for detection on weekly time-series.

interval-months

Number of historic data points to use for detection on monthly time-series.
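
To make the relationship between a point count and a query window concrete, here is a minimal sketch. It assumes the window is simply the point count multiplied by the granularity, ending at a given timestamp; the class LookbackWindow and its methods are hypothetical and are not taken from Sherlock's source.

```java
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper illustrating how a point count maps to a query window.
class LookbackWindow {

    // Start of a window that ends at `end` and spans `points` units.
    static OffsetDateTime windowStart(OffsetDateTime end, ChronoUnit unit, long points) {
        return end.minus(points, unit);
    }

    // ISO-8601 "start/end" interval string.
    static String interval(OffsetDateTime end, ChronoUnit unit, long points) {
        return windowStart(end, unit, points) + "/" + end;
    }

    public static void main(String[] args) {
        OffsetDateTime end = OffsetDateTime.parse("2017-10-12T00:00:00Z");
        System.out.println(interval(end, ChronoUnit.HOURS, 672)); // default --interval-hours
        System.out.println(interval(end, ChronoUnit.MONTHS, 6));  // default --interval-months
    }
}
```

Under this assumption, the default of 672 hourly points covers 28 days of history, and the default of 180 minutely points covers 3 hours.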

enable-email

Enable the email service. This enables users to receive email anomaly report notifications.

from-mail

The handle's FROM email displayed to email recipients.

reply-to

The handle's REPLY TO email where replies will be sent.

smtp-host

The email service's SMTP HOST.

smtp-port

The email service's SMTP PORT. The default value is 25.

smtp-user

The email service's SMTP USER.

smtp-password

The email service's SMTP PASSWORD.

failure-email

A dedicated email which may be set to receive job failure notifications.

execution-delay

Sherlock periodically pings Redis to check scheduled jobs. This sets the ping delay in seconds. Jobs are scheduled with a precision of one minute.

valid-domains

A comma-separated list of valid domains to receive emails, e.g. 'yahoo,gmail,hotmail'. If specified, Sherlock will restrict who may receive emails.

redis-host

The Redis backend hostname.

redis-port

The Redis backend port.

redis-ssl

Whether Sherlock should connect to Redis via SSL.

redis-timeout

The Redis connection timeout.

redis-password

The password to use when authenticating to Redis.

redis-clustered

Whether the Redis backend is a cluster.

project-name

Name of the project to display on the UI.

external-file-path

Specify the path to external files for Spark framework via this argument.

debug-mode

Debug mode enables debug routes, e.g. '/DatabaseJson', which shows Redis data as a JSON dump. See com.yahoo.sherlock.App for more details.

timeseries-completeness

Minimum percentage of data points that must be present for a time-series to be considered valid; Sherlock ignores time-series that fall below it. (default 60, i.e. a fraction of 0.6)
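
As an illustration of this check, the sketch below compares the share of observed points against the configured percentage. The class CompletenessCheck is hypothetical, and how Sherlock counts expected points is an assumption here.

```java
// Hypothetical sketch of a completeness check; not Sherlock's code.
class CompletenessCheck {

    // thresholdPercent is the --timeseries-completeness value (0 to 100).
    static boolean isComplete(int observedPoints, int expectedPoints, int thresholdPercent) {
        if (expectedPoints <= 0) {
            return false; // nothing was expected, so there is nothing to validate
        }
        // Integer comparison of observed / expected >= thresholdPercent / 100.
        return observedPoints * 100L >= (long) thresholdPercent * expectedPoints;
    }
}
```

With the default of 60 and a 28-point daily window, a series with 17 points would pass under this sketch and one with 16 points would be ignored.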

http-client-timeout

Timeout for the HttpClient, in milliseconds. (default 20000)

backup-redis-db-path

Backs up the Redis DB to the given file path as a JSON dump of indices and objects. The backup runs daily at midnight. The default is null, i.e. no backup. However, the BGSAVE command is still run at midnight to save a local Redis dump.

druid-brokers-list-file

Path to an access control list file of Druid broker hosts permitted for querying. Format: <host1>:<port>,<host2>:<port>... (default null, i.e. any host is allowed)
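
A check against such a list might look like the following sketch. It assumes exact, case-insensitive host:port matching; the class BrokerAccessList and its methods are hypothetical and do not reflect Sherlock's actual implementation.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of a broker allow-list check; not Sherlock's code.
class BrokerAccessList {

    // Parses "<host1>:<port>,<host2>:<port>..." into a set of host:port entries.
    static Set<String> parse(String contents) {
        Set<String> allowed = new HashSet<>();
        for (String entry : contents.split(",")) {
            String trimmed = entry.trim().toLowerCase(Locale.ROOT);
            if (!trimmed.isEmpty()) {
                allowed.add(trimmed);
            }
        }
        return allowed;
    }

    // A null list models the default (no file configured): any host is allowed.
    static boolean isAllowed(Set<String> allowed, String host, int port) {
        if (allowed == null) {
            return true;
        }
        return allowed.contains(host.toLowerCase(Locale.ROOT) + ":" + port);
    }
}
```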

truststore-path

Path to specify truststore location for mTLS connections. (default null)

truststore-type

Param to specify truststore type for mTLS connections. (default jks)

truststore-password

Param to specify truststore password for mTLS connections. (default null)

keystore-path

Path to specify keystore location for mTLS connections. (default null)

keystore-type

Param to specify keystore type for mTLS connections. (default jks)

keystore-password

Param to specify keystore password for mTLS connections. (default null)

key-dir

Directory containing multiple keys (for different clusters) for mTLS connections. (default null) This is used when the Principal Name is given in the Druid cluster form: Sherlock looks for a file name containing the Principal Name under this directory. If --key-dir and --cert-dir have the same value, the file name must also contain the identifier key for the private key file and cert for the public key file.

cert-dir

Directory containing multiple certs (for different clusters) for mTLS connections. (default null) This is used when the Principal Name is given in the Druid cluster form: Sherlock looks for a file name containing the Principal Name under this directory. If --key-dir and --cert-dir have the same value, the file name must also contain the identifier key for the private key file and cert for the public key file.
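
The file lookup described for key-dir and cert-dir can be sketched as a substring match over file names. This is an assumption-laden illustration: CredentialFileFinder is a hypothetical class, it works on a list of names rather than a real directory, and the file names in the usage note are made up.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the principal-based file lookup; not Sherlock's code.
class CredentialFileFinder {

    // Returns the first file name containing the principal and, when the key
    // and cert directories are shared, the identifier ("key" or "cert").
    // Pass a null identifier when the directories are separate.
    static Optional<String> find(List<String> fileNames, String principal, String identifier) {
        for (String name : fileNames) {
            boolean matchesPrincipal = name.contains(principal);
            boolean matchesIdentifier = identifier == null || name.contains(identifier);
            if (matchesPrincipal && matchesIdentifier) {
                return Optional.of(name);
            }
        }
        return Optional.empty();
    }
}
```

For a shared directory holding cluster1.key.pem and cluster1.cert.pem, looking up principal cluster1 with identifier cert would select cluster1.cert.pem in this sketch.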

https-hostname-verification

Param to enable/disable https hostname verification for mTLS connections. (default true i.e. hostname verification enabled)

custom-ssl-context-provider-class

Param to specify custom ssl context provider class for mTLS connections. (default com.yahoo.sherlock.utils.DefaultSslContextProvider which returns SSLContext with validation)

custom-secret-provider-class

Param to specify custom secret provider class for passwords. (default com.yahoo.sherlock.utils.DefaultSecretProvider which returns secrets specified from CLISettings)

prophet-url

API endpoint of a running Prophet Service. (default 127.0.0.1:4080, which includes both the host and the port)

prophet-timeout

Timeout for querying the Prophet Service. (default 120000 milliseconds)

prophet-principal

The Kubernetes principal of the Prophet Service. (default prophet-principal)

Getting started

Java 8 and Maven 3.3 are suggested for developing Sherlock.

Further Development

Adding a new anomaly detector to Sherlock

Sherlock currently supports two detector pipelines, EGADS and Prophet, and both use the EGADS anomaly detection module to detect anomalies. The EGADS pipeline performs both time-series forecasting and anomaly detection through the EGADS library. The Prophet pipeline instead queries forecasted time series from a Prophet web service and then runs anomaly detection on them with the EGADS anomaly detection module. To add a new anomaly detector, look at the abstract class service/DetectorAPIService.java and implement a new detector class that extends DetectorAPIService. Specifically, implement the abstract methods detectAnomaliesAndForecast and detectAnomalies, which are described in the sections below.
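
The shape of such an extension is sketched below with a deliberately simplified stand-in. DetectorApiSketch is not the real DetectorAPIService: the actual class works with EGADS TimeSeries and Anomaly objects and its method signatures differ. The toy MeanDeviationDetector is likewise hypothetical and only shows how the two methods relate.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for service/DetectorAPIService.java. The real class
// has different signatures and works with EGADS TimeSeries/Anomaly objects.
abstract class DetectorApiSketch {

    // Result of a detection run: input, expected values, anomaly indices.
    static final class Result {
        final double[] original;
        final double[] expected;
        final List<Integer> anomalyIndices;

        Result(double[] original, double[] expected, List<Integer> anomalyIndices) {
            this.original = original;
            this.expected = expected;
            this.anomalyIndices = anomalyIndices;
        }
    }

    // Returns the series, the expected series, and the anomaly points.
    abstract Result detectAnomaliesAndForecast(double[] series);

    // Returns only the anomaly points.
    abstract List<Integer> detectAnomalies(double[] series);
}

// Toy detector: the "forecast" is the series mean, and a point is anomalous
// when it deviates from the mean by more than a fixed threshold.
class MeanDeviationDetector extends DetectorApiSketch {
    private final double threshold;

    MeanDeviationDetector(double threshold) {
        this.threshold = threshold;
    }

    @Override
    Result detectAnomaliesAndForecast(double[] series) {
        double sum = 0.0;
        for (double v : series) {
            sum += v;
        }
        double mean = series.length == 0 ? 0.0 : sum / series.length;

        double[] expected = new double[series.length];
        List<Integer> anomalies = new ArrayList<>();
        for (int i = 0; i < series.length; i++) {
            expected[i] = mean;
            if (Math.abs(series[i] - mean) > threshold) {
                anomalies.add(i);
            }
        }
        return new Result(series, expected, anomalies);
    }

    @Override
    List<Integer> detectAnomalies(double[] series) {
        return detectAnomaliesAndForecast(series).anomalyIndices;
    }
}
```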

Developing the instant detection feature

Sherlock lets the user run an instant anomaly detection through the /Flash-Query endpoint. The endpoint is linked to the method processInstantAnomalyJob in Routes.java, which calls the method detectWithResults in DetectorService.java. detectWithResults checks which detector the user wants to use, assigns the corresponding DetectorAPIService instance, and calls that instance's detectAnomaliesAndForecast method. detectAnomaliesAndForecast performs anomaly detection and returns the original time series, the expected time series, and the anomaly points. The combined results are displayed via the /Flash-Query/ProcessAnomalyReport endpoint.

Developing the Job Scheduling feature

Sherlock lets the user schedule anomaly detection jobs that run routinely. For scheduling, Sherlock uses JobScheduler.java to maintain a priority queue stored in Redis. Every time the user adds a job, Sherlock puts the job into the queue via the method scheduleJob, with the job's next run time as the priority. Sherlock keeps checking the current system time and pops the priority queue as required via the method consumeAndExecuteTasks. For the actual detection, consumeAndExecuteTasks executes a job that is due, which eventually reaches the method runDetection in DetectorService.java. runDetection checks which detector the user wants to use, assigns the corresponding DetectorAPIService instance, and calls that instance's detectAnomalies method. detectAnomalies performs anomaly detection and returns only the anomaly points, because job reports display only detected anomaly points.
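
The queue behaviour described here can be sketched with an in-memory priority queue. This is a stand-in for illustration only: Sherlock keeps its queue in Redis, and the class InMemoryJobQueue, its method consumeDueJobs, and the use of epoch minutes as the priority are assumptions of this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// In-memory stand-in for the Redis-backed job queue; not Sherlock's code.
class InMemoryJobQueue {

    static final class ScheduledJob {
        final String jobId;
        final long nextRunEpochMinute;

        ScheduledJob(String jobId, long nextRunEpochMinute) {
            this.jobId = jobId;
            this.nextRunEpochMinute = nextRunEpochMinute;
        }
    }

    // Jobs are ordered by next run time, earliest first.
    private final PriorityQueue<ScheduledJob> queue = new PriorityQueue<>(
            Comparator.comparingLong((ScheduledJob job) -> job.nextRunEpochMinute));

    // Adds a job with its next run time as the priority.
    void scheduleJob(String jobId, long nextRunEpochMinute) {
        queue.add(new ScheduledJob(jobId, nextRunEpochMinute));
    }

    // Pops every job whose run time is at or before the given time.
    List<String> consumeDueJobs(long nowEpochMinute) {
        List<String> due = new ArrayList<>();
        while (!queue.isEmpty() && queue.peek().nextRunEpochMinute <= nowEpochMinute) {
            due.add(queue.poll().jobId);
        }
        return due;
    }
}
```

A polling loop would call consumeDueJobs with the current time at a fixed delay, run detection for each returned job, and then reschedule the job with its next run time.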

Understanding TimeSeries/Anomaly format used in Sherlock

All current pipelines rely heavily on the TimeSeries and Anomaly classes defined in EGADS. To gain a better understanding of those formats, developers should read TimeSeries.java and Anomaly.java in the EGADS repository.

Committers

Jigar Patel

Jeff Niu

Contributors

Josh Walters

Stephan Stiefel, Stephan3555

Han Xu, hanxu12

License

Code licensed under the GPL v3 License. See LICENSE file for terms.
