  • Stars: 292
  • Rank: 142,152 (Top 3%)
  • Language: Go
  • License: Apache License 2.0
  • Created over 10 years ago
  • Updated about 9 years ago


Repository Details

Probabilistic anomaly detection for time series data

Anomalyzer


Probabilistic anomaly detection in Go.

Featured on Hacker News August 13th, 2015 and the Lytics Developer Blog.

Windows

Inspired by Etsy's Skyline package, Anomalyzer implements a suite of statistical tests that yield the probability that a given set of numeric inputs, typically a time series, contains anomalous behavior. Each test compares the behavior in an active window of one or more points to the behavior in a reference window of two or more points.

Specifying a number of seasons will yield a reference window length equal to that factor times the length of the active window specified. For example, an input vector of [1, 2, 3, 4, 5, 6, 7, 8, 9], and an active window length of 1 with number of seasons equal to 4, would yield an active window of [9] and a reference window of [5, 6, 7, 8].
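
The window arithmetic above can be sketched in a few lines of Go. This is an illustration only: splitWindows is a hypothetical helper, not a function of the anomalyzer package, and the choice to fall back to whatever history exists when the input is too short is an assumption.

```go
package main

import "fmt"

// splitWindows is a hypothetical helper, not part of the anomalyzer API.
// It returns the trailing activeSize points as the active window and the
// activeSize*nSeasons points immediately before them as the reference window.
func splitWindows(data []float64, activeSize, nSeasons int) (active, reference []float64) {
	if activeSize < 1 || nSeasons < 1 || len(data) < activeSize {
		return nil, nil
	}
	activeStart := len(data) - activeSize
	refStart := activeStart - activeSize*nSeasons
	if refStart < 0 {
		refStart = 0 // assumption: use whatever history is available
	}
	return data[activeStart:], data[refStart:activeStart]
}

func main() {
	data := []float64{1, 2, 3, 4, 5, 6, 7, 8, 9}
	active, reference := splitWindows(data, 1, 4)
	fmt.Println(active, reference) // [9] [5 6 7 8]
}
```

With an active window length of 1 and 4 seasons, the reference window holds 1 × 4 = 4 points, matching the example in the text.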

Algorithms

Anomalyzer can run one or more of the following algorithmic tests:

  1. cdf: Compares the differences in the behavior in the active window to the cumulative distribution function of the reference window.
  2. diff: Performs a bootstrap permutation test on the ranks of the differences in both windows, in the flavor of a Wilcoxon rank-sum test.
  3. high rank: Performs a bootstrap permutation test on the ranks of the entries themselves in both windows. Counts how many times the permuted rank-sum is less than the original rank-sum, sensitive to increasing behavior.
  4. low rank: Similarly, counts how many times the permuted rank-sum is greater than the original rank-sum, sensitive to decreasing behavior.
  5. magnitude: Compares the relative magnitude of the difference between the averages of the active and the reference windows.
  6. fence: Indicates that data are approaching a configurable upper and lower bound.
  7. bootstrap ks: Calculates the Kolmogorov-Smirnov test over active and reference windows and compares that value to KS test scores obtained after permuting all elements in the set.
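
To make the rank-based tests more concrete, here is a minimal sketch of a permutation test in the spirit of the high rank description: it counts how often a shuffled rank-sum falls below the original one. The function names, the seed parameter, and the tie handling are assumptions made for illustration; the library's actual implementation may differ.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// rankSum returns the sum of the 1-based ranks held by the last nActive
// entries of values within the whole slice (ties are broken by position).
func rankSum(values []float64, nActive int) int {
	idx := make([]int, len(values))
	for i := range idx {
		idx[i] = i
	}
	sort.SliceStable(idx, func(a, b int) bool { return values[idx[a]] < values[idx[b]] })
	sum := 0
	for rank, i := range idx {
		if i >= len(values)-nActive {
			sum += rank + 1
		}
	}
	return sum
}

// highRankScore is a hypothetical sketch, not the library's code. It returns
// the fraction of permutations whose active-window rank-sum is strictly less
// than the rank-sum observed in the original ordering.
func highRankScore(reference, active []float64, permCount int, seed int64) float64 {
	if permCount < 1 || len(active) == 0 {
		return 0
	}
	combined := append(append([]float64{}, reference...), active...)
	original := rankSum(combined, len(active))
	rng := rand.New(rand.NewSource(seed))
	below := 0
	for p := 0; p < permCount; p++ {
		rng.Shuffle(len(combined), func(i, j int) {
			combined[i], combined[j] = combined[j], combined[i]
		})
		if rankSum(combined, len(active)) < original {
			below++
		}
	}
	return float64(below) / float64(permCount)
}

func main() {
	reference := []float64{1, 2, 3, 4}
	fmt.Println("spike:", highRankScore(reference, []float64{100}, 500, 1))
	fmt.Println("dip:  ", highRankScore(reference, []float64{0}, 500, 1))
}
```

A spike in the active window scores high because most shuffles produce a smaller rank-sum, while a dip scores 0 because no shuffle can rank lower than the minimum. A low rank variant would flip the comparison.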

Each test yields a probability of anomalous behavior, and the probabilities are then combined into a weighted mean to determine whether the overall behavior is anomalous. Because a probability is returned, the user controls the sensitivity of the decision and can choose the threshold that suits the application, for example 0.8 for general anomalous behavior or 0.95 for extreme anomalous behavior.
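
As a rough illustration of combining per-test probabilities and applying a threshold, consider the sketch below. The weights shown are hypothetical; the text does not say how the library weights each test.

```go
package main

import "fmt"

// weightedMean combines per-test probabilities into one score.
// The weights are an assumption made for this sketch.
func weightedMean(probs, weights []float64) float64 {
	if len(probs) != len(weights) {
		return 0
	}
	var num, den float64
	for i, p := range probs {
		num += p * weights[i]
		den += weights[i]
	}
	if den == 0 {
		return 0
	}
	return num / den
}

func main() {
	probs := []float64{0.9, 0.7, 0.95} // one probability per test
	weights := []float64{1, 1, 2}      // hypothetical weights
	score := weightedMean(probs, weights)
	fmt.Printf("score=%.3f general=%t extreme=%t\n", score, score >= 0.8, score >= 0.95)
}
```

Here the combined score clears the 0.8 threshold but not the 0.95 one, so an application using the stricter threshold would not flag this point.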

Configuration

Any of the tests can be included in the anomalyzer. If none are supplied in the configuration, it defaults to magnitude and cdf. Methods are supplied through the Methods value in the configuration, which accepts a slice of strings containing the method names.

A value for ActiveSize is required and must be at least 1. NSeasons defaults to 4 if not specified.
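
The defaulting rules just described can be sketched as follows. The conf type and applyDefaults function are hypothetical stand-ins that mirror three field names from the text; they are not the library's own types, and treating a zero NSeasons as "not specified" is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// conf is a hypothetical stand-in mirroring three fields named in the text.
type conf struct {
	ActiveSize int
	NSeasons   int
	Methods    []string
}

// applyDefaults validates ActiveSize and fills in the documented defaults.
func applyDefaults(c *conf) error {
	if c.ActiveSize < 1 {
		return errors.New("ActiveSize must be at least 1")
	}
	if c.NSeasons == 0 {
		c.NSeasons = 4 // documented default
	}
	if len(c.Methods) == 0 {
		c.Methods = []string{"magnitude", "cdf"} // documented default
	}
	return nil
}

func main() {
	c := &conf{ActiveSize: 2}
	if err := applyDefaults(c); err != nil {
		fmt.Println("invalid configuration:", err)
		return
	}
	// reference window length = ActiveSize * NSeasons
	fmt.Println(c.NSeasons, c.Methods, c.ActiveSize*c.NSeasons)
}
```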

Magnitude

If the magnitude test is specified, a Sensitivity (between 0 and 1) can be supplied such that when the result of the magnitude test is less than that value, the weighted mean will return 0. If Sensitivity is not specified, it defaults to 0.1.
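
The gating behavior can be pictured with the sketch below. Both helpers are hypothetical: the relative-magnitude formula (absolute difference of the window means divided by the reference mean) is an assumption, since the text only says the test compares the relative magnitude of the difference between the averages.

```go
package main

import (
	"fmt"
	"math"
)

func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// relativeMagnitude is an assumed formula: |mean(active) - mean(reference)|
// divided by |mean(reference)|. It returns 0 when the reference mean is 0.
func relativeMagnitude(reference, active []float64) float64 {
	ref := mean(reference)
	if ref == 0 {
		return 0
	}
	return math.Abs(mean(active)-ref) / math.Abs(ref)
}

// gateBySensitivity zeroes the combined score when the magnitude result
// falls below the sensitivity, as the text describes.
func gateBySensitivity(magnitude, sensitivity, combined float64) float64 {
	if magnitude < sensitivity {
		return 0
	}
	return combined
}

func main() {
	reference := []float64{2, 2, 2, 2}
	small := relativeMagnitude(reference, []float64{2.1})
	large := relativeMagnitude(reference, []float64{3})
	fmt.Println(gateBySensitivity(small, 0.1, 0.9)) // small change is suppressed
	fmt.Println(gateBySensitivity(large, 0.1, 0.9)) // large change passes through
}
```

The gate keeps small fluctuations from registering as anomalies even when the rank-based tests score them highly.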

Bootstrap KS

To capture seasonality, the bootstrap ks test should consider an active window length equal to a season.
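
For readers unfamiliar with the technique, the following is a minimal sketch of a two-sample Kolmogorov-Smirnov statistic with a permutation comparison. The function names, the seed parameter, and the scoring rule (fraction of permuted statistics below the original) are assumptions for illustration, not the library's implementation.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
	"sort"
)

// ksStatistic returns the largest gap between the empirical CDFs of a and b.
func ksStatistic(a, b []float64) float64 {
	if len(a) == 0 || len(b) == 0 {
		return 0
	}
	x := append([]float64{}, a...) // copy so the caller's order is preserved
	y := append([]float64{}, b...)
	sort.Float64s(x)
	sort.Float64s(y)
	i, j, d := 0, 0, 0.0
	for i < len(x) && j < len(y) {
		v := math.Min(x[i], y[j])
		for i < len(x) && x[i] <= v {
			i++
		}
		for j < len(y) && y[j] <= v {
			j++
		}
		gap := math.Abs(float64(i)/float64(len(x)) - float64(j)/float64(len(y)))
		if gap > d {
			d = gap
		}
	}
	return d
}

// bootstrapKS is a hypothetical sketch: it shuffles the pooled points and
// reports how often the permuted KS statistic is below the original one.
func bootstrapKS(reference, active []float64, permCount int, seed int64) float64 {
	if permCount < 1 || len(reference) == 0 || len(active) == 0 {
		return 0
	}
	original := ksStatistic(reference, active)
	pooled := append(append([]float64{}, reference...), active...)
	rng := rand.New(rand.NewSource(seed))
	below := 0
	for p := 0; p < permCount; p++ {
		rng.Shuffle(len(pooled), func(i, j int) { pooled[i], pooled[j] = pooled[j], pooled[i] })
		if ksStatistic(pooled[:len(reference)], pooled[len(reference):]) < original {
			below++
		}
	}
	return float64(below) / float64(permCount)
}

func main() {
	reference := []float64{1, 2, 3, 2, 1, 2, 3, 2}
	active := []float64{7, 8}
	fmt.Println("KS statistic:", ksStatistic(reference, active))
	fmt.Println("score:", bootstrapKS(reference, active, 500, 1))
}
```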

Fence

The fence test can be configured to use custom UpperBound and LowerBound values for the fences. If no lower bound is desired, set the value of LowerBound to anomalyzer.NA.

Diff & Rank

The diff, bootstrap ks, and rank tests accept a value for the number of bootstrap samples to generate, set through PermCount, which defaults to 500 if not set.

Example

package main

import (
	"fmt"
	"github.com/lytics/anomalyzer"
)

func main() {
	conf := &anomalyzer.AnomalyzerConf{
		Sensitivity: 0.1,
		UpperBound:  5,
		LowerBound:  anomalyzer.NA, // ignore the lower bound
		ActiveSize:  1,
		NSeasons:    4,
		Methods:     []string{"diff", "fence", "highrank", "lowrank", "magnitude"},
	}

	// initialize with empty data or an actual slice of floats
	data := []float64{0.1, 2.05, 1.5, 2.5, 2.6, 2.55}

	anom, err := anomalyzer.NewAnomalyzer(conf, data)
	if err != nil {
		panic(err) // the anomalyzer could not be constructed
	}

	// The Push method automatically triggers a recalculation of the
	// anomaly probability. The recalculation can also be triggered
	// by a call to the Eval method.
	prob := anom.Push(8.0)
	fmt.Println("Anomalous Probability:", prob)
}
