  • Language
    Python
  • License
    Apache License 2.0
  • Created over 4 years ago
  • Updated 9 months ago

Repository Details

Luminaire is a Python package that provides ML-driven solutions for monitoring time series data.

Luminaire

A hands-off Anomaly Detection Library

What is Luminaire

Luminaire is a Python package that provides ML-driven solutions for monitoring time series data. Luminaire provides several anomaly detection and forecasting capabilities that incorporate correlational and seasonal patterns as well as uncontrollable variations in the data over time.

Quick Start

Install Luminaire from PyPI using pip

pip install luminaire

Import the luminaire module in Python

import luminaire
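
To confirm that the installation succeeded before going further, the standard library can report the installed version. This is a minimal sketch; the helper name `installed_version` is hypothetical and is not part of Luminaire.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("luminaire"))
```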

See Examples to get started. Also, refer to the Luminaire documentation for a detailed description of methods and usage.

Time Series Outlier Detection Workflow

[Figure: Luminaire Flow]

The Luminaire outlier detection workflow can be divided into three major components:

Data Preprocessing and Profiling Component

This component can be called to prepare a time series prior to training an anomaly detection model on it. This step applies a number of methods that make anomaly detection more accurate and reliable, including missing data imputation, identifying and removing recent outliers from training data, necessary mathematical transformations, and data truncation based on recent change points. It also generates profiling information (historical change points, trend changes, etc.) that is considered in the training process.
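
To make two of these preparation steps concrete, here is a minimal standard-library sketch of missing data imputation and a mathematical transformation. It is not Luminaire's implementation: linear interpolation and the log(1 + x) transform are assumptions chosen for illustration, and the function names `impute_linear` and `log_transform` are hypothetical.

```python
import math


def impute_linear(values):
    """Fill None gaps by linear interpolation; gaps at the edges copy the nearest known value."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series has no observed values")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


def log_transform(values):
    """Apply log(1 + x), a common variance-stabilizing transform for non-negative data."""
    return [math.log1p(v) for v in values]


series = [10.0, None, 14.0, None, None, 20.0]
print(impute_linear(series))
print(log_transform(impute_linear(series)))
```

In practice the preprocessing component handles these steps, so a sketch like this is only useful for building intuition about what happens to the data before training.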

Profiling information for time series data can be used to monitor data drift and irregular long-term swings.
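
As a rough illustration of drift monitoring, the sketch below compares the mean of the most recent points with the mean of the earlier history, in units of the history's standard deviation. This is a simplified stand-in under stated assumptions, not the change point logic Luminaire uses; the name `mean_shift_score` is hypothetical.

```python
from statistics import mean, stdev


def mean_shift_score(values, window):
    """Distance between the mean of the last `window` points and the mean of the
    earlier points, expressed in standard deviations of the earlier points."""
    if window < 1 or len(values) - window < 2:
        raise ValueError("need at least two reference points before the recent window")
    reference, recent = values[:-window], values[-window:]
    spread = stdev(reference)
    if spread == 0:
        return 0.0 if mean(recent) == mean(reference) else float("inf")
    return abs(mean(recent) - mean(reference)) / spread


stable = [10, 11, 9, 10, 11, 9, 10, 11, 9, 10]
shifted = stable[:6] + [20, 21, 19, 20]   # the last four points move to a new level
print(mean_shift_score(stable, 4), mean_shift_score(shifted, 4))
```

A large score suggests that the recent level has drifted away from the historical one.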

Modeling Component

This component performs time series model training based on either the user-specified configuration or an optimized configuration (see Luminaire hyperparameter optimization). Luminaire model training is integrated with different structural time series models as well as filtering-based models. See Luminaire outlier detection for more information.
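
For intuition about what a filtering-based detector does, the following sketch tracks the series with an exponentially weighted filter and flags points whose one-step-ahead error is large relative to past errors. It is a generic illustration, not one of Luminaire's models; the function name and the parameters `alpha`, `threshold`, and `warmup` are assumptions.

```python
import math


def ewma_anomaly_flags(values, alpha=0.3, threshold=3.0, warmup=5):
    """Flag points whose one-step-ahead forecast error exceeds `threshold` times
    the root-mean-square of the errors seen so far."""
    level = values[0]
    errors = []
    flags = [False]
    for value in values[1:]:
        error = value - level          # the forecast for this step is the current level
        is_anomaly = False
        if len(errors) >= warmup:
            scale = math.sqrt(sum(e * e for e in errors) / len(errors))
            is_anomaly = scale > 0 and abs(error) > threshold * scale
        flags.append(is_anomaly)
        if not is_anomaly:
            # Only normal points update the filter, so an outlier does not drag the level.
            level += alpha * error
            errors.append(error)
    return flags


observations = [10, 10.5, 9.5, 10.2, 9.8, 10.1, 9.9, 10.3, 30.0, 10.0]
print(ewma_anomaly_flags(observations))
```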

The Luminaire modeling step can be called after the data preprocessing and profiling step, so that the necessary data preparation is done before training.

Configuration Optimization Component

Luminaire's integration with configuration optimization enables a hands-off anomaly detection process where the user needs to provide only minimal configuration for monitoring any type of time series data. This step can be combined with the preprocessing and modeling steps for any auto-configured anomaly detection use case. See fully automatic outlier detection for a detailed walkthrough.
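
The idea behind configuration optimization can be sketched with a toy search: try several candidate settings and keep the one with the lowest forecast error on the data. The example below uses a plain grid search over a single smoothing factor, which is an assumption made for brevity and is not how Luminaire searches its configuration space; `one_step_mse` and `best_alpha` are hypothetical names.

```python
def one_step_mse(values, alpha):
    """Mean squared one-step-ahead error of simple exponential smoothing."""
    level = values[0]
    total = 0.0
    for value in values[1:]:
        error = value - level
        total += error * error
        level += alpha * error
    return total / (len(values) - 1)


def best_alpha(values, candidates=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Return the candidate smoothing factor with the lowest one-step-ahead error."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    return min(candidates, key=lambda alpha: one_step_mse(values, alpha))


trending = [float(i) for i in range(20)]                           # steady upward trend
noisy = [10.0, 12.0, 8.0, 11.0, 9.0, 12.0, 8.0, 10.0, 11.0, 9.0]   # noise around a constant level
print(best_alpha(trending), best_alpha(noisy))
```

The trending series favors a fast-reacting filter while the noisy series favors a slow one, which is why a single hand-picked configuration rarely suits every time series.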

Anomaly Detection for High Frequency Time Series

Luminaire can also monitor a set of data points over windows of time instead of tracking individual data points. This approach is well-suited for streaming use cases where sustained fluctuations are of greater concern than individual fluctuations. See anomaly detection for streaming data for detailed information.
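
To illustrate scoring a window rather than a single point, the sketch below compares the histogram of a scoring window with the histogram of baseline data using the Kullback-Leibler divergence. The bin count, the add-one smoothing, and the function names are assumptions for illustration; this is not the WindowDensityModel implementation.

```python
import math


def histogram(values, edges):
    """Return smoothed bin probabilities for `values` over the bins defined by `edges`."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # Counting the interior edges at or below v gives the bin index and
        # clamps out-of-range values into the first or last bin.
        idx = sum(1 for e in edges[1:-1] if v >= e)
        counts[idx] += 1
    total = len(values) + len(counts)            # add-one smoothing avoids log(0)
    return [(c + 1) / total for c in counts]


def window_divergence(baseline, window, bins=5):
    """KL divergence of the window's histogram from the baseline's histogram."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        hi = lo + 1.0
    width = (hi - lo) / bins
    edges = [lo + i * width for i in range(bins + 1)]
    p = histogram(window, edges)
    q = histogram(baseline, edges)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


baseline = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10] * 3
similar_window = [10, 9, 11, 10, 12, 8]
shifted_window = [14, 15, 13, 14, 15, 16]
print(window_divergence(baseline, similar_window), window_divergence(baseline, shifted_window))
```

A window drawn from the same distribution as the baseline yields a small divergence, while a sustained shift yields a much larger one even if no single point looks extreme on its own.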

Examples

Batch Time Series Monitoring

import pandas as pd
from luminaire.optimization.hyperparameter_optimization import HyperparameterOptimization
from luminaire.exploration.data_exploration import DataExploration

data = pd.read_csv('Path to input time series data')
# Input data should have a time column set as the index of the dataframe and a value column named 'raw'

# Optimization
hopt_obj = HyperparameterOptimization(freq='D')
opt_config = hopt_obj.run(data=data)

# Profiling
de_obj = DataExploration(freq='D', **opt_config)
training_data, pre_prc = de_obj.profile(data)

# Identify Model
model_class_name = opt_config['LuminaireModel']
module = __import__('luminaire.model', fromlist=[''])
model_class = getattr(module, model_class_name)

# Training
model_object = model_class(hyper_params=opt_config, freq='D')
success, model_date, trained_model = model_object.train(data=training_data, **pre_prc)

# Scoring: the observed value 100 for the date 2021-01-01
trained_model.score(100, '2021-01-01')
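
The batch example expects a dataframe indexed by time with a value column named 'raw'. The sketch below shows one way to get a CSV into that shape with pandas. The inline CSV content and its column names `date` and `value` are hypothetical; adjust them to match the actual file.

```python
import io

import pandas as pd

# Hypothetical CSV content; the column names 'date' and 'value' are assumptions for illustration.
csv_text = """date,value
2021-01-01,100
2021-01-02,104
2021-01-03,98
"""

# Use the time column as a parsed datetime index.
data = pd.read_csv(io.StringIO(csv_text), index_col="date", parse_dates=True)

# Rename the value column to 'raw', the name the examples in this README expect.
data = data.rename(columns={"value": "raw"})
print(data)
```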

Streaming Time Series Monitoring

import pandas as pd
from luminaire.model.window_density import WindowDensityHyperParams, WindowDensityModel
from luminaire.exploration.data_exploration import DataExploration

data = pd.read_csv('Path to input time series data')
# Input data should have a time column set as the index of the dataframe and a value column named 'raw'

# Configuration Specs and Profiling
config = WindowDensityHyperParams().params
de_obj = DataExploration(**config)
data, pre_prc = de_obj.stream_profile(df=data)
config.update(pre_prc)

# Training
wdm_obj = WindowDensityModel(hyper_params=config)
success, training_end, model = wdm_obj.train(data=data)

# Scoring
score, scored_window = model.score(scoring_data)    # scoring_data is data over a time-window instead of a datapoint

Contributing

Want to help improve Luminaire? Check out our contributing documentation.

Citing

Please cite the following article if Luminaire is used for any research purpose or scientific publication:

Chakraborty, S., Shah, S., Soltani, K., Swigart, A., Yang, L., & Buckingham, K. (2020, December). Building an Automated and Self-Aware Anomaly Detection System. In 2020 IEEE International Conference on Big Data (Big Data) (pp. 1465-1475). IEEE. (arxiv link)

Other Useful Resources

  • Chakraborty, S., Shah, S., Soltani, K., & Swigart, A. (2019, December). Root Cause Detection Among Anomalous Time Series Using Temporal State Alignment. In 2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA) (pp. 523-528). IEEE. (arxiv link)

Development Team

Luminaire is developed and maintained by Sayan Chakraborty, Smit Shah, Kiumars Soltani, Luyao Yang, Anna Swigart, Kyle Buckingham and many other contributors from the Zillow Group A.I. team.
