kneed

Knee-point detection in Python

This repository implements the Kneedle algorithm, described in the paper listed under Citation. Given a set of x and y values, kneed returns the knee point of the function, defined as the point of maximum curvature.

Installation

kneed has been tested with Python 3.7, 3.8, 3.9, and 3.10.

anaconda

$ conda install -c conda-forge kneed

pip

$ pip install kneed # To install only knee-detection algorithm
$ pip install kneed[plot] # To also install plotting functions for quick visualizations

Clone from GitHub

$ git clone https://github.com/arvkevi/kneed.git && cd kneed
$ pip install -e .

Usage

These steps introduce how to use kneed by reproducing Figure 2 from the manuscript.

Input Data

The DataGenerator class is only included as a utility to generate sample datasets.

Note: x and y must be arrays of equal length.

from kneed import DataGenerator, KneeLocator

x, y = DataGenerator.figure2()

print([round(i, 3) for i in x])
print([round(i, 3) for i in y])

[0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
[-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]

Find Knee

The knee (or elbow) point is calculated by instantiating the KneeLocator class with x, y, and the appropriate curve and direction.
Here, kneedle.knee and kneedle.elbow both store the point of maximum curvature.

kneedle = KneeLocator(x, y, S=1.0, curve="concave", direction="increasing")

print(round(kneedle.knee, 3))
0.222

print(round(kneedle.elbow, 3))
0.222

The knee point returned is a value along the x axis. The y value at the knee can be identified:

print(round(kneedle.knee_y, 3))
1.897
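Choosing the appropriate curve and direction is left to the user. As a rough aid, the sketch below guesses both by comparing the data to the straight line (chord) joining its first and last points. The helper name guess_shape is hypothetical and not part of kneed; it assumes x is sorted in ascending order and that the data is not too noisy.

```python
def guess_shape(x, y):
    """Guess (curve, direction) for a sequence of points.

    Hypothetical helper for illustration only; it is not part of kneed.
    Assumes x is sorted in ascending order.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be the same length")
    if len(x) < 3:
        raise ValueError("need at least three points")

    # Direction: compare the last y value with the first.
    direction = "increasing" if y[-1] > y[0] else "decreasing"

    # Curve: a concave curve lies above the chord joining its endpoints,
    # a convex curve lies below it.
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    gaps = [
        yi - (y[0] + slope * (xi - x[0]))
        for xi, yi in zip(x[1:-1], y[1:-1])
    ]
    curve = "concave" if sum(gaps) > 0 else "convex"
    return curve, direction


# Rounded values from the Figure 2 example above.
x = [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
y = [-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]

print(guess_shape(x, y))
```

For the Figure 2 data this suggests a concave, increasing curve, which matches the arguments passed to KneeLocator above. Treat the result as a starting point and confirm it against a plot of the data.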

Visualize

The KneeLocator class also has two plotting functions for quick visualizations. Note that all (x, y) values are transformed for the normalized plots.

# Normalized data, normalized knee, and normalized distance curve.
kneedle.plot_knee_normalized()

# Raw data and knee.
kneedle.plot_knee()
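To give a feel for what the normalized plot represents, here is a simplified sketch of the core Kneedle idea for a concave, increasing curve: min-max normalize both axes, subtract x from y to get a difference curve, and take its maximum. This is an illustration under those assumptions, not kneed's implementation; it skips smoothing and the threshold logic controlled by the sensitivity parameter S, and the function names are made up for this example.

```python
def normalize(values):
    """Min-max scale a sequence of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def difference_curve(x, y):
    """Return normalized x and the difference y_n - x_n at each point.

    Only valid as written for concave, increasing data.
    """
    x_n = normalize(x)
    y_n = normalize(y)
    return x_n, [yn - xn for xn, yn in zip(x_n, y_n)]


# Rounded values from the Figure 2 example above.
x = [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
y = [-5.0, 0.263, 1.897, 2.692, 3.163, 3.475, 3.696, 3.861, 3.989, 4.091]

x_n, y_d = difference_curve(x, y)

# The knee candidate is where the difference curve peaks.
knee_index = max(range(len(y_d)), key=y_d.__getitem__)
print(x[knee_index], y[knee_index])  # 0.222 1.897
```

The peak of the difference curve lands on the same point that kneedle.knee and kneedle.knee_y report above.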

Documentation

Documentation of the parameters and a full API reference can be found here.

Interactive

An interactive streamlit app was developed to help users explore the effect of tuning the parameters. There are two sites where you can test out kneed by copy-pasting your own data:

  1. https://share.streamlit.io/arvkevi/ikneed/main/ikneed.py
  2. https://ikneed.herokuapp.com/

You can also run your own version -- head over to the source code for ikneed.

Contributing

Contributions are welcome. Please refer to CONTRIBUTING to learn how to contribute.

Citation

Finding a "Kneedle" in a Haystack: Detecting Knee Points in System Behavior.
Ville Satopa (Williams College, Williamstown, MA), Jeannie Albrecht (Williams College, Williamstown, MA), David Irwin (University of Massachusetts Amherst, Amherst, MA), and Barath Raghavan (International Computer Science Institute, Berkeley, CA).
