  • Stars: 945
  • Rank: 46,774 (Top 1.0 %)
  • Language: Python
  • License: MIT License
  • Created over 2 years ago
  • Updated 2 months ago

Repository Details

Visualize large time series data with plotly.py

plotly_resampler: visualize large sequential data by adding resampling functionality to Plotly figures

Plotly is an awesome interactive visualization library; however, it can get pretty slow when many data points are visualized (100,000+ data points). This library solves this by downsampling (aggregating) the data with respect to the current view and then plotting only the aggregated points. When you interact with the plot (panning, zooming, ...), callbacks are used to aggregate the data again and update the figure.

basic example gif

In this Plotly-Resampler demo over 110,000,000 data points are visualized!
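
To make the idea of view-dependent aggregation concrete, here is a minimal, library-free sketch. It is not the algorithm plotly-resampler ships: the function name `minmax_downsample`, its parameters, and the min/max-per-bin strategy are assumptions chosen only for illustration. It restricts the data to the visible x-range and then keeps the minimum and maximum sample of each bin.

```python
import math
import random


def minmax_downsample(xs, ys, x_start, x_end, n_bins):
    """Return (x, y) points for the visible window, keeping the min and max of each bin.

    Hypothetical helper for illustration only; not part of plotly-resampler.
    """
    # Keep only the samples that fall inside the current view.
    visible = [(x, y) for x, y in zip(xs, ys) if x_start <= x <= x_end]
    if len(visible) <= 2 * n_bins:
        return visible  # few enough points: plot them as-is

    bin_size = math.ceil(len(visible) / n_bins)
    out = []
    for i in range(0, len(visible), bin_size):
        chunk = visible[i:i + bin_size]
        lo = min(chunk, key=lambda p: p[1])
        hi = max(chunk, key=lambda p: p[1])
        # Emit both extremes in x order; the set removes a duplicate when lo is hi.
        out.extend(sorted({lo, hi}))
    return out


random.seed(0)
xs = list(range(100_000))
ys = [math.sin(x / 200) + random.gauss(0, 0.1) for x in xs]

# Zoomed out: 100,000 samples are reduced to at most 2 * n_bins points.
full_view = minmax_downsample(xs, ys, 0, 99_999, n_bins=500)
# Zoomed in: the window holds few samples, so they are all returned unchanged.
zoomed_view = minmax_downsample(xs, ys, 10_000, 10_400, n_bins=500)
print(len(full_view), len(zoomed_view))
```

Re-running such a function whenever the x-range changes is, conceptually, what the callbacks do on each pan or zoom.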

Installation

pip: pip install plotly-resampler

What is the difference between plotly-resampler figures and plain plotly figures?

plotly-resampler can be thought of as a wrapper around plain plotly figures that adds visualization scalability to line charts by dynamically aggregating the data with respect to the front-end view. plotly-resampler thus adds dynamic aggregation functionality to plain plotly figures.

Important to know:

  • show always returns a static HTML view of the figure, i.e., no dynamic aggregation can be performed on that view.

  • To have dynamic aggregation:

    • with FigureResampler, you need to call show_dash (or output the object in a cell via IPython.display), which spawns a Dash web app; the dynamic aggregation is realized with Dash callbacks.
    • with FigureWidgetResampler, you need to use IPython.display on the object, which uses widget events to realize dynamic aggregation (via the running IPython kernel).

Other changes of plotly-resampler figures w.r.t. vanilla plotly:

  • double-clicking within a line-chart area does not Reset Axes, as it results in an "Autoscale" event. We decided to implement the Autoscale event as updating your y-range such that it shows all the data that is in your current x-range.
    • Note: in vanilla Plotly figures, Autoscale results in Reset Axes behavior. In our opinion this did not make a lot of sense, which is why we have overridden this behavior in plotly-resampler.
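
The difference between the two behaviors can be sketched in a few lines of plain Python. The helper names `autoscale_y` and `reset_axes` are hypothetical and exist only for this illustration; they are not plotly or plotly-resampler functions.

```python
def autoscale_y(xs, ys, x_range):
    """Y-range covering only the samples whose x lies inside x_range."""
    x0, x1 = x_range
    visible = [y for x, y in zip(xs, ys) if x0 <= x <= x1]
    if not visible:
        return None  # nothing in view: leave the y-axis unchanged
    return (min(visible), max(visible))


def reset_axes(xs, ys):
    """Reset: both axes span the complete data set, discarding the zoom."""
    return (min(xs), max(xs)), (min(ys), max(ys))


xs = list(range(10))
ys = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

# Autoscale as described above: the x-range (2, 4) is kept, only y is refit.
print(autoscale_y(xs, ys, (2, 4)))  # (4, 16)
# Reset Axes: the zoom is dropped and everything is shown again.
print(reset_axes(xs, ys))           # ((0, 9), (0, 81))
```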

Features 🎉

  • Convenient to use:
    • just add either
      • register_plotly_resampler function to your notebook with the best suited mode argument.
      • FigureResampler decorator around a plotly Figure and call .show_dash()
      • FigureWidgetResampler decorator around a plotly Figure and output the instance in a cell
    • allows all other plotly figure construction flexibility to be used!
  • Environment-independent
    • can be used in Jupyter, VS Code notebooks, PyCharm notebooks, Google Colab, DataSpell, and even as an application (on a server)
  • Interface for various aggregation algorithms:
    • ability to develop or select your preferred sequence aggregation method
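
What a pluggable aggregation interface could look like is sketched below. This is an assumption-laden illustration, not the library's actual class hierarchy: the `Aggregator` type alias, the two example functions, and the `AGGREGATORS` registry are all hypothetical names. Every aggregator has the same signature (points in, at most `n_out` points out), so one can be swapped for another.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]
# Hypothetical interface: an aggregator maps (points, n_out) to at most n_out points.
# n_out is assumed to be a positive integer.
Aggregator = Callable[[Sequence[Point], int], List[Point]]


def every_nth(points: Sequence[Point], n_out: int) -> List[Point]:
    """Keep every k-th sample (fast, but prone to aliasing)."""
    step = max(1, -(-len(points) // n_out))  # ceiling division
    return list(points[::step])


def bin_mean(points: Sequence[Point], n_out: int) -> List[Point]:
    """Replace each bin by the mean of its x and y values (smooths the signal)."""
    step = max(1, -(-len(points) // n_out))
    out = []
    for i in range(0, len(points), step):
        chunk = points[i:i + step]
        out.append((sum(p[0] for p in chunk) / len(chunk),
                    sum(p[1] for p in chunk) / len(chunk)))
    return out


# Registry that lets the caller select a method by name.
AGGREGATORS = {"every_nth": every_nth, "bin_mean": bin_mean}


def aggregate(points: Sequence[Point], n_out: int, method: str = "every_nth") -> List[Point]:
    return AGGREGATORS[method](points, n_out)


# A sawtooth with period 10, sampled 1,000 times.
points = [(float(i), float(i % 10)) for i in range(1_000)]
print(aggregate(points, 100, "every_nth")[:3])  # [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
print(aggregate(points, 100, "bin_mean")[:3])   # [(4.5, 4.5), (14.5, 4.5), (24.5, 4.5)]
```

In this example both methods flatten the sawtooth into a constant line, which shows why the choice of aggregation method matters for the signal at hand.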

Usage

Add dynamic aggregation to your plotly Figure

  • 🤖 Automatically (minimal code overhead):

    Use the register_plotly_resampler function
    1. Import and call the register_plotly_resampler method
    2. Just use your regular graph construction code
    • code example:
      import plotly.graph_objects as go; import numpy as np
      from plotly_resampler import register_plotly_resampler
      
      # Call the register function once; all Figures/FigureWidgets will then be wrapped
      # according to the `mode` argument of register_plotly_resampler
      register_plotly_resampler(mode='auto')
      
      x = np.arange(1_000_000)
      noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000
      
      
      # auto mode: when working in an IPython environment, this will automatically be a
      # FigureWidgetResampler; otherwise, this will be a FigureResampler
      f = go.Figure()
      f.add_trace({"y": noisy_sin + 2, "name": "yp2"})
      f

    Note: This wraps all plotly graph object figures with a FigureResampler | FigureWidgetResampler. This can thus also be used for the plotly.express interface. 🎉

  • 👷 Manually (higher data aggregation configurability, more speedup possibilities):

    • Within a jupyter environment without creating a web application
      1. wrap the plotly Figure with FigureWidgetResampler
      2. output the FigureWidgetResampler instance in a cell
      import plotly.graph_objects as go; import numpy as np
      from plotly_resampler import FigureResampler, FigureWidgetResampler
      
      x = np.arange(1_000_000)
      noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000
      
      # OPTION 1 - FigureWidgetResampler: dynamic aggregation via `FigureWidget.layout.on_change`
      fig = FigureWidgetResampler(go.Figure())
      fig.add_trace(go.Scattergl(name='noisy sine', showlegend=True), hf_x=x, hf_y=noisy_sin)
      
      fig
    • Using a web-application with dash callbacks
      1. wrap the plotly Figure with FigureResampler
      2. call .show_dash() on the Figure
      import plotly.graph_objects as go; import numpy as np
      from plotly_resampler import FigureResampler, FigureWidgetResampler
      
      x = np.arange(1_000_000)
      noisy_sin = (3 + np.sin(x / 200) + np.random.randn(len(x)) / 10) * x / 1_000
      
      # OPTION 2 - FigureResampler: dynamic aggregation via a Dash web-app
      fig = FigureResampler(go.Figure())
      fig.add_trace(go.Scattergl(name='noisy sine', showlegend=True), hf_x=x, hf_y=noisy_sin)
      
      fig.show_dash(mode='inline')

    Tip 💡: For significantly faster initial loading of the Figure, we advise wrapping the constructor of the plotly Figure and adding the trace data as hf_x and hf_y


Note: Any plotly Figure can be wrapped with FigureResampler and FigureWidgetResampler! 🎉 But (obviously) only the scatter traces will be resampled.

Important considerations & tips

  • When running the code on a server, you should forward the port of the FigureResampler.show_dash() method to your local machine.
    Note that you can add dynamic aggregation to plotly figures with the FigureWidgetResampler wrapper without needing to forward a port!
  • The FigureWidgetResampler uses the IPython main thread for its data aggregation functionality, so when this main thread is occupied, no resampling logic can be executed. For example, if you perform long computations within your notebook, the kernel will be occupied during these computations, and any resampling operations triggered in the meantime will only be executed after the computation finishes.
  • In general, when using downsampling one should be aware of (possible) aliasing effects. The [R] in the legend indicates whether the corresponding trace is being resampled (and thus possibly distorted) or not. Additionally, the ~<range> suffix represents the mean aggregation bin size in terms of the sequence index.
  • The plotly autoscale event (triggered by the autoscale button or a double-click within the graph) does not reset the axes but autoscales the current graph view of plotly-resampler figures. This design choice was made because it seemed more intuitive to the developers to support this behavior with double-click than the default axes-reset behavior. The graph axes can of course be reset by using the reset_axis button. If you want to give feedback and discuss this further with the developers, see issue #49.
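
As a rough illustration of the legend annotations described in the list above, the sketch below builds a label with an [R] prefix and a ~ suffix. The helper `legend_name`, the exact label layout, and the way the mean bin size is approximated (visible index span divided by the number of output points) are assumptions for this example, not the library's implementation.

```python
def legend_name(name, visible_index, n_out):
    """Build a legend label for a trace; formatting choices are illustrative only."""
    n_visible = len(visible_index)
    if n_visible <= n_out:
        return name  # every visible sample is plotted, so no [R] marker
    # Approximate mean aggregation bin size, expressed in sequence-index units.
    span = visible_index[-1] - visible_index[0]
    mean_bin = span / n_out
    return f"[R] {name} ~{round(mean_bin)}"


# Zoomed out over one million samples, shown with 1,000 points.
print(legend_name("noisy sine", range(1_000_000), 1_000))  # [R] noisy sine ~1000
# Zoomed in far enough that no aggregation is needed.
print(legend_name("noisy sine", range(500), 1_000))        # noisy sine
```

A large value after the ~ means that each plotted point stands in for many original samples, so aliasing is more likely at that zoom level.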

Citation and papers

The paper about the plotly-resampler toolkit itself (preprint): https://arxiv.org/abs/2206.08703

@inproceedings{van2022plotly,
  title={Plotly-resampler: Effective visual analytics for large time series},
  author={Van Der Donckt, Jonas and Van Der Donckt, Jeroen and Deprost, Emiel and Van Hoecke, Sofie},
  booktitle={2022 IEEE Visualization and Visual Analytics (VIS)},
  pages={21--25},
  year={2022},
  organization={IEEE}
}

Related papers:



👤 Jonas Van Der Donckt, Jeroen Van Der Donckt, Emiel Deprost
