Repository Details

A use-case focused tutorial for time series forecasting with Python

⏳ time-series-forecasting-wiki

This repository contains a series of analyses, transforms and forecasting models frequently used when dealing with time series. The aim of this repository is to showcase how to model time series from scratch. For this we use a real use-case dataset (the Beijing air pollution dataset) to avoid the perfect, far-from-reality use cases that are often present in this type of tutorial. If you want to rerun the notebooks, make sure you install all the necessary dependencies (see the Guide).

You can find a more detailed table of contents in the main notebook.

📂 Dataset

The dataset used is the Beijing air quality public dataset. It contains pollution data from 2014 to 2019 sampled every 10 minutes, along with extra weather features such as pressure and temperature. We decided to resample the dataset to a daily frequency, both for easier data handling and to stay closer to a real use-case scenario (no one would build a model to predict pollution 10 minutes ahead; 1 day ahead looks more realistic). In this case the series is already stationary, with some small seasonalities which change every year.
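As a rough illustration of the daily resampling described above, here is a minimal standard-library sketch. It is not the code in datasets/download_datasets.py: the function name `resample_daily` and the synthetic readings are assumptions made for this example. With pandas, the same idea is usually written as `resample("D").mean()` on a frame with a datetime index.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def resample_daily(readings):
    """Average (timestamp, value) readings into one value per calendar day."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        buckets[timestamp.date()].append(value)
    # Sort by day so the result is a time-ordered series.
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Synthetic 10-minute readings covering two full days (144 samples per day).
start = datetime(2014, 1, 1)
readings = [(start + timedelta(minutes=10 * i), float(i % 144)) for i in range(288)]

daily = resample_daily(readings)
for day, value in daily.items():
    print(day, round(value, 2))
```

Averaging is only one possible aggregation; a daily maximum or median could be just as reasonable depending on what the forecast is used for.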

To obtain an exact copy of the dataset used in this tutorial, please run the script datasets/download_datasets.py, which will automatically download the dataset and preprocess it for you.

📚 Analysis and transforms

  • Time series decomposition

    • Level
    • Trend
    • Seasonality
    • Noise
  • Stationarity

    • Autocorrelation (ACF) and partial autocorrelation (PACF) plots
    • Rolling mean and std
    • Dickey-Fuller test
  • Making our time series stationary

    • Difference transform
    • Log scale
    • Smoothing
    • Moving average
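The transforms listed above can be sketched in a few lines of plain Python. This is only an illustration on a toy series, not the code used in the notebooks: all function names and the data are assumptions made for this example. The Dickey-Fuller test is left out here because it needs a statistics library.

```python
import math
from statistics import mean, pstdev


def difference(series, lag=1):
    """Difference transform: y[t] - y[t - lag]."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def log_scale(series):
    """Log transform; only valid for strictly positive values."""
    return [math.log(x) for x in series]


def rolling(series, window, func):
    """Apply func to each full trailing window of the series."""
    return [func(series[t - window + 1 : t + 1]) for t in range(window - 1, len(series))]


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag."""
    mu = mean(series)
    denom = sum((x - mu) ** 2 for x in series)
    num = sum((series[t] - mu) * (series[t - lag] - mu) for t in range(lag, len(series)))
    return num / denom


# Toy series: linear trend plus a period-4 seasonal pattern (assumed data).
series = [10 + 2 * t + [0, 3, 0, -3][t % 4] for t in range(24)]

detrended = difference(series)             # removes the linear trend
deseasonalised = difference(detrended, 4)  # removes the period-4 pattern
print(deseasonalised)
print(rolling(series, 4, mean)[:3])        # moving average over one full season
print(rolling(series, 4, pstdev)[:3])      # rolling standard deviation
print(round(autocorrelation(detrended, 4), 3))
print([round(x, 3) for x in log_scale(series)[:3]])
```

On this toy series the two differences leave only zeros, which is the behaviour we look for when making a series stationary. A rolling mean that drifts over time, as it does here before differencing, is a quick visual hint that the series is not stationary.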

πŸ“ Models tested

  • Autoregression (AR)

  • Moving Average (MA)

  • Autoregressive Moving Average (ARMA)

  • Autoregressive integrated moving average (ARIMA)

  • Seasonal autoregressive integrated moving average (SARIMA)

  • Bayesian regression

  • Lasso

  • SVM

  • Random forest

  • Nearest neighbors

  • XGBoost

  • LightGBM

  • Prophet

  • Long short-term memory with TensorFlow (LSTM)

  • DeepAR
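To give a feel for the simplest model in the list, here is a minimal sketch of an AR(1) model fitted by ordinary least squares on simulated data. It is not how the notebooks fit their models; the function names, the simulated series and the parameter values are all assumptions made for this illustration.

```python
import random
from statistics import mean


def fit_ar1(series):
    """Least-squares estimate of c and phi in y[t] = c + phi * y[t-1] + noise."""
    x, y = series[:-1], series[1:]
    x_bar, y_bar = mean(x), mean(y)
    cov = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    var = sum((a - x_bar) ** 2 for a in x)
    phi = cov / var
    c = y_bar - phi * x_bar
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion to forecast `steps` values ahead."""
    forecasts = []
    for _ in range(steps):
        last_value = c + phi * last_value
        forecasts.append(last_value)
    return forecasts


# Simulate an AR(1) process with known parameters (synthetic data).
random.seed(0)
true_c, true_phi = 5.0, 0.7
series = [true_c / (1 - true_phi)]  # start at the process mean
for _ in range(2000):
    series.append(true_c + true_phi * series[-1] + random.gauss(0, 1))

c, phi = fit_ar1(series)
print(round(c, 2), round(phi, 2))
print(forecast_ar1(series[-1], c, phi, steps=3))
```

The estimated `phi` should land close to the value used in the simulation. Multi-step forecasts from an AR(1) model decay towards the series mean, which is one reason the classical models struggle beyond short horizons.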

πŸ” Forecasting results

We divide our results according to whether the extra feature columns, such as temperature or pressure, were used by the model, since this makes a large difference in the metrics and represents two different scenarios. The metrics used were:

Evaluation Metrics

  • Mean Absolute Error (MAE)
  • Mean Absolute Percentage Error (MAPE)
  • Root Mean Squared Error (RMSE)
  • Coefficient of determination (R2)

| Model | MAE | RMSE | MAPE | R2 |
|---|---|---|---|---|
| EnsembleXG+TF | 27.64 | 40.23 | 0.42 | 0.76 |
| EnsembleLIGHT+TF | 27.34 | 39.27 | 0.42 | 0.77 |
| EnsembleXG+LIGHT+TF | 27.63 | 39.69 | 0.44 | 0.76 |
| EnsembleXG+LIGHT | 29.95 | 42.7 | 0.52 | 0.73 |
| Randomforest tuned | 40.79 | 53.2 | 0.9 | 0.57 |
| SVM RBF GRID SEARCH | 38.57 | 50.34 | 0.78 | 0.62 |
| DeepAR | 71.37 | 103.97 | 0.96 | -0.63 |
| Tensorflow simple LSTM | 30.13 | 43.08 | 0.42 | 0.72 |
| Prophet multivariate | 38.25 | 50.45 | 0.74 | 0.62 |
| Kneighbors | 57.05 | 80.39 | 1.08 | 0.03 |
| SVM RBF | 40.81 | 56.03 | 0.79 | 0.53 |
| Lightgbm | 30.21 | 42.76 | 0.52 | 0.72 |
| XGBoost | 32.13 | 45.59 | 0.56 | 0.69 |
| Randomforest | 45.84 | 59.45 | 1.03 | 0.47 |
| Lasso | 39.24 | 54.58 | 0.71 | 0.55 |
| BayesianRidge | 39.24 | 54.63 | 0.71 | 0.55 |
| Prophet univariate | 61.33 | 83.64 | 1.26 | -0.05 |
| AutoSARIMAX (1, 0, 1),(0, 0, 0, 6) | 51.29 | 71.49 | 0.91 | 0.23 |
| SARIMAX | 51.25 | 71.33 | 0.91 | 0.23 |
| AutoARIMA (0, 0, 3) | 47.01 | 64.71 | 1.0 | 0.37 |
| ARIMA | 48.25 | 66.39 | 1.06 | 0.34 |
| ARMA | 47.1 | 64.86 | 1.01 | 0.37 |
| MA | 49.04 | 66.2 | 1.05 | 0.34 |
| AR | 47.24 | 65.32 | 1.02 | 0.36 |
| HWES | 52.96 | 74.67 | 1.11 | 0.16 |
| SES | 52.96 | 74.67 | 1.11 | 0.16 |
| Yesterdays value | 52.67 | 74.52 | 1.04 | 0.16 |
| Naive mean | 59.38 | 81.44 | 1.32 | -0.0 |
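For reference, the four evaluation metrics can be written out in plain Python and applied to the two simplest baselines, "Naive mean" and "Yesterdays value". This is a minimal sketch under stated assumptions: the toy train/test values are made up, the function names are hypothetical, and MAPE is assumed to be reported as a fraction rather than a percentage. It does not reproduce the numbers reported in the results.

```python
import math
from statistics import mean


def mae(y_true, y_pred):
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mape(y_true, y_pred):
    # Expressed as a fraction (0.42 means 42 %); undefined when a true value is 0.
    return mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r2(y_true, y_pred):
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy daily values (assumed data), split chronologically into train and test.
train = [80.0, 95.0, 70.0, 110.0, 90.0, 85.0]
test = [100.0, 120.0, 90.0, 60.0]

# Naive mean: always predict the training mean.
naive_mean = [mean(train)] * len(test)

# Yesterday's value: predict the previous day's observed value.
history = train + test
yesterday = history[len(train) - 1 : -1]

for name, pred in [("Naive mean", naive_mean), ("Yesterday's value", yesterday)]:
    scores = [f(test, pred) for f in (mae, rmse, mape, r2)]
    print(name, [round(s, 2) for s in scores])
```

Because R2 compares a model against predicting the mean of the test values, a constant prediction taken from the training set cannot score above 0, and any model scoring below 0 is doing worse than that constant.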

:shipit: Additional resources and literature

Models not tested but that are gaining popularity

There are several models we have not tried in this tutorial, as they come from the academic world and their implementations are not 100% reliable, but they are worth mentioning:

  • Neural basis expansion analysis for interpretable time series forecasting (N-BEATS)
  • Exponential smoothing recurrent neural network (ES-RNN)

Adhikari, R., & Agrawal, R. K. (2013). An introductory study on time series modeling and forecasting [1]
Introduction to Time Series Forecasting With Python [2]
Deep Learning for Time Series Forecasting [3]
The Complete Guide to Time Series Analysis and Forecasting [4]
How to Decompose Time Series Data into Trend and Seasonality [5]

Contributing

Want to see another model tested? Do you have anything to add or fix? I'll be happy to talk about it! Open an issue/PR :)
