  • Stars: 1,097
  • Rank: 41,956 (Top 0.9%)
  • Language: Python
  • License: MIT License
  • Created: 11 months ago
  • Updated: about 1 month ago


Repository Details

Official implementation for "iTransformer: Inverted Transformers Are Effective for Time Series Forecasting" (ICLR 2024 Spotlight), https://openreview.net/forum?id=JePfAI8fah

iTransformer

This repo is the official implementation of the paper: iTransformer: Inverted Transformers Are Effective for Time Series Forecasting. It currently includes code for the following tasks:

Multivariate Forecasting: We provide all scripts as well as datasets for the reproduction of forecasting results in this repo.

Boosting Forecasting of Transformers: The iTransformer framework consistently improves Transformer variants and can take advantage of the growing family of efficient attention mechanisms.

Generalization on Unseen Variates: iTransformer is demonstrated to generalize well on unseen time series, making it a promising alternative as the fundamental backbone of large time series models.

Better Utilization of Lookback Windows: While Transformer does not necessarily benefit from the larger lookback window, inverted Transformers exhibit better utilization of the enlarged lookback window.

Adopt Efficient Attention and Training Strategy: By inverting, efficient attention mechanisms and training strategies can be leveraged to reduce the complexity on high-dimensional time series.

Updates

🚩 News (2023.12) We received lots of valuable suggestions. A revised version (24 Pages) is now available, which includes extensive experiments, intuitive cases, in-depth analysis and further improvement of our work.

🚩 News (2023.10) iTransformer has been included in [Time-Series-Library] and achieves consistent state-of-the-art results in long-term time series forecasting.

🚩 News (2023.10) All the scripts for the above tasks in our paper are available in this repo.

Introduction

🌟 Considering the characteristics of multivariate time series, iTransformer breaks the conventional model structure without the burden of modifying any Transformer modules. Inverted Transformer is all you need in multivariate time series forecasting (MTSF).

🏆 iTransformer achieves comprehensive state-of-the-art results in challenging multivariate forecasting tasks and addresses several pain points of Transformers on extensive time series data.

😊 iTransformer repurposes the vanilla Transformer. We think the "passionate modification" of Transformer has received too much attention in the research area of time series. Hopefully, future mainstream work can focus more on the dataset infrastructure and consider the scale-up ability of Transformer.

Overall Architecture

iTransformer regards independent time series as variate tokens, captures multivariate correlations with attention, and uses layer normalization and feed-forward networks to learn series representations.

The pseudo-code of iTransformer is as simple as the following:
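
A minimal NumPy sketch of the inverted design described above, not the repository's actual code: each variate's whole lookback series is embedded as one token, attention mixes the variate tokens, and layer normalization plus a feed-forward network refine each token before it is projected to the forecast horizon. The class name InvertedBlockSketch, the single-head single-layer layout, the random weights, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    # Normalize each token's feature vector (last axis); no learned affine here.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class InvertedBlockSketch:
    """Hypothetical one-layer, one-head illustration with random weights."""

    def __init__(self, lookback, horizon, d_model=16, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.d_model = d_model
        # Embedding maps a whole series (length = lookback) to one token.
        self.w_embed = rng.normal(0.0, scale, (lookback, d_model))
        self.w_q = rng.normal(0.0, scale, (d_model, d_model))
        self.w_k = rng.normal(0.0, scale, (d_model, d_model))
        self.w_v = rng.normal(0.0, scale, (d_model, d_model))
        self.w_ff1 = rng.normal(0.0, scale, (d_model, 4 * d_model))
        self.w_ff2 = rng.normal(0.0, scale, (4 * d_model, d_model))
        # Projection maps each variate token to its future series.
        self.w_out = rng.normal(0.0, scale, (d_model, horizon))

    def forward(self, x):
        # x: (batch, lookback, n_variates)
        tokens = np.swapaxes(x, 1, 2) @ self.w_embed  # (batch, n_variates, d_model)
        q, k, v = tokens @ self.w_q, tokens @ self.w_k, tokens @ self.w_v
        # Attention over variate tokens: an (n_variates x n_variates) map.
        scores = q @ np.swapaxes(k, 1, 2) / np.sqrt(self.d_model)
        attn = softmax(scores, axis=-1)
        tokens = layer_norm(tokens + attn @ v)
        # Feed-forward network applied to each variate token independently.
        ff = np.maximum(tokens @ self.w_ff1, 0.0) @ self.w_ff2
        tokens = layer_norm(tokens + ff)
        y = tokens @ self.w_out  # (batch, n_variates, horizon)
        return np.swapaxes(y, 1, 2), attn  # (batch, horizon, n_variates)


model = InvertedBlockSketch(lookback=96, horizon=24)
x = np.random.default_rng(1).normal(size=(2, 96, 7))
forecast, attention = model.forward(x)
```

Because the weights act only on the time and feature axes, the same sketch accepts inputs with a different number of variates, for example model.forward(x[:, :, :3]).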

Usage

  1. Install PyTorch and necessary dependencies.
pip install -r requirements.txt
  2. The datasets can be obtained from Google Drive or Tsinghua Cloud.

  3. Train and evaluate the model. We provide scripts for all the above tasks under the folder ./scripts/. You can reproduce the results with the following examples:

# Multivariate forecasting with iTransformer
bash ./scripts/multivariate_forecasting/Traffic/iTransformer.sh

# Compare the performance of Transformer and iTransformer
bash ./scripts/boost_performance/Weather/iTransformer.sh

# Train the model with partial variates, and generalize on the unseen variates
bash ./scripts/variate_generalization/Electricity/iTransformer.sh

# Test the performance on the enlarged lookback window
bash ./scripts/increasing_lookback/Traffic/iTransformer.sh

# Utilize FlashAttention for acceleration
bash ./scripts/efficient_attentions/iFlashTransformer.sh

Main Result of Multivariate Forecasting

We evaluate iTransformer on extensive challenging multivariate forecasting benchmarks as well as the server load prediction of Alipay online transactions (generally hundreds of variates, denoted as Dim). iTransformer achieves consistently good performance (MSE/MAE) and is particularly good at forecasting high-dimensional time series.
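
For reference, the two metrics can be computed as below. This is a plain-Python sketch of the standard definitions over flattened predictions; the function names are illustrative, and any normalization or averaging over prediction lengths used by the repository's evaluation scripts is not reproduced here.

```python
def mse(pred, true):
    # Mean squared error: average of squared differences.
    return sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)


def mae(pred, true):
    # Mean absolute error: average of absolute differences.
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)


pred = [1.0, 2.0, 4.0]
true = [1.0, 3.0, 2.0]
mse_value = mse(pred, true)  # (0 + 1 + 4) / 3
mae_value = mae(pred, true)  # (0 + 1 + 2) / 3
```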

Challenging Multivariate Time Series Forecasting Benchmarks (Avg Results)

Online Transaction Load Prediction of Alipay Trading Platform (Avg Results)

General Performance Boosting on Transformers

By introducing the proposed framework, Transformer and its variants achieve significant performance improvement, demonstrating the generality of the iTransformer approach and benefiting from efficient attention mechanisms.

Generalization on Unseen Variates

Technically, iTransformer can forecast with an arbitrary number of variates during inference. We partition the variates of each dataset into five folds, train models on 20% of the variates, and use the partially trained model to forecast all variates. iTransformers can be trained efficiently and forecast unseen variates with good generalizability.
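
The protocol above can be sketched as follows. The helper partition_variates is hypothetical, and shuffling the variate indices before splitting is an assumption; the text only states that the variates are split into five parts and 20% are used for training.

```python
import random


def partition_variates(n_variates, n_folds=5, seed=0):
    """Shuffle variate indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_variates))
    random.Random(seed).shuffle(indices)
    return [sorted(indices[i::n_folds]) for i in range(n_folds)]


folds = partition_variates(n_variates=20)
train_variates = folds[0]  # one fold, i.e. 20% of the variates, used for training
all_variates = sorted(v for fold in folds for v in fold)  # forecast at inference
```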

Better Utilization of Lookback Windows

While previous Transformers do not necessarily benefit from more historical observations, iTransformers show a surprising improvement in forecasting performance as the length of the lookback window increases.

Model Analysis

Benefiting from inverted Transformer modules:

  • (Left) Inverted Transformers learn better time series representations (higher CKA similarity), which are favored by time series forecasting.
  • (Right) The inverted self-attention module learns interpretable multivariate correlations.

  • Visualization of the variates from Market and the learned multivariate correlations. Each variate represents the monitored interface values of an application, and the applications can be further grouped into refined categories.
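
For readers unfamiliar with CKA (centered kernel alignment), here is a minimal NumPy sketch of the linear variant, which scores the similarity of two sets of representations between 0 and 1. Which layers are compared and which CKA variant the analysis uses are not stated here, so the function name and the choice of the linear form are assumptions.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between x: (n_samples, d1) and y: (n_samples, d2)."""
    # Center each feature over the samples.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)


rng = np.random.default_rng(0)
reps_a = rng.normal(size=(50, 8))
reps_b = rng.normal(size=(50, 8))
same = linear_cka(reps_a, reps_a)       # identical representations
different = linear_cka(reps_a, reps_b)  # unrelated random representations
```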

Model Ablations

An iTransformer that applies attention along the variate dimension and feed-forward networks along the temporal dimension generally achieves the best performance. In contrast, the vanilla Transformer (the third row) performs the worst among these designs, indicating a mismatch of responsibilities when the conventional architecture is adopted.

Model Efficiency

We propose a training strategy for multivariate series that takes advantage of the model's variate generalization ability. While the performance (Left) remains stable when only a sampled ratio of variates is trained in each batch, the memory footprint (Right) of the training process can be reduced significantly.
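
A minimal sketch of the idea, assuming batches shaped (batch, time, variates) and a uniformly random choice of variates for each batch. The names sample_variates and ratio are hypothetical and do not come from the repository's API.

```python
import numpy as np


def sample_variates(x, y, ratio, rng):
    """Keep a random subset of variates (last axis) in inputs x and targets y."""
    n_variates = x.shape[-1]
    k = max(1, int(round(n_variates * ratio)))
    # The same indices are used for x and y so inputs and targets stay aligned.
    idx = np.sort(rng.choice(n_variates, size=k, replace=False))
    return x[..., idx], y[..., idx], idx


rng = np.random.default_rng(0)
x_batch = np.zeros((4, 96, 10))  # (batch, lookback, variates)
y_batch = np.zeros((4, 24, 10))  # (batch, horizon, variates)
x_sub, y_sub, kept = sample_variates(x_batch, y_batch, ratio=0.2, rng=rng)
```

Since attention cost grows with the number of variate tokens, training on fewer variates per batch shrinks the attention map and the activations that must be stored.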

Citation

If you find this repo helpful, please cite our paper.

@article{liu2023itransformer,
  title={iTransformer: Inverted Transformers Are Effective for Time Series Forecasting},
  author={Liu, Yong and Hu, Tengge and Zhang, Haoran and Wu, Haixu and Wang, Shiyu and Ma, Lintao and Long, Mingsheng},
  journal={arXiv preprint arXiv:2310.06625},
  year={2023}
}

Future Work

  • iTransformer for other time series tasks.
  • Integrating Transformer variants.
  • iTransformer Scalability.

Acknowledgement

We appreciate the following GitHub repos a lot for their valuable code and efforts.

Contact

If you have any questions or want to use the code, feel free to contact:

More Repositories

1

Time-Series-Library

A Library for Advanced Deep Time Series Models.
Python
6,099
star
2

Transfer-Learning-Library

Transfer Learning Library for Domain Adaptation, Task Adaptation, and Domain Generalization
Python
3,318
star
3

Autoformer

About Code release for "Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting" (NeurIPS 2021), https://arxiv.org/abs/2106.13008
Jupyter Notebook
1,882
star
4

Anomaly-Transformer

About Code release for "Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy" (ICLR 2022 Spotlight), https://openreview.net/forum?id=LzQQ89U1qm_
Python
708
star
5

TimesNet

About Code release for "TimesNet: Temporal 2D-Variation Modeling for General Time Series Analysis" (ICLR 2023), https://openreview.net/pdf?id=ju_Uqw384Oq
673
star
6

awesome-multi-task-learning

2024 up-to-date list of DATASETS, CODEBASES and PAPERS on Multi-Task Learning (MTL), from Machine Learning perspective.
625
star
7

Xlearn

Transfer Learning Library
Jupyter Notebook
459
star
8

Nonstationary_Transformers

Code release for "Non-stationary Transformers: Exploring the Stationarity in Time Series Forecasting" (NeurIPS 2022), https://arxiv.org/abs/2205.14415
Python
455
star
9

predrnn-pytorch

Official implementation for NIPS'17 paper: PredRNN: Recurrent Neural Networks for Predictive Learning Using Spatiotemporal LSTMs.
Python
438
star
10

depyf

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.
Python
407
star
11

CDAN

Code release for "Conditional Adversarial Domain Adaptation" (NIPS 2018)
Jupyter Notebook
392
star
12

Flowformer

About Code release for "Flowformer: Linearizing Transformers with Conservation Flows" (ICML 2022), https://arxiv.org/pdf/2202.06258.pdf
Python
300
star
13

Universal-Domain-Adaptation

Code release for Universal Domain Adaptation(CVPR 2019)
Python
272
star
14

HashNet

Code release for "HashNet: Deep Learning to Hash by Continuation" (ICCV 2017)
Jupyter Notebook
240
star
15

Large-Time-Series-Model

Official code, datasets and checkpoints for "Timer: Generative Pre-trained Transformers Are Large Time Series Models" (ICML 2024)
Python
214
star
16

LogME

Code release for "LogME: Practical Assessment of Pre-trained Models for Transfer Learning" (ICML 2021) and Ranking and Tuning Pre-trained Models: A New Paradigm for Exploiting Model Hubs (JMLR 2022)
Python
200
star
17

Koopa

Code release for "Koopa: Learning Non-stationary Time Series Dynamics with Koopman Predictors" (NeurIPS 2023), https://arxiv.org/abs/2305.18803
Python
173
star
18

Corrformer

About code release of "Interpretable Weather Forecasting for Worldwide Stations with a Unified Deep Model", Nature Machine Intelligence, 2023. https://www.nature.com/articles/s42256-023-00667-9
Python
155
star
19

A-Roadmap-for-Transfer-Learning

151
star
20

MDD

Code released for ICML 2019 paper "Bridging Theory and Algorithm for Domain Adaptation".
Python
129
star
21

Self-Tuning

Code release for "Self-Tuning for Data-Efficient Deep Learning" (ICML 2021)
Python
109
star
22

SimMTM

About Code release for "SimMTM: A Simple Pre-Training Framework for Masked Time-Series Modeling" (NeurIPS 2023 Spotlight), https://arxiv.org/abs/2302.00861
Python
103
star
23

PADA

Code release for "Partial Adversarial Domain Adaptation" (ECCV 2018)
Python
100
star
24

Batch-Spectral-Penalization

Code release for Transferability vs. Discriminability: Batch Spectral Penalization for Adversarial Domain Adaptation (ICML 2019)
Python
91
star
25

Transferable-Adversarial-Training

Code release for Transferable Adversarial Training: A General Approach to Adapting Deep Classifiers (ICML2019)
Python
80
star
26

TransNorm

Code release for "Transferable Normalization: Towards Improving Transferability of Deep Neural Networks" (NeurIPS 2019)
Python
78
star
27

MTlearn

Code release for "Learning Multiple Tasks with Multilinear Relationship Networks" (NIPS 2017)
Python
70
star
28

SAN

Code release for "Partial Transfer Learning with Selective Adversarial Networks" (CVPR 2018)
Jupyter Notebook
69
star
29

Domain-Adaptation-Regression

Code release for Representation Subspace Distance for Domain Adaptation Regression (ICML 2021)
Python
69
star
30

HashGAN

HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN
Python
68
star
31

Deep-Embedded-Validation

Code release for Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation (ICML 2019)
Python
61
star
32

Latent-Spectral-Models

About Code Release for "Solving High-Dimensional PDEs with Latent Spectral Models" (ICML 2023), https://arxiv.org/abs/2301.12664
Python
59
star
33

CLIPood

About Code Release for "CLIPood: Generalizing CLIP to Out-of-Distributions" (ICML 2023), https://arxiv.org/abs/2302.00864
Python
58
star
34

iVideoGPT

Official repo for "iVideoGPT: Interactive VideoGPTs are Scalable World Models", https://arxiv.org/abs/2405.15223
Python
58
star
35

Transolver

About code release of "Transolver: A Fast Transformer Solver for PDEs on General Geometries", ICML 2024 Spotlight. https://arxiv.org/abs/2402.02366
Python
57
star
36

MADA

Code release for "Multi-Adversarial Domain Adaptation" (AAAI 2018)
C++
56
star
37

MotionRNN

About Code release for "MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions" (CVPR 2021) https://arxiv.org/abs/2103.02243
Python
50
star
38

ETN

Code released for CVPR 2019 paper "Learning to Transfer Examples for Partial Domain Adaptation"
Python
50
star
39

Debiased-Self-Training

Code release of paper Debiased Self-Training for Semi-Supervised Learning (NeurIPS 2022 Oral)
50
star
40

Versatile-Domain-Adaptation

Code Release for "Minimum Class Confusion for Versatile Domain Adaptation"(ECCV2020)
Python
50
star
41

ContextWM

Code release for "Pre-training Contextualized World Models with In-the-wild Videos for Reinforcement Learning" (NeurIPS 2023), https://arxiv.org/abs/2305.18499
Python
50
star
42

Separate_to_Adapt

Code release for Separate to Adapt: Open Set Domain Adaptation via Progressive Separation (CVPR 2019)
Python
49
star
43

AutoTimes

Official implementation for "AutoTimes: Autoregressive Time Series Forecasters via Large Language Models"
Python
45
star
44

CoTuning

Code release for NeurIPS 2020 paper "Co-Tuning for Transfer Learning"
Python
39
star
45

OpenDG-DAML

Code release for Open Domain Generalization with Domain-Augmented Meta-Learning (CVPR2021)
Python
32
star
46

Calibrated-Multiple-Uncertainties

Code Release for "Learning to Detect Open Classes for Universal Domain Adaptation"(ECCV2020)
Python
30
star
47

TimeSiam

Python
25
star
48

Batch-Spectral-Shrinkage

Code release for Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning (NeurIPS 2019)
Python
24
star
49

StochNorm

Code release for NeurIPS 2020 paper "Stochastic Normalization"
Python
23
star
50

Transferable-Query-Selection

Code Release for "Transferable Query Selection for Active Domain Adaptation"(CVPR2021)
Python
23
star
51

Decoupled-Adaptation-for-Cross-Domain-Object-Detection

Code for ICLR2022 Decoupled Adaptation for Cross-Domain Object Detection (D-adapt) https://arxiv.org/abs/2110.02578
22
star
52

few-shot

A lightweight library that implements state-of-the-art few-shot learning algorithms.
Python
21
star
53

HarmonyDream

Code release for "HarmonyDream: Task Harmonization Inside World Models" (ICML 2024), https://arxiv.org/abs/2310.00344
Python
21
star
54

transferable-memory

Python
20
star
55

VideoDG

Python
20
star
56

TCL

Code release for Transferable Curriculum for Weakly-Supervised Domain Adaptation (AAAI2019)
Python
18
star
57

SPOT

Code release for "Supported Policy Optimization for Offline Reinforcement Learning" (NeurIPS 2022), https://arxiv.org/abs/2202.06239
Python
18
star
58

DPH

Code release for "Deep Priority Hashing" (ACMMM 2018)
C++
18
star
59

MMHH

Python
15
star
60

Metasets

Python
15
star
61

PAN

Python
15
star
62

DCN

Deep Calibration Network
Python
15
star
63

ModeRNN

Python
14
star
64

ForkMerge

Code release of paper "ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning" (NeurIPS 2023)
14
star
65

TAH

Code release for "Transfer Adversarial Hashing for Hamming Space Retrieval" (AAAI 2018)
C++
13
star
66

TransCal

Python
12
star
67

learn_torch.compile

torch.compile artifacts for common deep learning models, can be used as a learning resource for torch.compile
Python
12
star
68

HelmFluid

About code release of "HelmFluid: Learning Helmholtz Dynamics for Interpretable Fluid Prediction", ICML 2024. https://arxiv.org/pdf/2310.10565
Python
11
star
69

Multi-Embedding

About Code Release for "On the Embedding Collapse When Scaling Up Recommendation Models" (ICML 2024)
Python
11
star
70

Zoo-Tuning

Code release for Zoo-Tuning: Adaptive Transfer from A Zoo of Models (ICML2021)
Python
7
star
71

timer

See the official code and checkpoints for "Timer: Generative Pre-trained Transformers Are Large Time Series Models"
HTML
5
star
72

Regressive-Domain-Adaptation-for-Unsupervised-Keypoint-Detection

Code for CVPR 2021 Regressive Domain Adaptation for Unsupervised Keypoint Detection (RegDA) https://arxiv.org/abs/2103.06175
5
star
73

MitNet

About Code Release for "Estimating Heterogeneous Treatment Effects: Mutual Information Bounds and Learning Algorithms" (ICML 2023)
Python
4
star
74

MobileAttention

Official implementation of "Mobile Attention: Mobile-Friendly Linear-Attention for Vision Transformers in PyTorch". To run the code, you can refer to https://github.com/thuml/Flowformer.
Python
1
star