
Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting (ICLR 2023)

This is the original PyTorch implementation of Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting.

Key Points of Crossformer

1. Dimension-Segment-Wise (DSW) Embedding

Figure 1. DSW embedding. Left: the embedding method of previous Transformer-based models, where data points in different dimensions at the same step are embedded into one vector. Right: the DSW embedding of Crossformer, where in each dimension, nearby points over time form a segment for embedding.
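For intuition, here is a minimal sketch of a DSW-style embedding in PyTorch with einops; the class name and the plain linear projection are illustrative, not the repository's exact code:

import torch
import torch.nn as nn
from einops import rearrange

class DSWEmbedding(nn.Module):
    """Sketch of Dimension-Segment-Wise embedding: within each dimension,
    every segment of seg_len consecutive points is projected to one
    d_model-dimensional vector."""
    def __init__(self, seg_len, d_model):
        super().__init__()
        self.seg_len = seg_len
        self.proj = nn.Linear(seg_len, d_model)

    def forward(self, x):
        # x: (batch, T, D); T must be divisible by seg_len
        x = rearrange(x, 'b (n s) d -> b d n s', s=self.seg_len)
        return self.proj(x)  # (batch, D, T // seg_len, d_model)

For example, with seg_len=6 and d_model=256, an input of shape (32, 168, 7) yields a (32, 7, 28, 256) array of segment embeddings: the 2D vector array (per dimension, per segment) that the TSA layer operates on.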

2. Two-Stage Attention (TSA) Layer

Figure 2. TSA layer. Left: overall structure; the 2D vector array goes through the Cross-Time Stage and the Cross-Dimension Stage to capture the corresponding dependencies. Middle: directly using MSA in the Cross-Dimension Stage to build the $D$-to-$D$ connection results in $O(D^2)$ complexity. Right: the router mechanism for the Cross-Dimension Stage; a small fixed number ($c$) of "routers" gather and distribute information among dimensions, reducing the complexity to $O(2cD) = O(D)$.
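As a rough illustration of the router mechanism, the sketch below shows how $c$ learnable routers gather from and redistribute to the $D$ dimensions. The class name and the use of two plain nn.MultiheadAttention modules are our assumptions; the actual TSA layer also contains the Cross-Time Stage, skip connections, and MLPs:

import torch
import torch.nn as nn

class RouterAttention(nn.Module):
    """Sketch of the Cross-Dimension router: c learnable routers attend to
    the D dimensions (gather), then the D dimensions attend to the routers
    (distribute), costing O(c*D) instead of O(D^2)."""
    def __init__(self, d_model, n_heads, c=10):
        super().__init__()
        self.routers = nn.Parameter(torch.randn(c, d_model))
        self.gather = nn.MultiheadAttention(d_model, n_heads)
        self.distribute = nn.MultiheadAttention(d_model, n_heads)

    def forward(self, x):
        # x: (D, batch, d_model) -- the vectors of one segment position
        # across all D dimensions (sequence-first layout for torch 1.8)
        r = self.routers.unsqueeze(1).expand(-1, x.size(1), -1)
        buf, _ = self.gather(r, x, x)          # routers gather information
        out, _ = self.distribute(x, buf, buf)  # dimensions retrieve it
        return out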

3. Hierarchical Encoder-Decoder (HED)

Figure 3. HED. The encoder (left) uses TSA layers and segment merging to capture dependencies at different scales; the decoder (right) makes the final prediction by forecasting at each scale and summing the forecasts.
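The segment merging between encoder scales can be sketched as follows (a minimal version; the repository's layer may differ in details such as normalization placement):

import torch.nn as nn
from einops import rearrange

class SegMerging(nn.Module):
    """Sketch of segment merging: win_size adjacent segment embeddings are
    concatenated and projected back to d_model, so each coarser encoder
    scale has win_size times fewer segments."""
    def __init__(self, d_model, win_size=2):
        super().__init__()
        self.win_size = win_size
        self.norm = nn.LayerNorm(win_size * d_model)
        self.proj = nn.Linear(win_size * d_model, d_model)

    def forward(self, x):
        # x: (batch, D, n_seg, d_model); n_seg must be divisible by win_size
        x = rearrange(x, 'b d (n w) m -> b d n (w m)', w=self.win_size)
        return self.proj(self.norm(x))  # (batch, D, n_seg // win_size, d_model)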

Requirements

  • Python 3.7.10
  • numpy==1.20.3
  • pandas==1.3.2
  • torch==1.8.1
  • einops==0.4.1
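Assuming a fresh Python 3.7 environment, the pinned versions above can be installed with pip:

pip install numpy==1.20.3 pandas==1.3.2 torch==1.8.1 einops==0.4.1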

Reproducibility

  1. Place the datasets for the experiments in the datasets/ folder. ETTh1 and ETTm1 are already included. WTH and ECL can be downloaded from https://github.com/zhouhaoyi/Informer2020; ILI and Traffic can be downloaded from https://github.com/thuml/Autoformer. Note that the WTH used in the paper is the 12-dimension version from Informer, not the 21-dimension version from Autoformer.

  2. To get results of Crossformer with $T=168, \tau = 24, L_{seg} = 6$ on ETTh1 dataset, run:

python main_crossformer.py --data ETTh1 --in_len 168 --out_len 24 --seg_len 6 --itr 1

The model will be trained and tested automatically. The trained model will be saved in the checkpoints/ folder, and the evaluation metrics will be saved in the results/ folder.

  3. You can also evaluate a trained model by running:
python eval_crossformer.py --checkpoint_root ./checkpoints --setting_name Crossformer_ETTh1_il168_ol24_sl6_win2_fa10_dm256_nh4_el3_itr0
  4. To reproduce all results in the paper, run the following scripts:
bash scripts/ETTh1.sh
bash scripts/ETTm1.sh
bash scripts/WTH.sh
bash scripts/ECL.sh
bash scripts/ILI.sh
bash scripts/Traffic.sh

Custom Usage

We use the AirQuality dataset to show how to train and evaluate Crossformer with your own data.

  1. Convert the AirQualityUCI.csv dataset into the following format, where the first column is the date (or you can leave the first column blank) and the other 13 columns are the multivariate time series to forecast. Then put the modified file into the datasets/ folder. A preprocessing sketch follows the figure below.


Figure 4. An example of the custom dataset.
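As a sketch of this preprocessing (the ';' separator, ',' decimal mark, and the separate Date/Time columns are assumptions about the raw UCI export; adjust them to your copy):

import pandas as pd

# Load the raw UCI export and drop fully empty rows/columns
# (assumptions about the raw file layout -- adjust if yours differs).
raw = pd.read_csv('AirQualityUCI.csv', sep=';', decimal=',')
raw = raw.dropna(axis=0, how='all').dropna(axis=1, how='all')

# The first column must be a single date column; the remaining
# 13 columns are the multivariate series to forecast.
raw.insert(0, 'date', raw.pop('Date').astype(str) + ' ' + raw.pop('Time').astype(str))
raw.to_csv('datasets/AirQualityUCI.csv', index=False)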

  2. This is an hourly sampled dataset with 13 dimensions. We will use the past week (168 hours) to forecast the next day (24 hours), with the segment length set to 6. Therefore, we run:
python main_crossformer.py --data AirQuality --data_path AirQualityUCI.csv --data_dim 13 --in_len 168 --out_len 24 --seg_len 6
  3. We can evaluate the trained model by running:
python eval_crossformer.py --setting_name Crossformer_AirQuality_il168_ol24_sl6_win2_fa10_dm256_nh4_el3_itr0 --save_pred

The model will be evaluated, and the predicted and ground-truth series will be saved in results/Crossformer_AirQuality_il168_ol24_sl6_win2_fa10_dm256_nh4_el3_itr0.
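The saved arrays can then be inspected with numpy. The file names pred.npy and true.npy below are assumptions about the on-disk layout, so check the results folder for the exact names:

import numpy as np

setting = 'Crossformer_AirQuality_il168_ol24_sl6_win2_fa10_dm256_nh4_el3_itr0'
pred = np.load(f'results/{setting}/pred.npy')  # (n_samples, out_len, data_dim)
true = np.load(f'results/{setting}/true.npy')
print(pred.shape, ((pred - true) ** 2).mean())  # overall MSE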

main_crossformer.py is the entry point of our model, and there are other parameters that can be tuned. They are described in detail below:

Parameter name  Description
data            The dataset name
root_path       The root path of the data file (defaults to ./datasets/)
data_path       The data file name (defaults to ETTh1.csv)
data_split      Train/val/test split; can be a ratio (e.g. 0.7,0.1,0.2) or counts (e.g. 16800,2880,2880) (defaults to 0.7,0.1,0.2)
checkpoints     Location to store the trained model (defaults to ./checkpoints/)
in_len          Length of the input/history sequence, i.e. $T$ in the paper (defaults to 96)
out_len         Length of the output/future sequence, i.e. $\tau$ in the paper (defaults to 24)
seg_len         Length of each segment in DSW embedding, i.e. $L_{seg}$ in the paper (defaults to 6)
win_size        Number of adjacent segments merged into one in the segment merging of HED (defaults to 2)
factor          Number of routers in the Cross-Dimension Stage of TSA, i.e. $c$ in the paper (defaults to 10)
data_dim        Number of dimensions of the MTS data, i.e. $D$ in the paper (defaults to 7 for ETTh and ETTm)
d_model         Dimension of the hidden states, i.e. $d_{model}$ in the paper (defaults to 256)
d_ff            Dimension of the MLP in MSA (defaults to 512)
n_heads         Number of heads in MSA (defaults to 4)
e_layers        Number of encoder layers, i.e. $N$ in the paper (defaults to 3)
dropout         Dropout probability (defaults to 0.2)
num_workers     The num_workers of the data loader (defaults to 0)
batch_size      Batch size for training and testing (defaults to 32)
train_epochs    Number of training epochs (defaults to 20)
patience        Early-stopping patience (defaults to 3)
learning_rate   Initial learning rate for the optimizer (defaults to 1e-4)
lradj           Learning rate adjustment schedule (defaults to type1)
itr             Number of experiment repetitions (defaults to 1)
save_pred       Whether to save the predicted results. If True, predictions are saved in the results folder as numpy arrays; this can cost a lot of time and memory for datasets with large $D$ (defaults to False)
use_gpu         Whether to use the GPU (defaults to True)
gpu             The GPU number used for training and inference (defaults to 0)
use_multi_gpu   Whether to use multiple GPUs (defaults to False)
devices         Device IDs of multiple GPUs (defaults to 0,1,2,3)
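For example, to train on ECL with a longer input window while overriding a few of the defaults above, one might run the following (the concrete values here are illustrative; see scripts/ECL.sh for the settings used in the paper):

python main_crossformer.py --data ECL --data_path ECL.csv --data_dim 321 --in_len 336 --out_len 96 --seg_len 12 --batch_size 16 --itr 1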

Citation

If you find this repository useful in your research, please cite:

@inproceedings{zhang2023crossformer,
  title={Crossformer: Transformer Utilizing Cross-Dimension Dependency for Multivariate Time Series Forecasting},
  author={Yunhao Zhang and Junchi Yan},
  booktitle={International Conference on Learning Representations},
  year={2023}
}

Acknowledgement

We appreciate the following works for their valuable code and data for time series forecasting:

https://github.com/zhouhaoyi/Informer2020

https://github.com/thuml/Autoformer

https://github.com/alipay/Pyraformer

https://github.com/MAZiqing/FEDformer

The following two Vision Transformer works also inspired our DSW embedding and HED designs:

https://github.com/google-research/vision_transformer

https://github.com/microsoft/Swin-Transformer
