  • Language: Python
  • License: MIT License

[ICCV'21] Learning Spatio-Temporal Transformer for Visual Tracking

STARK

The official implementation of the ICCV2021 paper Learning Spatio-Temporal Transformer for Visual Tracking

Hiring research interns for visual transformer projects: [email protected]

News

  • STARK has been integrated into the mmtracking library!
  • 🏆 We won the VOT-21 RGB-D challenge.
  • 🏆 We were runners-up in the VOT-21 Real-Time and Long-term challenges.
  • We have released an extremely fast version of STARK called STARK-Lightning ⚡. It runs at 200~300 FPS on an RTX TITAN GPU. Its performance also beats DiMP50, while its model size is even smaller than that of SiamFC! More details can be found in STARK_Lightning_En.md / the Chinese tutorial.
  • The raw results of STARK and other trackers on NOTU (NFS, OTB100, TC128, UAV123) have been uploaded here.

[Figure: STARK framework]

Highlights

End-to-End, Post-processing Free

STARK is an end-to-end tracking approach that directly predicts one accurate bounding box as the tracking result.
In addition, STARK does not use any hyperparameter-sensitive post-processing, which leads to stable performance.

Real-Time Speed

STARK-ST50 and STARK-ST101 run at 40 FPS and 30 FPS, respectively, on a Tesla V100 GPU.

Strong performance

Tracker      LaSOT (AUC)  GOT-10K (AO)  TrackingNet (AUC)
STARK        67.1         68.8          82.0
TransT       64.9         67.1          81.4
TrDiMP       63.7         67.1          78.4
Siam R-CNN   64.8         64.9          81.2
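
For readers unfamiliar with the column headers: AO (average overlap) is the mean intersection-over-union between predicted and ground-truth boxes, and AUC is the area under the success plot, i.e. the fraction of frames whose overlap exceeds a threshold, averaged over a sweep of thresholds. The sketch below is a simplified illustration and not the code of the official benchmark toolkits, which differ in details. The function names, the (x, y, w, h) box format, and the 21-point threshold grid are assumptions made for this example.

```python
from typing import List, Sequence

Box = Sequence[float]  # assumed box format: (x, y, w, h)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    iw = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def _overlaps(preds: List[Box], gts: List[Box]) -> List[float]:
    if not preds or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and of equal length")
    return [iou(p, g) for p, g in zip(preds, gts)]


def average_overlap(preds: List[Box], gts: List[Box]) -> float:
    """AO: mean per-frame IoU."""
    overlaps = _overlaps(preds, gts)
    return sum(overlaps) / len(overlaps)


def success_auc(preds: List[Box], gts: List[Box], n_thresholds: int = 21) -> float:
    """AUC of the success plot over evenly spaced thresholds in [0, 1]."""
    overlaps = _overlaps(preds, gts)
    thresholds = [i / (n_thresholds - 1) for i in range(n_thresholds)]
    success = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(success) / len(success)
```

The table reports these quantities as percentages, so a value returned by these functions would be multiplied by 100 before comparison.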

Purely PyTorch-based Code

STARK is implemented purely in PyTorch.

Install the environment

Option 1: Use Anaconda

conda create -n stark python=3.6
conda activate stark
bash install_pytorch17.sh

Option 2: Use the Docker file

We provide the complete Docker setup here.

Data Preparation

Put the tracking datasets in ./data. It should look like:

${STARK_ROOT}
 -- data
     -- lasot
         |-- airplane
         |-- basketball
         |-- bear
         ...
     -- got10k
         |-- test
         |-- train
         |-- val
     -- coco
         |-- annotations
         |-- images
     -- trackingnet
         |-- TRAIN_0
         |-- TRAIN_1
         ...
         |-- TRAIN_11
         |-- TEST
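
Before training, it can help to confirm that the data folder matches the layout above. The following is a minimal sketch that is not part of the STARK code base; `EXPECTED_LAYOUT` and `missing_dirs` are hypothetical names, and the check only covers the sub-folders listed explicitly in the tree (LaSOT class folders are not enumerated).

```python
from pathlib import Path
from typing import Dict, List

# Sub-folders named explicitly in the layout above (hypothetical helper table).
EXPECTED_LAYOUT: Dict[str, List[str]] = {
    "lasot": [],  # one folder per object class (airplane, basketball, ...)
    "got10k": ["test", "train", "val"],
    "coco": ["annotations", "images"],
    "trackingnet": [f"TRAIN_{i}" for i in range(12)] + ["TEST"],
}


def missing_dirs(data_root: str) -> List[str]:
    """Return the expected directories that do not exist under data_root."""
    root = Path(data_root)
    missing: List[str] = []
    for dataset, subdirs in EXPECTED_LAYOUT.items():
        if not (root / dataset).is_dir():
            missing.append(dataset)
            continue  # no point checking sub-folders of a missing dataset
        for sub in subdirs:
            if not (root / dataset / sub).is_dir():
                missing.append(f"{dataset}/{sub}")
    return missing
```

Calling `missing_dirs("./data")` from the project root returns an empty list when every listed folder is present.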

Set project paths

Run the following command to set the paths for this project:

python tracking/create_default_local_file.py --workspace_dir . --data_dir ./data --save_dir .

After running this command, you can also modify the paths by editing these two files:

lib/train/admin/local.py  # paths about training
lib/test/evaluation/local.py  # paths about testing

Train STARK

Training with multiple GPUs using DDP

# STARK-S50
python tracking/train.py --script stark_s --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-S50
# STARK-ST50
python tracking/train.py --script stark_st1 --config baseline --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST50 Stage1
python tracking/train.py --script stark_st2 --config baseline --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline  # STARK-ST50 Stage2
# STARK-ST101
python tracking/train.py --script stark_st1 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8  # STARK-ST101 Stage1
python tracking/train.py --script stark_st2 --config baseline_R101 --save_dir . --mode multiple --nproc_per_node 8 --script_prv stark_st1 --config_prv baseline_R101  # STARK-ST101 Stage2

(Optional) Debug training with a single GPU

python tracking/train.py --script stark_s --config baseline --save_dir . --mode single
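
The commands above follow a regular pattern: STARK-ST is trained in two stages, and the Stage 2 command names the Stage 1 run through `--script_prv` and `--config_prv`. The sketch below only assembles those command lines from the flags shown above; `train_command` and `stark_st_plan` are hypothetical helper names and do not exist in the repository.

```python
from typing import List, Optional


def train_command(script: str, config: str, save_dir: str = ".",
                  nproc_per_node: Optional[int] = 8,
                  script_prv: Optional[str] = None,
                  config_prv: Optional[str] = None) -> List[str]:
    """Build a tracking/train.py command line from the flags listed above."""
    cmd = ["python", "tracking/train.py", "--script", script,
           "--config", config, "--save_dir", save_dir]
    if nproc_per_node is None:
        cmd += ["--mode", "single"]  # single-GPU debugging run
    else:
        cmd += ["--mode", "multiple", "--nproc_per_node", str(nproc_per_node)]
    if script_prv is not None:
        # Stage 2 refers back to the run it continues from.
        cmd += ["--script_prv", script_prv, "--config_prv", config_prv or config]
    return cmd


def stark_st_plan(config: str = "baseline", nproc_per_node: int = 8) -> List[List[str]]:
    """Return the Stage 1 and Stage 2 commands for STARK-ST, in order."""
    stage1 = train_command("stark_st1", config, nproc_per_node=nproc_per_node)
    stage2 = train_command("stark_st2", config, nproc_per_node=nproc_per_node,
                           script_prv="stark_st1", config_prv=config)
    return [stage1, stage2]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` so that Stage 2 starts only if Stage 1 exits successfully.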

Test and evaluate STARK on benchmarks

  • LaSOT
python tracking/test.py stark_st baseline --dataset lasot --threads 32
python tracking/analysis_results.py # need to modify tracker configs and names
  • GOT10K-test
python tracking/test.py stark_st baseline_got10k_only --dataset got10k_test --threads 32
python lib/test/utils/transform_got10k.py --tracker_name stark_st --cfg_name baseline_got10k_only
  • TrackingNet
python tracking/test.py stark_st baseline --dataset trackingnet --threads 32
python lib/test/utils/transform_trackingnet.py --tracker_name stark_st --cfg_name baseline
  • VOT2020
    Before evaluating "STARK+AR" on VOT2020, please install some extra packages following external/AR/README.md
cd external/vot20/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh
  • VOT2020-LT
cd external/vot20_lt/<workspace_dir>
export PYTHONPATH=<path to the stark project>:$PYTHONPATH
bash exp.sh
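
For the three benchmarks that are run through `tracking/test.py`, each evaluation is a test run followed by one post-processing command. The table-driven sketch below collects those pairs in one place; it reuses only the commands listed above, and `BENCHMARKS` and `eval_commands` are hypothetical names introduced for illustration. The VOT benchmarks are left out because they are driven by `exp.sh` inside a workspace directory.

```python
from typing import Dict, List, Tuple

# benchmark -> (config name, --dataset value, post-processing command template)
BENCHMARKS: Dict[str, Tuple[str, str, str]] = {
    "LaSOT": ("baseline", "lasot",
              "python tracking/analysis_results.py"),
    "GOT10K-test": ("baseline_got10k_only", "got10k_test",
                    "python lib/test/utils/transform_got10k.py "
                    "--tracker_name {tracker} --cfg_name {config}"),
    "TrackingNet": ("baseline", "trackingnet",
                    "python lib/test/utils/transform_trackingnet.py "
                    "--tracker_name {tracker} --cfg_name {config}"),
}


def eval_commands(benchmark: str, tracker: str = "stark_st",
                  threads: int = 32) -> List[str]:
    """Return [test command, post-processing command] for one benchmark."""
    config, dataset, post = BENCHMARKS[benchmark]
    run = (f"python tracking/test.py {tracker} {config} "
           f"--dataset {dataset} --threads {threads}")
    return [run, post.format(tracker=tracker, config=config)]
```

Note that `tracking/analysis_results.py` still needs its tracker configs and names edited by hand before it is run, as stated in the LaSOT entry.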

Test FLOPs, Params, and Speed

# Profiling STARK-S50 model
python tracking/profile_model.py --script stark_s --config baseline
# Profiling STARK-ST50 model
python tracking/profile_model.py --script stark_st2 --config baseline
# Profiling STARK-ST101 model
python tracking/profile_model.py --script stark_st2 --config baseline_R101
# Profiling STARK-Lightning-X-trt
python tracking/profile_model_lightning_X_trt.py
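
As a rough illustration of how a speed figure in frames per second is usually obtained, the sketch below times a per-frame callable after a warm-up phase. It is not the repository's `profile_model.py`; `measure_fps` and its parameters are assumptions for this example. When timing a GPU model, the callable should wait for the device to finish (for example with `torch.cuda.synchronize()`) before returning, because CUDA kernels are launched asynchronously.

```python
import time
from typing import Callable


def measure_fps(step: Callable[[], object], warmup: int = 10, iters: int = 100) -> float:
    """Time `iters` calls of `step` after `warmup` untimed calls; return calls per second."""
    for _ in range(warmup):
        step()  # warm-up calls are excluded from the measurement
    start = time.perf_counter()
    for _ in range(iters):
        step()
    elapsed = time.perf_counter() - start
    return iters / max(elapsed, 1e-12)  # guard against a zero-length interval
```

Here `step` would wrap one forward pass of the tracker on a fixed input, so that the result reflects model speed rather than data loading.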

Model Zoo

The trained models, the training logs, and the raw tracking results are provided in the model zoo

Acknowledgments
