researchmm/LightTrack

Stars: 396
Rank: 108,801 (Top 3 %)
Language: Python
License: MIT License
Created over 3 years ago
Updated almost 3 years ago

[CVPR21] LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

The official implementation of the paper LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search.

News

We have uploaded the pre-trained weights of the SuperNets (for both ImageNet classification and object tracking) to Google Drive. Users can use them as initialization for future research on efficient object tracking.

Abstract

We present LightTrack, which uses neural architecture search (NAS) to design more lightweight and efficient object trackers. Comprehensive experiments show that LightTrack is effective: it finds trackers that achieve superior performance compared to handcrafted SOTA trackers, such as SiamRPN++ and Ocean, while using far fewer model FLOPs and parameters. Moreover, when deployed on resource-constrained mobile chipsets, the discovered trackers run much faster. For example, on the Snapdragon 845 Adreno GPU, LightTrack runs 12× faster than Ocean, while using 13× fewer parameters and 38× fewer FLOPs. Such improvements might narrow the gap between academic models and industrial deployments in the object tracking task.

Environment Installation

cd lighttrack
conda create -n lighttrack python=3.6
conda activate lighttrack
bash install.sh

Data Preparation

Tracking Benchmarks

Please put the VOT2019 dataset under $LightTrack/dataset. The prepared data should look like:

$LightTrack/dataset/VOT2019.json
$LightTrack/dataset/VOT2019/agility
$LightTrack/dataset/VOT2019/ants1
...
$LightTrack/dataset/VOT2019/list.txt
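
Before running the test script, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the function name `missing_vot2019_entries` is hypothetical, and it assumes (the README does not confirm this) that `list.txt` contains one sequence name per line.

```python
from pathlib import Path
from typing import List


def missing_vot2019_entries(lighttrack_root: str) -> List[str]:
    """Return the expected VOT2019 paths that are absent under <root>/dataset."""
    dataset = Path(lighttrack_root) / "dataset"
    vot_dir = dataset / "VOT2019"
    list_file = vot_dir / "list.txt"

    # The three entries named explicitly in the layout above.
    expected = [dataset / "VOT2019.json", vot_dir, list_file]
    missing = [str(p) for p in expected if not p.exists()]

    # Assumption: list.txt holds whitespace-separated sequence names
    # (e.g. "agility", "ants1"), each with a folder of the same name.
    if list_file.is_file():
        for name in list_file.read_text().split():
            if not (vot_dir / name).is_dir():
                missing.append(str(vot_dir / name))
    return missing
```

Calling `missing_vot2019_entries("/path/to/LightTrack")` returns an empty list when every listed path exists, and otherwise returns the paths that still need to be prepared.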

Test and evaluation

Test LightTrack-Mobile on VOT2019

bash tracking/reproduce_vot2019.sh

FLOPs, Params, and Speed

Compute the FLOPs and parameter count of our LightTrack-Mobile. The FLOPs counter we use is pytorch-OpCounter.

python tracking/FLOPs_Params.py
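
To give a feel for what such a counter adds up, here is a hand-computed sketch for single convolution layers. It does not reproduce `tracking/FLOPs_Params.py` or pytorch-OpCounter itself; the helper `conv2d_cost` and the layer shapes are illustrative assumptions, not values taken from LightTrack. It counts multiply-accumulate operations (MACs), and whether a tool reports MACs or twice that number as FLOPs is a convention worth checking when comparing figures.

```python
def conv2d_cost(c_in, c_out, kernel, h_out, w_out, groups=1, bias=False):
    """Return (params, macs) for one 2D convolution with a square kernel."""
    # Each output channel has (c_in / groups) * kernel * kernel weights.
    weights = c_out * (c_in // groups) * kernel * kernel
    params = weights + (c_out if bias else 0)
    # Every weight is used once per output position (one multiply-accumulate).
    macs = weights * h_out * w_out
    return params, macs


# Hypothetical layer: 64 -> 128 channels, 3x3 kernel, 32x32 output map.
standard = conv2d_cost(64, 128, 3, 32, 32)

# Same mapping as a depthwise 3x3 (groups = c_in) followed by a pointwise 1x1.
depthwise = conv2d_cost(64, 64, 3, 32, 32, groups=64)
pointwise = conv2d_cost(64, 128, 1, 32, 32)
separable = (depthwise[0] + pointwise[0], depthwise[1] + pointwise[1])

print("standard  params, macs:", standard)
print("separable params, macs:", separable)
```

For these made-up shapes the separable variant needs 8,768 parameters instead of 73,728, which illustrates why parameter and FLOPs counts are reported alongside accuracy for lightweight models.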

Test the running speed of our LightTrack-Mobile

python tracking/Speed.py
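
A speed test of this kind usually runs a few untimed warm-up iterations and then averages over many timed ones. The sketch below shows that pattern with a generic callable. It is not the content of `tracking/Speed.py`; the name `measure_fps` and the default iteration counts are assumptions made for illustration.

```python
import time


def measure_fps(step, warmup=10, iters=100):
    """Call step() `warmup` times untimed, then `iters` times timed.

    Returns the average number of calls per second over the timed runs.
    """
    # Warm-up: lets caches, lazy initialization, and clock scaling settle.
    for _ in range(warmup):
        step()

    start = time.perf_counter()
    for _ in range(iters):
        step()
    elapsed = time.perf_counter() - start

    # Guard against a zero reading for extremely cheap callables.
    return iters / max(elapsed, 1e-12)
```

Here `step` would wrap one forward pass of the tracker on a fixed input. When timing PyTorch code on a GPU, `step` would also need to call `torch.cuda.synchronize()` so that queued kernels finish before the clock is read.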
