• Stars
    star
    362
  • Rank 117,671 (Top 3 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created about 3 years ago
  • Updated over 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight

PWC

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation

This is the offical implementation of paper PCAN for MOTS. PCAN is also served as the baseline method in BDD100K tracking challenges at CVPR 2022.

We also present a trailer that consists of method illustrations and tracking & segmentation visualizations. Our project website contains more information: vis.xyz/pub/pcan.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021, Spotlight
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single frame predictions for the segmentation mask itself. We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both Youtube-VIS and BDD100K datasets, and shows efficacy to both one-stage and two-stage segmentation frameworks.

Prototypical Cross-Attention Networks (PCAN)

Main results

Results on BDD100K Benchmark

Detector mMOTSA-val mIDF1-val ID Sw.-val Scores-val mMOTSA-test mIDF1-test ID Sw.-test Scores-test Config Weights Preds Visuals
ResNet-50 28.1 45.4 874 scores 31.9 50.4 845 scores config model | MD5 preds visuals

Installation

Please refer to INSTALL.md for installation instructions.

Usages

Please refer to GET_STARTED.md for dataset preparation and detailed running (training, testing, visualization, etc.) instructions.

Related links

Youtube Video | Bilibili Video| Zhihu Reading

Citation

If you find PCAN useful in your research or refer to the provided baseline results, please star this repository and consider citing 📝:

@inproceedings{pcan,
  title={Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation},
  author={Ke, Lei and Li, Xia and Danelljan, Martin and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
  booktitle={Advances in Neural Information Processing Systems},
  year={2021}
}

More Repositories

1

sam-hq

Segment Anything in High Quality [NeurIPS 2023]
Python
3,689
star
2

sam-pt

SAM-PT: Extending SAM to zero-shot video segmentation with point-based tracking.
Python
970
star
3

transfiner

Mask Transfiner for High-Quality Instance Segmentation, CVPR 2022
Python
525
star
4

qd-3dt

Official implementation of Monocular Quasi-Dense 3D Object Tracking, TPAMI 2022
Python
515
star
5

qdtrack

Quasi-Dense Similarity Learning for Multiple Object Tracking, CVPR 2021 (Oral)
Python
382
star
6

MaskFreeVIS

Mask-Free Video Instance Segmentation [CVPR 2023]
Python
358
star
7

bdd100k-models

Model Zoo of BDD100K Dataset
Python
285
star
8

idisc

iDisc: Internal Discretization for Monocular Depth Estimation [CVPR 2023]
Python
279
star
9

LiDAR_snow_sim

LiDAR snowfall simulation
Python
172
star
10

r3d3

Python
144
star
11

P3Depth

Python
123
star
12

shift-dev

SHIFT Dataset DevKit - CVPR2022
Python
103
star
13

cascade-detr

[ICCV'23] Cascade-DETR: Delving into High-Quality Universal Object Detection
Python
92
star
14

tet

Implementation of Tracking Every Thing in the Wild, ECCV 2022
Python
69
star
15

TrafficBots

TrafficBots: Towards World Models for Autonomous Driving Simulation and Motion Prediction. ICRA 2023. Code is now available at https://github.com/zhejz/TrafficBots
51
star
16

nutsh

A Platform for Visual Learning from Human Feedback
TypeScript
42
star
17

vmt

Video Mask Transfiner for High-Quality Video Instance Segmentation (ECCV'2022)
Jupyter Notebook
29
star
18

spc2

Instance-Aware Predictive Navigation in Multi-Agent Environments, ICRA 2021
Python
20
star
19

CISS

Unsupervised condition-level adaptation for semantic segmentation
Python
20
star
20

shift-detection-tta

This repository implements continuous test-time adaptation algorithms for object detection on the SHIFT dataset.
Python
18
star
21

vis4d

A modular library for visual 4D scene understanding
Python
17
star
22

dla-afa

Official implementation of Dense Prediction with Attentive Feature Aggregation, WACV 2023
Python
12
star
23

soccer-player

Python
8
star
24

project-template

Python
4
star
25

vis4d_cuda_ops

Cuda
3
star
26

vis4d-template

Vis4D Template.
Shell
3
star