  • Stars: 112
  • Rank: 312,240 (Top 7%)
  • Language: Python
  • License: MIT License
  • Created about 3 years ago
  • Updated 9 months ago

Repository Details

Implementation for Learning to Track with Object Permanence

Learning to Track with Object Permanence

A video-based MOT approach capable of tracking through full occlusions:

Learning to Track with Object Permanence,
Pavel Tokmakov, Jie Li, Wolfram Burgard, Adrien Gaidon,
arXiv technical report (arXiv 2103.14258)

@inproceedings{tokmakov2021learning,
  title={Learning to Track with Object Permanence},
  author={Tokmakov, Pavel and Li, Jie and Burgard, Wolfram and Gaidon, Adrien},
  booktitle={ICCV},
  year={2021}
}

Check out our self-supervised extension, published at ICML'22:

Object Permanence Emerges in a Random Walk along Memory,
Pavel Tokmakov, Allan Jabri, Jie Li, Adrien Gaidon,
arXiv technical report (arXiv 2204.01784)

@inproceedings{tokmakov2022object,
  title={Object Permanence Emerges in a Random Walk along Memory},
  author={Tokmakov, Pavel and Jabri, Allan and Li, Jie and Gaidon, Adrien},
  booktitle={ICML},
  year={2022}
}

Abstract

Tracking by detection, the dominant approach for online multi-object tracking, alternates between localization and association steps. As a result, it strongly depends on the quality of instantaneous observations, often failing when objects are not fully visible. In contrast, tracking in humans is underlined by the notion of object permanence: once an object is recognized, we are aware of its physical existence and can approximately localize it even under full occlusions. In this work, we introduce an end-to-end trainable approach for joint object detection and tracking that is capable of such reasoning. We build on top of the recent CenterTrack architecture, which takes pairs of frames as input, and extend it to videos of arbitrary length. To this end, we augment the model with a spatio-temporal, recurrent memory module, allowing it to reason about object locations and identities in the current frame using all the previous history. It is, however, not obvious how to train such an approach. We study this question on a new, large-scale, synthetic dataset for multi-object tracking, which provides ground truth annotations for invisible objects, and propose several approaches for supervising tracking behind occlusions. Our model, trained jointly on synthetic and real data, outperforms the state of the art on KITTI and MOT17 datasets thanks to its robustness to occlusions.
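To make the idea of a spatio-temporal, recurrent memory more concrete, here is a minimal sketch of a convolutional GRU update in pure Python. It is not the repository's implementation: it assumes a single-channel grid, 3x3 kernels, no biases, and random untrained weights, and the names `ConvGRUCell`, `conv3x3`, `combine`, and `run_sequence` are hypothetical. The gate convention used here (new state = (1 - z) * old state + z * candidate) is one of two common variants.

```python
import math
import random


def conv3x3(grid, kernel):
    """Zero-padded 3x3 cross-correlation over a 2D grid (list of rows)."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += grid[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def combine(fn, *grids):
    """Apply fn elementwise across same-sized grids."""
    return [[fn(*cells) for cells in zip(*rows)] for rows in zip(*grids)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


class ConvGRUCell:
    """Hypothetical single-channel ConvGRU cell with random, untrained kernels."""

    def __init__(self, seed=0):
        rng = random.Random(seed)

        def kernel():
            return [[rng.uniform(-0.5, 0.5) for _ in range(3)] for _ in range(3)]

        # One input kernel (w*) and one state kernel (u*) per gate.
        self.wz, self.uz = kernel(), kernel()  # update gate
        self.wr, self.ur = kernel(), kernel()  # reset gate
        self.wn, self.un = kernel(), kernel()  # candidate state

    def step(self, frame, state):
        """Return the next memory state given the current frame and old state."""
        z = combine(lambda a, b: sigmoid(a + b),
                    conv3x3(frame, self.wz), conv3x3(state, self.uz))
        r = combine(lambda a, b: sigmoid(a + b),
                    conv3x3(frame, self.wr), conv3x3(state, self.ur))
        gated = combine(lambda a, b: a * b, r, state)
        n = combine(lambda a, b: math.tanh(a + b),
                    conv3x3(frame, self.wn), conv3x3(gated, self.un))
        # Blend old state and candidate: h_t = (1 - z) * h_{t-1} + z * n.
        return combine(lambda zi, hi, ni: (1.0 - zi) * hi + zi * ni, z, state, n)


def run_sequence(cell, frames):
    """Fold a list of equally sized frames into one memory state."""
    h, w = len(frames[0]), len(frames[0][0])
    state = [[0.0] * w for _ in range(h)]
    for frame in frames:
        state = cell.step(frame, state)
    return state


# Toy usage: one "visible" frame followed by two empty frames.
empty = [[0.0] * 5 for _ in range(4)]
visible = [row[:] for row in empty]
visible[1][2] = 1.0
memory = run_sequence(ConvGRUCell(), [visible, empty, empty])
```

In this toy run the memory state remains non-zero after the two empty frames, which is the mechanical property that lets a recurrent model carry evidence about an object across frames where it is not observed. With untrained weights the state has no semantic meaning; in the paper the behavior behind occlusions is learned through supervision.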

Installation

Please refer to INSTALL.md for installation instructions.

Benchmark Evaluation and Training

After installation, follow the instructions in DATA.md to set up the datasets. Then check GETTING_STARTED.md to reproduce the results in the paper. Scripts for all the experiments are provided in the experiments folder.

License

PermaTrack is built on CenterTrack. Both codebases are released under the MIT License. Some of the CenterTrack code comes from third parties under different licenses; please check the CenterTrack repo for details. In addition, this repo uses py-motmetrics for MOT evaluation, nuscenes-devkit for nuScenes evaluation and preprocessing, and the TAO codebase for computing Track AP. The ConvGRU implementation is adopted from this repo. See NOTICE for details. Please note the license of each dataset: most of the datasets used in this project are under non-commercial licenses.
