TrackR-CNN baseline method for Multi-Object Tracking and Segmentation (MOTS)

TrackR-CNN

Code for the TrackR-CNN baseline for the Multi Object Tracking and Segmentation (MOTS) task.

Project website (including annotations)

https://www.vision.rwth-aachen.de/page/mots

Paper

MOTS: Multi-Object Tracking and Segmentation

Paul Voigtlaender, Michael Krause, Aljosa Osep, Jonathon Luiten, Berin Balachandar Gnana Sekar, Andreas Geiger and Bastian Leibe

https://www.vision.rwth-aachen.de/media/papers/mots-multi-object-tracking-and-segmentation/MOTS.pdf

mots_tools for evaluating results

https://github.com/VisualComputingInstitute/mots_tools

Running this code

Setup

You'll need to install the following packages (possibly more):

tensorflow-gpu pycocotools numpy scipy scikit-learn pypng opencv-python munkres

In particular, the code has been tested with Python 3.6.7 and TensorFlow 1.13.1 running on a single GTX 1080 Ti GPU. While there is experimental support for multi-GPU training through the "gpus" config flag, some users have reported problems with it, so we recommend using a single GPU.

Furthermore, you'll need the KITTI MOTS dataset. We assume you have a folder /path/to/kitti_mots with two subfolders: /path/to/kitti_mots/images, containing the input images (i.e. there exist subfolders /path/to/kitti_mots/images/0000, /path/to/kitti_mots/images/0001, ...), and /path/to/kitti_mots/instances, containing the annotations (again with subfolders 0000, 0001, ...).

Also, create the following directories for logs, model files etc. in the base directory of the repository:

mkdir forwarded models summaries logs

Pre-Trained Models

Pre-trained models can be downloaded here: https://omnomnom.vision.rwth-aachen.de/data/trackrcnn/

Folder structure and config flags

In the configuration files, you'll need to adjust the KITTI_segtrack_data_dir and load_init flags to point to the KITTI MOTS data directory and the path to the pretrained model, respectively. Logs, checkpoints and summaries are stored in the logs/, models/ and summaries/ subdirectories.

So all in all, your folder structure should look like this:

data/
- KITTI_MOTS/
-- train/
--- images/
---- 0000/
----- 000000.png
----- 000001.png
----- ...
---- 0001/
---- ...
--- instances/
---- 0000/
----- 000000.png
----- 000001.png
----- ...
---- 0001/
---- ...
models/
- conv3d_sep2/
-- conv3d_sep2-00000005.data-00000-of-00001
-- conv3d_sep2-00000005.index
-- conv3d_sep2-00000005.meta
- converted.data-00000-of-00001
- converted.meta
- converted.index 
...
main.py

So point KITTI_segtrack_data_dir to data/KITTI_MOTS/train/ and load_init to models/converted.
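Before training, it can help to confirm that the data directory matches the layout above. The following is a minimal sketch, not part of the repository: the function name check_kitti_mots_layout is hypothetical, and it assumes that every annotation PNG under instances/ should have an image of the same name under images/.

```python
from pathlib import Path


def check_kitti_mots_layout(data_dir):
    """Return a list of human-readable problems found under data_dir.

    Assumes the layout shown above: data_dir/images/<seq>/*.png and
    data_dir/instances/<seq>/*.png (hypothetical helper, not repository code).
    """
    root = Path(data_dir)
    problems = []
    for sub in ("images", "instances"):
        if not (root / sub).is_dir():
            problems.append(f"missing directory: {root / sub}")
    if problems:
        return problems

    image_seqs = {p.name for p in (root / "images").iterdir() if p.is_dir()}
    instance_seqs = {p.name for p in (root / "instances").iterdir() if p.is_dir()}
    for seq in sorted(image_seqs - instance_seqs):
        problems.append(f"sequence {seq} has images but no instances")
    for seq in sorted(instance_seqs - image_seqs):
        problems.append(f"sequence {seq} has instances but no images")

    # For sequences present on both sides, flag annotations without an image.
    for seq in sorted(image_seqs & instance_seqs):
        frames = {p.name for p in (root / "images" / seq).glob("*.png")}
        masks = {p.name for p in (root / "instances" / seq).glob("*.png")}
        orphaned = masks - frames
        if orphaned:
            problems.append(
                f"sequence {seq}: {len(orphaned)} annotation file(s) without an image"
            )
    return problems


if __name__ == "__main__":
    # Same directory that KITTI_segtrack_data_dir points to in the example above.
    for problem in check_kitti_mots_layout("data/KITTI_MOTS/train"):
        print(problem)
```

An empty result only means the folder names line up; it does not validate the image or annotation contents.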

Training

In order to train a model, run main.py with the corresponding configuration file. For the baseline model with two separable 3D convolutions and data association with learned embeddings, use

python main.py configs/conv3d_sep2

Forwarding and tracking

Either first train your own model as described above, or download our model and extract the files into models/conv3d_sep2/.

To obtain the model's predictions (we call this "forwarding") run:

python main.py configs/conv3d_sep2 "{\"task\":\"forward_tracking\",\"dataset\":\"KITTI_segtrack_feed\",\"load_epoch_no\":5,\"batch_size\":5,\"export_detections\":true,\"do_tracking\":false,\"video_tags_to_load\":[\"0002\",\"0006\",\"0007\",\"0008\",\"0010\",\"0013\",\"0014\",\"0016\",\"0018\",\"0000\",\"0001\",\"0003\",\"0004\",\"0005\",\"0009\",\"0011\",\"0012\",\"0015\",\"0017\",\"0019\",\"0020\"]}"

The JSON string supplied as an additional argument overrides the corresponding settings in the config file. Use video_tags_to_load to obtain predictions for specific sequences (in the example, all KITTI MOTS sequences are chosen). Output is written to the forwarded/ subdirectory.
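Escaping the JSON argument by hand is error-prone. As a sketch, the same forwarding command can be assembled in Python with json.dumps, which takes care of the quoting. The helper name build_forward_command is hypothetical; the keys and values are the ones from the command above, and the sketch assumes the order of the video tags does not matter.

```python
import json
import shlex


def build_forward_command(config_path, video_tags, load_epoch_no=5, batch_size=5):
    """Build the argument list for the forwarding step (hypothetical helper)."""
    overrides = {
        "task": "forward_tracking",
        "dataset": "KITTI_segtrack_feed",
        "load_epoch_no": load_epoch_no,
        "batch_size": batch_size,
        "export_detections": True,
        "do_tracking": False,
        "video_tags_to_load": list(video_tags),
    }
    # json.dumps emits lowercase true/false, matching the hand-written string.
    return ["python", "main.py", config_path, json.dumps(overrides)]


if __name__ == "__main__":
    # Tags "0000" to "0020", i.e. the 21 sequences listed in the example.
    tags = [f"{i:04d}" for i in range(21)]
    cmd = build_forward_command("configs/conv3d_sep2", tags)
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(part) for part in cmd))
    # To launch it directly instead: subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids the shell entirely, so no backslash escaping is needed.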

The model predictions as obtained by the previous command are not yet linked over time. You can use the following command to run the tracking algorithm described in the paper and to obtain final results in the forwarded/ subdirectory which can be processed by the mots_tools scripts:

python main.py configs/conv3d_sep2 "{\"build_networks\":false,\"import_detections\":true,\"task\":\"forward_tracking\",\"dataset\":\"KITTI_segtrack_feed\",\"do_tracking\":true,\"visualize_detections\":false,\"visualize_tracks\":false,\"load_epoch_no\":5,\"video_tags_to_load\":[\"0002\",\"0006\",\"0007\",\"0008\",\"0010\",\"0013\",\"0014\",\"0016\",\"0018\"]}"

You can also visualize the tracking results by setting visualize_tracks to true in this command.

Tuning

The script for random tuning finds the best combination of tracking parameters on the training set and then evaluates these parameters on the validation set. This is how the results in the MOTS paper were obtained.

To use this script, run

python scripts/eval/segtrack_tune_experiment.py /path/to/detections/ /path/to/groundtruth/ /path/to/precomputed_optical_flow/ /path/to/output_file /path/to/tmp_folder/ /path/to/mots_eval/ association_type num_iterations

where the arguments are as follows.

/path/to/detections/ is a folder containing the model output on the training set (obtained by the forwarding command above).

/path/to/groundtruth/ refers to the instances or instances_txt folder containing the annotations (which you can download from the project website).

/path/to/precomputed_optical_flow/ refers to optical flow for the input images, which is used only for a few of the ablation experiments in the paper, namely when association_type is set to mask or bbox_iou. For any other association_type the flow path is ignored, so it can usually be set to a dummy folder.

/path/to/output_file is where a file containing the results of the individual tuning iterations will be created; please make sure this path is writable.

/path/to/tmp_folder/ is where a large number of intermediate folders will be stored; you can delete these afterwards.

/path/to/mots_eval/ refers to the official evaluation script (mots_tools, linked above; clone the repository and supply the path here).

association_type determines the method for associating detections into tracks and is one of reid (using the association embeddings; use this if unsure!), mask (using mask warping), bbox_iou (using bounding box warping with median optical flow) or bbox_center (nearest neighbor matching).

num_iterations is the number of random trials (1000 in the paper).
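The idea behind random tuning (sample parameter combinations, keep the one that scores best on the training set, then score that one on the validation set) can be sketched as follows. This is an illustration of the general technique under stated assumptions, not the code in segtrack_tune_experiment.py: the parameter names, the ranges, and both scoring functions are hypothetical stand-ins for running the tracker and the mots_tools evaluation.

```python
import random


def random_search(param_ranges, score_fn, num_iterations, seed=0):
    """Sample parameters uniformly at random and keep the best-scoring set."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(num_iterations):
        params = {
            name: rng.uniform(low, high)
            for name, (low, high) in param_ranges.items()
        }
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def tune_then_validate(param_ranges, train_score_fn, val_score_fn, num_iterations):
    """Pick parameters on the training set, then report the validation score."""
    best_params, train_score = random_search(
        param_ranges, train_score_fn, num_iterations
    )
    return best_params, train_score, val_score_fn(best_params)


if __name__ == "__main__":
    # Hypothetical parameter names and ranges, for illustration only.
    ranges = {
        "detection_confidence_threshold": (0.0, 1.0),
        "association_threshold": (0.0, 1.0),
    }

    # Toy objective standing in for a real tracking metric.
    def toy_score(params):
        return -sum((value - 0.5) ** 2 for value in params.values())

    print(tune_then_validate(ranges, toy_score, toy_score, num_iterations=1000))
```

Selecting on the training set and reporting on the validation set keeps the reported number from being inflated by the search itself.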

References

Parts of this code are based on Tensorpack (https://github.com/tensorpack/tensorpack/tree/master/examples/FasterRCNN) and RETURNN (https://github.com/rwth-i6/returnn/blob/master/Log.py).

Citation

If you use this code, please cite:

@inproceedings{Voigtlaender19CVPR_MOTS,
 author = {Paul Voigtlaender and Michael Krause and Aljosa Osep and Jonathon Luiten and Berin Balachandar Gnana Sekar and Andreas Geiger and Bastian Leibe},
 title = {{MOTS}: Multi-Object Tracking and Segmentation},
 booktitle = {CVPR},
 year = {2019},
}

License

MIT License

Contact

If you find a problem in the code, please open an issue.

For general questions, please contact Paul Voigtlaender or Michael Krause.
