STUD
This is the source code accompanying the paper Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild by Xuefeng Du, Xin Wang, Gabriel Gozum, and Yixuan Li.
The codebase is heavily based on CycleConf and Detectron2.
Ads
Check out our
- ICLR'22 work VOS on object detection in still images and classification networks.
- NeurIPS'22 work SIREN on OOD detection for detection transformers.
- ICLR'23 work NPOS on non-parametric outlier synthesis.
- NeurIPS'23 work DREAM-OOD on outlier generation in the pixel space (by diffusion models) if you are interested!
Installation
Environment
- CUDA 10.2
- Python >= 3.7
- PyTorch >= 1.6
- The Detectron2 version must match the PyTorch and CUDA versions.
Dependencies
- Create a virtual env.
python3 -m pip install --user virtualenv
python3 -m venv stud
source stud/bin/activate
- Install dependencies.
pip install -r requirements.txt
- Install PyTorch 1.9.
pip3 install torch torchvision
Check out the previous PyTorch versions here.
- Install Detectron2
Build Detectron2 from source (gcc & g++ >= 5.4):
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
Or, you can install pre-built Detectron2 (example for CUDA 10.2, PyTorch 1.9):
python -m pip install detectron2 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu102/torch1.9/index.html
More details can be found here.
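After installation, a quick check can confirm that the PyTorch, CUDA, and Detectron2 builds agree. The snippet below is a minimal sketch; the printed versions depend on your setup.
# Sanity check: Detectron2 wheels are built against specific PyTorch/CUDA versions,
# so the three should agree before training.
import torch
import detectron2

print("PyTorch:", torch.__version__)               # e.g. 1.9.x
print("CUDA available:", torch.cuda.is_available())
print("CUDA (torch build):", torch.version.cuda)   # e.g. 10.2
print("Detectron2:", detectron2.__version__)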
Data Preparation
BDD100K
- Download the BDD100K MOT 2020 dataset (MOT 2020 Images and MOT 2020 Labels) and the detection labels (Detection 2020 Labels) here; the detailed description is available here. Put the BDD100K data under datasets/ in this repo. After downloading the data, the folder structure should look like this:
├── datasets
│   ├── bdd100k
│   │   ├── images
│   │   │   └── track
│   │   │       ├── train
│   │   │       ├── val
│   │   │       └── test
│   │   └── labels
│   │       ├── box_track_20
│   │       │   ├── train
│   │       │   └── val
│   │       └── det_20
│   │           ├── det_train.json
│   │           └── det_val.json
│   └── waymo
Convert the labels of the MOT 2020 data (train & val sets) into COCO format by running:
python3 datasets/bdd100k2coco.py -i datasets/bdd100k/labels/box_track_20/val/ -o datasets/bdd100k/labels/track/bdd100k_mot_val_coco.json -m track
python3 datasets/bdd100k2coco.py -i datasets/bdd100k/labels/box_track_20/train/ -o datasets/bdd100k/labels/track/bdd100k_mot_train_coco.json -m track
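To verify that the conversion produced valid COCO-format annotations, a quick inspection like the one below can help (a minimal sketch, assuming the output follows the standard COCO layout with images/annotations/categories keys):
# Quick sanity check on a converted annotation file.
import json

with open("datasets/bdd100k/labels/track/bdd100k_mot_val_coco.json") as f:
    coco = json.load(f)

print("images:", len(coco["images"]))
print("annotations:", len(coco["annotations"]))
print("categories:", [c["name"] for c in coco["categories"]])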
COCO
Download the COCO2017 dataset from the official website.
Download the OOD dataset (json file) when the in-distribution dataset is Youtube-VIS from here.
Download the OOD dataset (json file) when the in-distribution dataset is BDD100k from here.
Put the two processed OOD json files into the annotations folder of the COCO dataset (see the structure below).
The COCO dataset folder should have the following structure:
├── datasets
│   └── coco2017
│       ├── annotations
│       │   ├── xxx (the original json files)
│       │   ├── instances_val2017_ood_wrt_bdd.json
│       │   └── instances_val2017_ood_wrt_vis.json
│       ├── train2017
│       └── val2017
Youtube-VIS
Download the dataset from the official website.
Preprocess the dataset to generate the training and validation splits by running:
python datasets/convert_vis_val.py
The Youtube-VIS dataset folder should have the following structure:
├── datasets
│   └── vis
│       └── train
│           ├── JPEGImages
│           ├── instances_train.json
│           └── instances_val.json
nuImages
Download the dataset from the official website.
Convert the dataset by running:
python datasets/convert_nu.py
python datasets/convert_nu_ood.py
The nuImages dataset folder should have the following structure:
├── datasets
│   └── nuscence
│       ├── v1.0-mini
│       ├── v1.0-test
│       ├── v1.0-val
│       ├── v1.0-train
│       ├── samples
│       ├── semantic_masks
│       ├── calibrated
│       ├── nuimages_v1.0-val.json
│       └── nu_ood.json
Before training, modify the dataset paths in ./src/data/builtin.py to match the locations of the datasets on your machine.
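For reference, Detectron2 datasets are usually registered by pointing a dataset name at a COCO-style json file and an image root; the entries in ./src/data/builtin.py follow this general pattern. The snippet below is a hedged sketch with placeholder names and paths, not the file's exact contents.
# Sketch of COCO-style dataset registration in Detectron2; adjust the name,
# annotation file, and image root to your local layout.
from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "bdd_mot_train",                                               # placeholder dataset name
    {},                                                            # extra metadata
    "datasets/bdd100k/labels/track/bdd100k_mot_train_coco.json",   # converted annotations
    "datasets/bdd100k/images/track/train",                         # image root
)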
Training
Vanilla with BDD100K as the in-distribution dataset
python -m tools.train_net --config-file ./configs/BDD100k/R50_FPN_all.yaml --num-gpus 4
Vanilla with Youtube-VIS as the in-distribution dataset
python -m tools.train_net --config-file ./configs/VIS/R50_FPN_all.yaml --num-gpus 4
STUD on ResNet (BDD as ID data)
python -m tools.train_net --config-file ./configs/BDD100k/stud_resnet.yaml --num-gpus 4
STUD on RegNet (BDD as ID data)
python -m tools.train_net --config-file ./configs/BDD100k/stud_regnet.yaml --num-gpus 4
Download the pretrained backbone for RegNetX from here.
Pretrained models
The pretrained models for BDD100K can be downloaded from vanilla and STUD-ResNet and STUD-RegNet.
The pretrained models for Youtube-VIS can be downloaded from vanilla and STUD-ResNet and STUD-RegNet.
Evaluation
Evaluation with BDD100K as the in-distribution dataset
First, run on the in-distribution dataset:
python -m tools.train_net --config-file ./configs/BDD100k/stud_resnet.yaml --num-gpus 4 --eval-only MODEL.WEIGHTS address/model_final.pth
where "address" is specified in the corresponding yaml file.
Then run on the OOD dataset (COCO):
python -m tools.train_net --config-file ./configs/BDD100k/stud_resnet_ood_coco.yaml --num-gpus 4 --eval-only MODEL.WEIGHTS address/model_final.pth
Obtain the metrics using:
python bdd_coco.py --energy 1 --model xxx
Here "--model" means the name of the directory that contains the checkpoint file. Evaluation on nuImages is similar.
Citation
If you find any part of this code useful in your research, please consider citing our paper:
@article{du2022stud,
title={Unknown-Aware Object Detection: Learning What You Don't Know from Videos in the Wild},
author={Du, Xuefeng and Wang, Xin and Gozum, Gabriel and Li, Yixuan},
journal={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2022}
}