

LOST

PyTorch implementation of the unsupervised object discovery method LOST. More details can be found in the paper:

Localizing Objects with Self-Supervised Transformers and no Labels, BMVC 2021 [arXiv]
by Oriane Siméoni, Gilles Puy, Huy V. Vo, Simon Roburin, Spyros Gidaris, Andrei Bursuc, Patrick Pérez, Renaud Marlet and Jean Ponce

[Figure: LOST visualizations]


If you use the LOST code or framework in your research, please consider citing:

@inproceedings{LOST,
   title = {Localizing Objects with Self-Supervised Transformers and no Labels},
   author = {Oriane Sim\'eoni and Gilles Puy and Huy V. Vo and Simon Roburin and Spyros Gidaris and Andrei Bursuc and Patrick P\'erez and Renaud Marlet and Jean Ponce},
   booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
   month = {November},
   year = {2021}
}

Content

  • LOST
  • Towards unsupervised object detection

Installation of LOST

Dependencies

This code was implemented with Python 3.7, PyTorch 1.7.1 and CUDA 10.2. Please install PyTorch. In order to install the additional dependencies, please launch the following command:

pip install -r requirements.txt

Install DINO

This method is based on the DINO paper. The DINO framework can be installed using the following commands:

git clone https://github.com/facebookresearch/dino.git
cd dino; 
touch __init__.py
echo -e "import sys\nfrom os.path import dirname, join\nsys.path.insert(0, join(dirname(__file__), '.'))" >> __init__.py; cd ../;

The code was made using commit ba9edd1 of the DINO repo (please check out that commit if the latest version causes breakage).
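
Once this is done, the DINO code can be imported as a regular Python package from the LOST directory. As a quick sanity check, here is a minimal sketch of loading a DINO-pretrained ViT-S/16 backbone (it assumes the vision_transformer module and checkpoint URL of the DINO repo; LOST's own model-loading code may differ):

import torch
from dino import vision_transformer as vits  # the DINO repo made importable above

# Build a ViT-S/16 backbone and load the official DINO pre-trained weights.
model = vits.vit_small(patch_size=16, num_classes=0)
url = "https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain.pth"
state_dict = torch.hub.load_state_dict_from_url(url, map_location="cpu")
model.load_state_dict(state_dict, strict=True)
model.eval()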

Apply LOST to one image

Following are scripts to apply LOST to an image defined via the image_path parameter and to visualize the predictions (pred), the maps of Figure 2 in the paper (fms) and the visualization of the seed expansion (seed_expansion). Box predictions are also stored in the output directory given by the parameter output_dir.

python main_lost.py --image_path examples/VOC07_000236.jpg --visualize pred
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize fms
python main_lost.py --image_path examples/VOC07_000236.jpg --visualize seed_expansion
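
For intuition, here is a minimal NumPy sketch of the seed-selection step at the core of LOST: patches are compared through their transformer features, and the seed is chosen as the patch with the fewest positive correlations, under the assumption that an object covers fewer patches than the background. All names and shapes below are illustrative, not the repository's actual API:

import numpy as np

def select_seed(features):
    # features: (num_patches, dim) array of patch features extracted
    # from the last attention layer of the DINO-pretrained backbone.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = f @ f.T                    # patch-to-patch similarity matrix
    degrees = (sims > 0).sum(axis=1)  # positive correlations per patch
    seed = int(degrees.argmin())      # seed: patch with the lowest degree
    # Simplified seed expansion: patches positively correlated with the seed.
    expansion = np.flatnonzero(sims[seed] > 0)
    return seed, expansion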

Launching LOST on datasets

Following are the different steps to reproduce the results of LOST presented in the paper.

PASCAL-VOC

Please download the PASCAL VOC07 and PASCAL VOC12 datasets (link) and put the data in the folder datasets. There should then be two subfolders: datasets/VOC2007 and datasets/VOC2012. In order to apply LOST and compute corloc results (VOC07 61.9, VOC12 64.0), please launch:

python main_lost.py --dataset VOC07 --set trainval
python main_lost.py --dataset VOC12 --set trainval
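
For reference, corloc (correct localization) counts an image as correctly localized when the predicted box overlaps at least one ground-truth box of that image with IoU >= 0.5. A minimal sketch of the metric (illustrative, not the repository's evaluation code):

def iou(a, b):
    # Boxes given as (x1, y1, x2, y2).
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def corloc(predictions, ground_truths):
    # predictions: one box per image; ground_truths: list of boxes per image.
    hits = sum(any(iou(p, gt) >= 0.5 for gt in gts)
               for p, gts in zip(predictions, ground_truths))
    return hits / len(predictions)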

COCO

Please download the COCO dataset and put the data in datasets/COCO. Following previous work, results are provided with the 2014 annotations. The following command line allows you to get results on the 20k-image subset of the COCO dataset (corloc 50.7) used in previous literature. Note that the 20k images are a subset of the train set.

python main_lost.py --dataset COCO20k --set train

Different models

We have tested the method on different setups of the ViT model; corloc results are presented in the following table (more can be found in the paper).

arch       pre-training   VOC07   VOC12   COCO20k
ViT-S/16   DINO           61.9    64.0    50.7
ViT-S/8    DINO           55.5    57.0    49.5
ViT-B/16   DINO           60.1    63.3    50.0
ResNet50   DINO           36.8    42.7    26.5
ResNet50   ImageNet       33.5    39.1    25.5


The above results on the dataset VOC07 can be obtained by launching:

python main_lost.py --dataset VOC07 --set trainval                           # ViT-S/16
python main_lost.py --dataset VOC07 --set trainval --patch_size 8            # ViT-S/8
python main_lost.py --dataset VOC07 --set trainval --arch vit_base           # ViT-B/16
python main_lost.py --dataset VOC07 --set trainval --arch resnet50           # ResNet50/DINO
python main_lost.py --dataset VOC07 --set trainval --arch resnet50_imagenet  # ResNet50/ImageNet

Towards unsupervised object detection

In this work, we additionally use LOST predictions to train object detection models without any human supervision. We explore two scenarios: class-agnostic (CAD) and (pseudo) class-aware training of object detectors (OD). The next sections present the different steps to reproduce our results.

Installation for CAD and OD trainings

We use the detectron2 framework to train a Faster R-CNN model with LOST predictions as pseudo-ground-truth. The code was developed with version v0.5 of the framework. In order to reproduce our results, please install detectron2 using the following commands. In case of failure, please refer to the detectron2 installation instructions corresponding to your version of PyTorch/CUDA.

git clone https://github.com/facebookresearch/detectron2.git
python -m pip install -e detectron2  # install the clone (v0.5); detectron2 is not distributed on PyPI

Set global variables for ease of use.

export LOST=$(pwd)
cd detectron2; export D2=$(pwd);

Then please link the LOST-specific files into the detectron2 framework, following:

ln -s $LOST/tools/*.py $D2/tools/. # Link LOST tools into D2
mkdir $D2/configs/LOST
ln -s $LOST/tools/configs/* $D2/configs/LOST/. # Link LOST configs into D2

Training a Class-Agnostic Detector (CAD) with LOST pseudo-annotations

Before launching a training, the data must be formatted to fit detectron2 and COCO styles. Following are the command lines to do this formatting for boxes predicted with LOST.

cd $D2; 

# Format DINO weights to fit detectron2
wget https://dl.fbaipublicfiles.com/dino/dino_resnet50_pretrain/dino_resnet50_pretrain.pth -P ./data # Download the model from DINO
python tools/convert_pretrained_to_detectron_format.py --input ./data/dino_resnet50_pretrain.pth --output ./data/dino_RN50_pretrain_d2_format.pkl

# Format pseudo-boxes data to fit detectron2
python tools/prepare_voc_LOST_CAD_pseudo_boxes_in_detectron2_format.py --year 2007 --pboxes $LOST/data/LOST_predictions/LOST_VOC07.pkl

# Format VOC data to fit COCO style
python tools/prepare_voc_data_in_coco_style.py --is_CAD --voc07_dir $LOST/datasets/VOC2007 --voc12_dir $LOST/datasets/VOC2012
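
For reference, converting torchvision-style ResNet weights such as the DINO checkpoint into detectron2's pickle format mainly amounts to re-keying the state dict. The sketch below is modeled on detectron2's official convert-torchvision-to-d2.py helper and may differ from the conversion script shipped with LOST:

import pickle
import torch

# Load the torchvision-style DINO checkpoint and re-key it for detectron2.
obj = torch.load("./data/dino_resnet50_pretrain.pth", map_location="cpu")
newmodel = {}
for k, v in obj.items():
    if "layer" not in k:
        k = "stem." + k                     # conv1/bn1 belong to the stem
    for t in [1, 2, 3, 4]:
        k = k.replace("layer{}".format(t), "res{}".format(t + 1))
    for t in [1, 2, 3]:
        k = k.replace("bn{}".format(t), "conv{}.norm".format(t))
    k = k.replace("downsample.0", "shortcut")
    k = k.replace("downsample.1", "shortcut.norm")
    newmodel[k] = v.numpy()

res = {"model": newmodel, "__author__": "torchvision", "matching_heuristics": True}
with open("./data/dino_RN50_pretrain_d2_format.pkl", "wb") as f:
    pickle.dump(res, f)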

The next command line allows you to launch a CAD training with 4 GPUs on the VOC2007 dataset. The batch size is set to 16; depending on your machines, 4 to 8 GPUs may be needed. Please make sure to change the value of the argument MODEL.WEIGHTS to the correct path of the DINO weights.

python tools/train_net_for_LOST_CAD.py --num-gpus 4 --config-file ./configs/LOST/RN50_DINO_FRCNN_VOC07_CAD.yaml DATALOADER.NUM_WORKERS 8 OUTPUT_DIR ./outputs/RN50_DINO_FRCNN_VOC07_CAD MODEL.WEIGHTS ./data/dino_RN50_pretrain_d2_format.pkl

Inference results of the model will be stored in $OUTPUT_DIR/inference. In order to produce results on the train+val dataset, please use the following command:

python tools/train_net_for_LOST_CAD.py --resume --eval-only --num-gpus 4 --config-file ./configs/LOST/RN50_DINO_FRCNN_VOC07_CAD.yaml DATALOADER.NUM_WORKERS 6 MODEL.WEIGHTS ./outputs/RN50_DINO_FRCNN_VOC07_CAD/model_final.pth OUTPUT_DIR ./outputs/RN50_DINO_FRCNN_VOC07_CAD/ DATASETS.TEST '("voc_2007_trainval_CAD_coco_style", )'
cd $LOST;
python main_corloc_evaluation.py --dataset VOC07 --set trainval --type_pred detectron --pred_file $D2/outputs/RN50_DINO_FRCNN_VOC07_CAD/inference/coco_instances_results.json

Training LOST+CAD on COCO20k dataset

Following are the command lines to train a detector in a class-agnostic fashion on the COCO20k subset of the COCO dataset.

cd $D2;

# Format pseudo-boxes data to fit detectron2
python tools/prepare_coco_LOST_CAD_pseudo_boxes_in_detectron2_format.py --pboxes $LOST/outputs/COCO20k_train/LOST-vit_small16_k/preds.pkl

# Generate COCO20k CAD gt annotations
python tools/prepare_coco_CAD_gt.py --coco_dir $LOST/datasets/COCO

# Train detector (evaluation done on COCO20k CAD training set)
python tools/train_net_for_LOST_CAD.py --num-gpus 4 --config-file ./configs/LOST/RN50_DINO_FRCNN_COCO20k_CAD.yaml DATALOADER.NUM_WORKERS 8 OUTPUT_DIR ./outputs/RN50_DINO_FRCNN_COCO20k_CAD MODEL.WEIGHTS ./data/dino_RN50_pretrain_d2_format.pkl

# Corloc evaluation
python main_corloc_evaluation.py --dataset COCO20k --type_pred detectron --pred_file $D2/outputs/RN50_DINO_FRCNN_COCO20k_CAD/inference/coco_instances_results.json

Evaluating LOST+CAD (corloc results)

We provide predictions of a class-agnostic Faster R-CNN model trained using LOST boxes as pseudo-ground-truth; they are stored in the folder data/CAD_predictions. In order to launch the corloc evaluation, please run the following scripts. Note that in this evaluation only the box with the highest confidence score is considered per image.

python main_corloc_evaluation.py --dataset VOC07 --set trainval --type_pred detectron --pred_file data/CAD_predictions/LOST_plus_CAD_VOC07.json
python main_corloc_evaluation.py --dataset VOC12 --set trainval --type_pred detectron --pred_file data/CAD_predictions/LOST_plus_CAD_VOC12.json
python main_corloc_evaluation.py --dataset COCO20k --set train --type_pred detectron --pred_file data/CAD_predictions/LOST_plus_CAD_COCO20k.json
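
For reference, picking the highest-scoring box per image from a COCO-style results file can be sketched as follows (each record in coco_instances_results.json holds an image_id, a bbox in [x, y, width, height] format and a score; the evaluation script's internals may differ):

import json

with open("data/CAD_predictions/LOST_plus_CAD_VOC07.json") as f:
    detections = json.load(f)

# Keep only the most confident detection per image.
best = {}
for det in detections:
    img = det["image_id"]
    if img not in best or det["score"] > best[img]["score"]:
        best[img] = det
print(len(best), "images with a top-1 box")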

The following table presents the obtained corloc results.

method     VOC07   VOC12   COCO20k
LOST       61.9    64.0    50.7
LOST+CAD   65.7    70.4    57.5

Training a Class-Aware Detector (OD) with LOST pseudo-annotations

Following are the different steps to train a class-aware detector using LOST pseudo-boxes on the dataset VOC07. We provide the LOST boxes corresponding to the dataset VOC07 in $LOST/data/LOST_predictions/LOST_VOC07.pkl.

cd $LOST;
# Cluster features of LOST boxes
python cluster_for_OD.py --pred_file $LOST/data/LOST_predictions/LOST_VOC07.pkl --nb_clusters 20 --dataset VOC07 --set trainval
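
The clustering step groups the features of the LOST boxes into pseudo-classes (20 clusters for the 20 VOC categories). A minimal sketch of the idea using scikit-learn's K-means (illustrative only, with a hypothetical feature dump; cluster_for_OD.py is the authoritative implementation):

import numpy as np
from sklearn.cluster import KMeans

# box_features: (num_boxes, dim) features, one row per LOST pseudo-box.
box_features = np.load("box_features.npy")  # hypothetical dump of box features

kmeans = KMeans(n_clusters=20, random_state=0, n_init=10).fit(box_features)
pseudo_labels = kmeans.labels_  # pseudo-class id assigned to each box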

cd $D2;
# Format DINO weights to fit detectron2
wget https://dl.fbaipublicfiles.com/dino/dino_resnet50_pretrain/dino_resnet50_pretrain.pth -P ./data # Download the model from DINO
python tools/convert_pretrained_to_detectron_format.py --input ./data/dino_resnet50_pretrain.pth --output ./data/dino_RN50_pretrain_d2_format.pkl

# Prepare the clustered LOST pseudo-box data for training
python tools/prepare_voc_LOST_OD_pseudo_boxes_in_detectron2_format.py --year 2007 --pboxes $LOST/data/LOST_predictions/LOST_VOC07_clustered_20clu.pkl

# Format VOC data to fit COCO style
python tools/prepare_voc_data_in_coco_style.py --voc07_dir $LOST/datasets/VOC2007 --voc12_dir $LOST/datasets/VOC2012

# Train the detector on the VOC2007 trainval set -- please be aware that no Hungarian matching is used during training, so validation results are not meaningful (they will be close to 0). Please use the command below in order to evaluate results correctly.
python tools/train_net_for_LOST_OD.py --num-gpus 8 --config-file ./configs/LOST/RN50_DINO_FRCNN_VOC07_OD.yaml DATALOADER.NUM_WORKERS 8 OUTPUT_DIR ./outputs/RN50_DINO_FRCNN_VOC07_OD MODEL.WEIGHTS ./data/dino_RN50_pretrain_d2_format.pkl

# Evaluate the detector results using Hungarian matching -- reproduces the results from the paper
cd $LOST;
python tools/evaluate_unsupervised_detection_voc.py --results ./detectron2/outputs/RN50_DINO_FRCNN_VOC07_OD/inference/coco_instances_results.json

Training details

We use the R50-C4 model of Detectron2 with a ResNet50 backbone pre-trained with the DINO self-supervision method.

Details:

  • mini-batches of size 16 across 8 GPUs using SyncBatchNorm
  • extra BatchNorm layer for the RoI head after conv5, i.e., Res5ROIHeadsExtraNorm layer in Detectron2
  • frozen first two convolutional blocks of ResNet-50, i.e., conv1 and conv2 in Detectron2
  • the learning rate is first warmed up for 100 steps to 0.02 and then reduced by a factor of 10 after 18K and 22K training steps
  • we use 24K training steps in total for all experiments, except when training class-agnostic detectors on the pseudo-boxes of the VOC07 trainval set, in which case we use 10K steps
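
For orientation, these hyper-parameters map onto detectron2's config system roughly as follows (a sketch only; the YAML files in configs/LOST are authoritative):

from detectron2.config import get_cfg

cfg = get_cfg()
cfg.MODEL.RESNETS.NORM = "SyncBN"                   # SyncBatchNorm across GPUs
cfg.MODEL.ROI_HEADS.NAME = "Res5ROIHeadsExtraNorm"  # RoI head with an extra BN layer after conv5 (registered by the LOST tools)
cfg.MODEL.BACKBONE.FREEZE_AT = 2                    # freeze conv1 and conv2 of the ResNet-50
cfg.SOLVER.IMS_PER_BATCH = 16                       # mini-batches of size 16
cfg.SOLVER.BASE_LR = 0.02                           # target learning rate after warm-up
cfg.SOLVER.WARMUP_ITERS = 100                       # 100 warm-up steps
cfg.SOLVER.STEPS = (18000, 22000)                   # divide the LR by 10 at 18K and 22K steps
cfg.SOLVER.MAX_ITER = 24000                         # 24K training steps in total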

License

LOST is released under the Apache 2.0 license.
