Self-Supervised Monocular Scene Flow Estimation
3D visualization of estimated depth and scene flow from two temporally consecutive images.
Intermediate frames are interpolated using the estimated scene flow. (fine-tuned model, tested on KITTI Benchmark)
This repository is the official PyTorch implementation of the paper:
   Self-Supervised Monocular Scene Flow Estimation
   Junhwa Hur and Stefan Roth
   CVPR, 2020 (Oral Presentation)
   Paper / Supplemental / Arxiv
- Contact: junhwa.hur[at]gmail.com
Getting started
This code has been developed with Anaconda (Python 3.7), PyTorch 1.2.0 and CUDA 10.0 on Ubuntu 16.04.
Based on a fresh Anaconda distribution and PyTorch installation, following packages need to be installed:
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
pip install tensorboard
pip install pypng==0.0.18
pip install colorama
pip install scikit-image
pip install pytz
pip install tqdm==4.30.0
pip install future
Then, please excute the following to install the Correlation and Forward Warping layer:
./install_modules.sh
For PyTorch version > 1.3
Please put the align_corners=True
flag in the grid_sample
function in the following files:
augmentations.py
losses.py
models/modules_sceneflow.py
utils/sceneflow_util.py
Dataset
Please download the following to datasets for the experiment:
- KITTI Raw Data (synced+rectified data, please refer MonoDepth2 for downloading all data more easily)
- KITTI Scene Flow 2015
To save space, we also convert the KITTI Raw png images to jpeg, following the convention from MonoDepth:
find (data_folder)/ -name '*.png' | parallel 'convert {.}.png {.}.jpg && rm {}'
We also converted images in KITTI Scene Flow 2015 as well. Please convert the png images in image_2
and image_3
into jpg and save them into the seperate folder image_2_jpg
and image_3_jpg
.
To save space further, you can delete the velodyne point data in KITTI raw data and optionally download the Eigen Split Projected Depth for the monocular depth evaluation on the Eigen Split. We converted the velodyne point data of the Eigen Test images in the numpy array format using code from MonoDepth. After downloading and unzipping it, you can merge with the KITTI raw data folder.
Training and Inference
The scripts folder contains training/inference scripts of all experiments demonstrated in the paper (including ablation study).
For training, you can simply run the following script files:
Script | Training | Dataset |
---|---|---|
./train_monosf_selfsup_kitti_raw.sh |
Self-supervised | KITTI Split |
./train_monosf_selfsup_eigen_train.sh |
Self-supervised | Eigen Split |
Fine-tuning is done in two stages: (i) first finding the stopping point using train/valid split, and then (ii) fune-tuning using all data with the found iteration steps.
Script | Training | Dataset |
---|---|---|
./train_monosf_kitti_finetune_1st_stage.sh |
Semi-supervised finetuning | KITTI raw + KITTI 2015 |
./train_monosf_kitti_finetune_2st_stage.sh |
Semi-supervised finetuning | KITTI raw + KITTI 2015 |
In the script files, please configure these following PATHs for experiments:
EXPERIMENTS_HOME
: your own experiment directory where checkpoints and log files will be saved.KITTI_RAW_HOME
: the directory where KITTI raw data is located in your local system.KITTI_HOME
: the directory where KITTI Scene Flow 2015 is located in your local system.KITTI_COMB_HOME
: the directory where both KITTI Scene Flow 2015 and KITTI raw data are located.
For testing the pretrained models, you can simply run the following script files:
Script | Task | Training | Dataset |
---|---|---|---|
./eval_monosf_selfsup_kitti_train.sh |
MonoSceneFlow | Self-supervised | KITTI 2015 Train |
./eval_monosf_selfsup_kitti_test.sh |
MonoSceneFlow | Self-supervised | KITTI 2015 Test |
./eval_monosf_finetune_kitti_test.sh |
MonoSceneFlow | fine-tuned | KITTI 2015 Test |
./eval_monodepth_selfsup_kitti_train.sh |
MonoDepth | Self-supervised | KITTI test split |
./eval_monodepth_selfsup_eigen_test.sh |
MonoDepth | Self-supervised | Eigen test split |
- Testing on KITTI 2015 Test gives output images for uploading on the KITTI Scene Flow 2015 Benchmark.
- To save output image, please turn on
--save_disp=True
,--save_disp2=True
, and--save_flow=True
in the script.
Pretrained Models
The checkpoints folder contains the checkpoints of the pretrained models.
Pretrained models from the ablation study can be downloaded here: download link
Outputs and Visualization
Ouput images and visualization of the main experiments can be downloaded here:
- Self-supervised, tested on KITTI 2015 Train
- Self-supervised, tested on Eigen Test
- Fined-tuned, tested on KITTI 2015 Train
Acknowledgement
Please cite our paper if you use our source code.
@inproceedings{Hur:2020:SSM,
Author = {Junhwa Hur and Stefan Roth},
Booktitle = {CVPR},
Title = {Self-Supervised Monocular Scene Flow Estimation},
Year = {2020}
}
- Portions of the source code (e.g., training pipeline, runtime, argument parser, and logger) are from Jochen Gast
- MonoDepth evaluation utils from MonoDepth
- MonoDepth PyTorch Implementation from OniroAI / MonoDepth-PyTorch