Deep Two-View Structure-from-Motion Revisited
This repository provides the code for our CVPR 2021 paper Deep Two-View Structure-from-Motion Revisited.
We plan to reorganize the code around May 2022. If any part is unclear, please feel free to open an issue.
We provide code for training, validation, and visualization.
Requirements
Python = 3.6.x
PyTorch >= 1.6.0
CUDA >= 10.1
The remaining dependencies can be installed with
pip install -r requirements.txt
Earlier PyTorch versions (from 1.1.0 up to 1.6.0) should also work, but mixed precision training will be disabled, and we have not tested them.
To use the RANSAC five-point algorithm, you also need to build and install its extension:
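As a quick sanity check before training, the sketch below tests whether an installed PyTorch version is new enough for native mixed precision. It assumes the relevant feature is `torch.cuda.amp`, which ships with PyTorch 1.6.0 and later; the helper name `supports_amp` is hypothetical and not part of this repository.

```python
import re

# Assumption: native mixed precision (torch.cuda.amp) is available from PyTorch 1.6.
AMP_MIN_VERSION = (1, 6)


def supports_amp(version: str) -> bool:
    """Return True if a PyTorch version string is new enough for torch.cuda.amp."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) >= AMP_MIN_VERSION


if __name__ == "__main__":
    try:
        import torch
        print(torch.__version__, "-> mixed precision available:",
              supports_amp(torch.__version__))
    except ImportError:
        print("PyTorch is not installed")
```

Comparing parsed integer tuples avoids the string-comparison pitfall where "1.10" would sort before "1.6".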
cd RANSAC_FiveP
python setup.py install --user
The CUDA extension will be installed as 'essential_matrix'. It has been tested under Ubuntu with CUDA 10.1.
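To confirm that the build succeeded, one option is to check whether the module can be found by the current Python interpreter. This is a minimal sketch that only assumes the module name 'essential_matrix' stated above; the helper `extension_available` is hypothetical.

```python
import importlib.util


def extension_available(module_name: str = "essential_matrix") -> bool:
    """Return True if a top-level module with this name can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if extension_available():
        print("essential_matrix found")
    else:
        print("essential_matrix not found; rerun setup.py in RANSAC_FiveP")
```

The check locates the module without importing it, so it does not verify that the compiled extension matches the local CUDA installation.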
Models
Pretrained models are provided here.
KITTI Depth
To reproduce our results, first download the KITTI RAW data and the official depth maps (14GB). Unzip the official depth maps (train and val) into a folder, and set the flag cfg.GT_DEPTH_DIR in kitti.yml to that folder. Also download the split files we provide, and unzip them into the root of the KITTI raw data.
For training,
python main.py -b 32 --lr 0.0005 --nlabel 128 --fix_flownet \
--data PATH/TO/YOUR/KITTI/DATASET --cfg cfgs/kitti.yml \
--pretrained-depth depth_init.pth.tar --pretrained-flow flow_init.pth.tar
For evaluation,
python main.py -v -b 1 -p 1 --nlabel 128 \
--data PATH/TO/YOUR/KITTI/DATASET --cfg cfgs/kitti.yml \
--pretrained kitti.pth.tar
The default evaluation split is Eigen, where the metric abs_rel should be around 0.053 and rmse should be close to 2.22 (if 'loading official ground truth depth').
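For readers unfamiliar with the two metrics, the sketch below computes them using the definitions common in the depth estimation literature: abs_rel is the mean of |pred - gt| / gt, and rmse is the square root of the mean squared error. It assumes that pixels with gt <= 0 have no ground truth and are skipped; the evaluation code in this repository may apply additional depth caps or crops, so this is an illustration and not the exact procedure.

```python
import math
from typing import Sequence, Tuple


def depth_metrics(pred: Sequence[float], gt: Sequence[float]) -> Tuple[float, float]:
    """Return (abs_rel, rmse) over pixels with valid ground truth (gt > 0)."""
    # Assumption: a ground-truth depth of 0 (or less) marks a missing measurement.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / len(pairs)
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))
    return abs_rel, rmse


if __name__ == "__main__":
    # Flattened toy depth maps in metres; the last pixel has no ground truth.
    abs_rel, rmse = depth_metrics([2.0, 4.0, 9.0], [1.0, 4.0, 0.0])
    print(f"abs_rel={abs_rel:.3f} rmse={rmse:.3f}")
```

In practice the same formulas would be applied to masked arrays or tensors; plain sequences are used here only to keep the example dependency-free.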
If you would like to use the Eigen SfM split, please set cfg.EIGEN_SFM = True and cfg.KITTI_697 = False.
KITTI Pose
For a fair comparison, we use the KITTI odometry evaluation toolbox provided here. Please generate poses for each sequence, and evaluate the results accordingly.
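As an illustration of per-sequence pose output, the sketch below serializes poses in the standard KITTI odometry text format: one line per frame holding the 12 values of the 3x4 matrix [R|t] in row-major order. It assumes the evaluation toolbox reads this format; the function names are hypothetical and not part of this repository.

```python
from typing import Sequence

Matrix = Sequence[Sequence[float]]


def pose_to_kitti_line(pose: Matrix) -> str:
    """Flatten the top 3x4 block [R|t] of a pose matrix, row by row, into one line."""
    if len(pose) < 3 or any(len(row) != 4 for row in pose[:3]):
        raise ValueError("expected a 3x4 or 4x4 pose matrix")
    return " ".join(f"{value:.6e}" for row in pose[:3] for value in row)


def poses_to_kitti_text(poses: Sequence[Matrix]) -> str:
    """Join the poses of one sequence into text, one frame per line."""
    return "\n".join(pose_to_kitti_line(p) for p in poses) + "\n"


if __name__ == "__main__":
    identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    print(poses_to_kitti_text([identity, identity]), end="")
```

Both 3x4 and homogeneous 4x4 inputs are accepted; the last row of a 4x4 matrix is dropped because the format stores only [R|t].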
Acknowledgment:
We thank Shihao Jiang and Dylan Campbell for sharing their implementation of the GPU-accelerated RANSAC five-point algorithm. We really appreciate the valuable feedback from our area chairs and reviewers. We would also like to thank Charles Loop for helpful discussions and Ke Chen for providing field test images from NVIDIA AV cars.
BibTeX:
@article{wang2021deep,
title={Deep Two-View Structure-from-Motion Revisited},
author={Wang, Jianyuan and Zhong, Yiran and Dai, Yuchao and Birchfield, Stan and Zhang, Kaihao and Smolyanskiy, Nikolai and Li, Hongdong},
journal={CVPR},
year={2021}
}