IGEV-Stereo & IGEV-MVS (CVPR 2023)
This repository contains the source code for our paper:
Iterative Geometry Encoding Volume for Stereo Matching
Gangwei Xu, Xianqi Wang, Xiaohuan Ding, Xin Yang
Demos
Pretrained models can be downloaded from Google Drive.
We assume the downloaded pretrained weights are located under the pretrained_models directory.
You can demo a trained model on pairs of images. To predict disparity for Middlebury image pairs, run
python demo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth
Comparison with RAFT-Stereo
Method | KITTI 2012 (3-noc) | KITTI 2015 (D1-all) | Memory (G) | Runtime (s) |
---|---|---|---|---|
RAFT-Stereo | 1.30 % | 1.82 % | 1.02 | 0.38 |
IGEV-Stereo | 1.12 % | 1.59 % | 0.66 | 0.18 |
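To read the table as relative gains, the short sketch below computes how much lower each IGEV-Stereo figure is than the RAFT-Stereo one. The numbers are copied from the table; the helper name `relative_reduction` is only illustrative and is not part of this repository.

```python
# Values copied from the comparison table: (RAFT-Stereo, IGEV-Stereo).
results = {
    "KITTI 2012 3-noc (%)": (1.30, 1.12),
    "KITTI 2015 D1-all (%)": (1.82, 1.59),
    "Memory (G)": (1.02, 0.66),
    "Runtime (s)": (0.38, 0.18),
}


def relative_reduction(baseline, ours):
    """Return how much lower `ours` is than `baseline`, in percent."""
    return (baseline - ours) / baseline * 100.0


for name, (raft, igev) in results.items():
    print(f"{name}: {relative_reduction(raft, igev):.1f}% lower than RAFT-Stereo")
# KITTI 2012 3-noc (%): 13.8% lower than RAFT-Stereo
# KITTI 2015 D1-all (%): 12.6% lower than RAFT-Stereo
# Memory (G): 35.3% lower than RAFT-Stereo
# Runtime (s): 52.6% lower than RAFT-Stereo
```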
Environment
- NVIDIA RTX 3090
- Python 3.8
- PyTorch 1.12
Create a virtual environment and activate it.
conda create -n IGEV_Stereo python=3.8
conda activate IGEV_Stereo
Dependencies
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch -c nvidia
pip install opencv-python
pip install scikit-image
pip install tensorboard
pip install matplotlib
pip install tqdm
pip install timm==0.5.4
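After installing, a quick sanity check can confirm that the packages are importable in the active environment. The following is a minimal sketch, not part of this repository: the helper `missing_modules` is hypothetical, and the list uses import names, which differ from the install names for some packages (`opencv-python` is imported as `cv2`, `scikit-image` as `skimage`).

```python
import importlib.util

# Import names for the packages installed above (assumed mapping from the
# install commands; torchaudio is omitted since only the install line mentions it).
REQUIRED_MODULES = [
    "torch",
    "torchvision",
    "cv2",          # opencv-python
    "skimage",      # scikit-image
    "tensorboard",
    "matplotlib",
    "tqdm",
    "timm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages (import names):", ", ".join(missing))
else:
    print("All required packages are importable.")
```

This only checks that the modules can be located; it does not verify versions such as `timm==0.5.4`.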
Required Data
To evaluate/train IGEV-Stereo, you will need to download the required datasets.
By default, stereo_datasets.py will search for the datasets in these locations.
├── /data
    ├── sceneflow
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── KITTI_2012
            ├── training
            ├── testing
            ├── vkitti
        ├── KITTI_2015
            ├── training
            ├── testing
            ├── vkitti
    ├── Middlebury
        ├── trainingH
        ├── trainingH_GT
    ├── ETH3D
        ├── two_view_training
        ├── two_view_training_gt
    ├── DTU_data
        ├── dtu_train
        ├── dtu_test
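Before launching evaluation or training, it can help to confirm that the dataset folders are in place. The sketch below assumes the listing above is nested as it reads (for example, `training` under `KITTI/KITTI_2012`) and that the root is `/data`. The helper `missing_dataset_dirs` is hypothetical and does not reflect how stereo_datasets.py itself resolves paths.

```python
from pathlib import Path

# Relative paths taken from the directory listing above (nesting is assumed).
EXPECTED_DIRS = [
    "sceneflow/frames_finalpass",
    "sceneflow/disparity",
    "KITTI/KITTI_2012/training",
    "KITTI/KITTI_2012/testing",
    "KITTI/KITTI_2012/vkitti",
    "KITTI/KITTI_2015/training",
    "KITTI/KITTI_2015/testing",
    "KITTI/KITTI_2015/vkitti",
    "Middlebury/trainingH",
    "Middlebury/trainingH_GT",
    "ETH3D/two_view_training",
    "ETH3D/two_view_training_gt",
    "DTU_data/dtu_train",
    "DTU_data/dtu_test",
]


def missing_dataset_dirs(root="/data"):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]


for rel in missing_dataset_dirs("/data"):
    print(f"missing: /data/{rel}")
```

Only the datasets you actually plan to use need to be present; missing entries for the others can be ignored.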
Evaluation
To evaluate on Scene Flow, Middlebury, or ETH3D, run
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset sceneflow
or
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset middlebury_H
or
python evaluate_stereo.py --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --dataset eth3d
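The three commands above differ only in the `--dataset` value, so they can be run in sequence from a small wrapper. This is a sketch under the assumption that it is run from the repository root; `build_eval_command` and `run_all` are hypothetical helpers, while the script name, flags, checkpoint path, and dataset names are those shown above.

```python
import subprocess
import sys

CKPT = "./pretrained_models/sceneflow/sceneflow.pth"
DATASETS = ["sceneflow", "middlebury_H", "eth3d"]


def build_eval_command(dataset, ckpt=CKPT, python=sys.executable):
    """Build the argument list for one evaluate_stereo.py invocation."""
    return [python, "evaluate_stereo.py", "--restore_ckpt", ckpt, "--dataset", dataset]


def run_all(datasets=DATASETS):
    """Run the evaluations one after another, stopping at the first failure."""
    for dataset in datasets:
        subprocess.run(build_eval_command(dataset), check=True)
```

Calling `run_all()` runs the three evaluations in the order listed; pass a shorter list to evaluate only some of them.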
Training
To train on Scene Flow, run
python train_stereo.py --logdir ./checkpoints/sceneflow
To train on KITTI, run
python train_stereo.py --logdir ./checkpoints/kitti --restore_ckpt ./pretrained_models/sceneflow/sceneflow.pth --train_datasets kitti
Submission
For submission to the KITTI benchmark, run
python save_disp.py
MVS training and evaluation
To train on DTU, run
python train_mvs.py
To evaluate on DTU, run
python evaluate_mvs.py
Citation
If you find our work useful in your research, please consider citing our paper:
@inproceedings{xu2023iterative,
title={Iterative Geometry Encoding Volume for Stereo Matching},
author={Xu, Gangwei and Wang, Xianqi and Ding, Xiaohuan and Yang, Xin},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={21919--21928},
year={2023}
}
Acknowledgements
This project is heavily based on RAFT-Stereo. We thank the original authors for their excellent work.