Hierarchical Deep Stereo Matching on High Resolution Images, CVPR 2019

[project webpage]

Qualitative results on Middlebury.

Performance on the Middlebury benchmark (y-axis: error; lower is better).

HSM can handle the large viewpoint variation of high-resolution images (used as a submodule in Open4D, CVPR 2020).

Requirements

  • Tested with Python 2.7.15 and 3.6.8
  • Tested with PyTorch 0.4.0, 0.4.1, and 1.0.0
  • A few extra packages need to be installed, for example texttable (a quick environment check is sketched below)
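
A minimal sanity check of the environment, assuming the packages above are already installed:

import sys
import torch
import texttable  # imported only to confirm one of the extra packages is installed

# Print the interpreter and PyTorch versions to compare against the tested ones.
print(sys.version)
print(torch.__version__)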

Weights

Note: the .tar checkpoint can be loaded directly in PyTorch; there is no need to uncompress it.
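
For example, the checkpoint can be loaded in a few lines. This is a minimal sketch: the 'state_dict' key is an assumption based on common PyTorch checkpoint conventions, and 'model' stands for an already-constructed HSM network.

import torch

# Load the downloaded checkpoint directly; no extraction is needed.
checkpoint = torch.load('./weights/final-768px.tar', map_location='cpu')
# Assumed layout: the weights live under a 'state_dict' key; fall back to the raw object otherwise.
state_dict = checkpoint['state_dict'] if isinstance(checkpoint, dict) and 'state_dict' in checkpoint else checkpoint
print('%d tensors loaded' % len(state_dict))
# model.load_state_dict(state_dict, strict=False)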

Inference

Test on CrusadeP and dancing stereo pairs:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./data-mbtest/   --outdir ./mboutput --loadmodel ./weights/final-768px.tar  --testres 1 --clean 1.0 --max_disp -1

Evaluate on Middlebury additional images:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./path_to_additional_images   --outdir ./output --loadmodel ./weights/final-768px.tar  --testres 0.5
python eval_mb.py --indir ./output --gtdir ./groundtruth_path

Evaluate on HRRS:

CUDA_VISIBLE_DEVICES=3 python submission.py --datapath ./data-HRRS/   --outdir ./output --loadmodel ./weights/final-768px.tar  --testres 0.5
python eval_disp.py --indir ./output --gtdir ./data-HRRS/

Then use cvkit to visualize the results in 3D.
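
For reference, disparity evaluation of this kind reduces to a pixel-wise comparison of the predicted and ground-truth maps. The helper below is an illustrative sketch of the usual metrics (average end-point error and bad-pixel rate), not a reproduction of eval_mb.py or eval_disp.py:

import numpy as np

def disparity_error(pred, gt, bad_thresh=2.0):
    # Ignore pixels without valid ground truth (non-finite or non-positive).
    valid = np.isfinite(gt) & (gt > 0)
    err = np.abs(pred[valid] - gt[valid])
    # Average end-point error and fraction of pixels off by more than bad_thresh.
    return err.mean(), (err > bad_thresh).mean()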

Example outputs

  • left image
  • 3D projection
  • disparity map
  • uncertainty map (brighter means higher uncertainty)

Parameters

  • testres: input resolution scale; 1 is full resolution, 0.5 is half resolution, and so on
  • max_disp: maximum disparity range to search over
  • clean: uncertainty threshold used to clean the output; clean=0 removes all pixels (see the sketch after this list)
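
As an illustration of the cleaning step only: the rule below, which drops pixels whose uncertainty exceeds the threshold, is an assumption consistent with "clean=0 removes all pixels" rather than the exact logic of submission.py; testres and max_disp act earlier in the pipeline, on the input scale and the disparity search range.

import numpy as np

def clean_disparity(disparity, uncertainty, clean=1.0):
    # Invalidate pixels whose uncertainty exceeds the clean threshold;
    # with clean=0 essentially every pixel is removed.
    out = disparity.astype(np.float32)
    out[uncertainty > clean] = np.inf
    return out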

Data

train/val

test

High-res-real-stereo (HR-RS): it has been taken down due to a licensing issue. Please use the Argoverse dataset instead.

Train

  1. Download and extract the training data into the folder /d/. The training data include the Middlebury training set, HR-VS, KITTI-12/15, ETH3D, and SceneFlow.
  2. Run
CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py --maxdisp 384 --batchsize 28 --database /d/ --logname log1 --savemodel /somewhere/  --epochs 10
  3. Evaluate on the Middlebury additional images and the KITTI validation set. After 40k iterations, the average error on the Middlebury additional images excluding Shopvac (perfect + imperfect, 24 stereo pairs in total) at half resolution should be around 5.7.

Citation

@InProceedings{yang2019hsm,
  author    = {Yang, Gengshan and Manela, Joshua and Happold, Michael and Ramanan, Deva},
  title     = {Hierarchical Deep Stereo Matching on High-Resolution Images},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2019}
}

Acknowledgement

Part of the code is borrowed from MiddEval-SDK, PSMNet, FlowNetPytorch, and pytorch-semseg. Thanks to SorcererX for fixing version compatibility issues.