  • Language
    Python
  • License
    MIT License

Multiview Stereo with Cascaded Epipolar RAFT (CER-MVS)

This repository contains the source code for our ECCV 2022 paper:

Multiview Stereo with Cascaded Epipolar RAFT

Zeyu Ma, Zachary Teed and Jia Deng

@inproceedings{ma2022multiview,
  title={Multiview Stereo with Cascaded Epipolar RAFT},
  author={Ma, Zeyu and Teed, Zachary and Deng, Jia},
  booktitle={Proceedings of the European conference on computer vision (ECCV)},
  year={2022}
}

Requirements

The code has been tested with PyTorch 1.7 and CUDA 11.0.

conda env create -f environment.yml
conda activate cer-mvs

# we use gcc9 to compile alt_cuda_corr
export TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.5;8.0"
cd alt_cuda_corr && python setup.py install && cd ..

Required Data

To evaluate/train CER-MVS, you will need to download the required datasets.

To download a sample set of DTU and the training set of Tanks and Temples for the demos, run

python download_demo_datasets.py

By default, the code searches for the datasets in the locations below. You can create symbolic links in the datasets folder that point to wherever the datasets were downloaded.

├── datasets
    ├── DTU
        ├── Cameras
            ├── pair.txt
            ├── *_cam.txt
        ├── Rectified
            ├── scan*
                ├── rect_*.png
        ├── Depths
            ├── scan*
                ├── depth_map_*.pfm
    ├── BlendedMVS
        ├── dataset_full_res_0-29
            ├── 5bfe5ae0fe0ea555e6a969ca/5bfe5ae0fe0ea555e6a969ca/5bfe5ae0fe0ea555e6a969ca (an example)
                ├── blended_images
                    ├── *.jpg
                ├── cams
                    ├── *_cam.txt
                    ├── pair.txt
                ├── rendered_depth_maps
                    ├── *.pfm
        ├── dataset_full_res_30-59
        ├── dataset_full_res_60-89
        ├── dataset_full_res_90-112
    ├── TanksAndTemples
        ├── tankandtemples
            ├── intermediate
                ├── Family (an example)
                    ├── cams
                        ├── *_cam.txt
                    ├── Family.log
                    ├── images
                        ├── *.jpg
                    ├── pair.txt
            ├── advanced
        ├── training_input
            ├── Ignatius (an example)
                ├── cams
                    ├── *_cam.txt
                ├── images
                    ├── *.jpg
                ├── pair.txt
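
Before running anything, it can help to confirm that the folders in the tree above are in place. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_dirs and the choice of which sub-folders to check are assumptions derived from the layout above.

```python
from pathlib import Path

# Sub-folders taken from the layout above (an assumed, non-exhaustive selection).
EXPECTED_DIRS = [
    "DTU/Cameras",
    "DTU/Rectified",
    "DTU/Depths",
    "BlendedMVS/dataset_full_res_0-29",
    "TanksAndTemples/tankandtemples/intermediate",
    "TanksAndTemples/training_input",
]


def missing_dataset_dirs(root="datasets", expected=EXPECTED_DIRS):
    """Return the expected sub-folders that do not exist under `root`.

    Path.is_dir() follows symbolic links, so linked datasets count as present.
    """
    root = Path(root)
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:")
        for rel in missing:
            print("  datasets/" + rel)
    else:
        print("All expected dataset folders were found.")
```

The demo download covers only a sample of DTU and the Tanks and Temples training set, so the other entries may be reported as missing until you fetch the full datasets.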

Demos

One GPU with at least 24 GB of GPU memory is needed (e.g., a 3090).

Pretrained models can be downloaded from Google Drive. Put them in a pretrained folder:

├── pretrained
    ├── train_DTU.pth
    ├── train_BlendedMVS.pth

You can demo our trained models on scan3 of DTU and on Ignatius and Meetingroom of Tanks and Temples by running:

python demo.py

This writes point clouds (*.ply) and visualized depth maps (*.png) to the default results folder. To use a different output folder, modify configs/demo.gin.

├── results
    ├── scan3
        ├── depths
            ├── *.png
    ├── Ignatius
        ├── depths
            ├── *.png
    ├── Meetingroom
        ├── depths
            ├── *.png
    ├── scan3.ply
    ├── Ignatius.ply
    ├── Meetingroom.ply
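
To check quickly that the demo produced everything in the tree above, a small script can walk the results folder. This is a sketch under the assumption that the output follows exactly that layout; the function name summarize_results is hypothetical and not part of the repository.

```python
from pathlib import Path


def summarize_results(results_dir="results"):
    """Map each scan name to (point cloud exists, number of depth PNGs)."""
    results_dir = Path(results_dir)
    summary = {}
    # Every sub-folder of the results folder is treated as one scan.
    for scan_dir in sorted(p for p in results_dir.iterdir() if p.is_dir()):
        depth_dir = scan_dir / "depths"
        n_depths = len(list(depth_dir.glob("*.png"))) if depth_dir.is_dir() else 0
        has_ply = (results_dir / (scan_dir.name + ".ply")).is_file()
        summary[scan_dir.name] = (has_ply, n_depths)
    return summary


if __name__ == "__main__":
    if Path("results").is_dir():
        for scan, (has_ply, n_depths) in summarize_results().items():
            print(f"{scan}: point cloud={'yes' if has_ply else 'no'}, depth maps={n_depths}")
```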

Training

Train on DTU (we trained on two 3090 GPUs, with 24 GB of GPU memory each, for 6 days):

python train.py -g train_DTU -p 'train.name = "YOUR_MODEL_NAME"'

Train on BlendedMVS (we trained on two A6000 GPUs, with 48 GB of GPU memory each, for 4 days):

python train.py -g train_BlendedMVS -p 'train.name = "YOUR_MODEL_NAME"'

Model checkpoints are saved in the checkpoints folder, and TensorBoard logs are written to runs/YOUR_MODEL_NAME.

Test

One GPU with at least 24 GB of GPU memory is needed (e.g., a 3090).

Depth Map Inference

DTU Val/Test Set:

# low res pass
python inference.py -g inference_DTU -p 'inference.scan = "YOUR_SCAN, e.g., scan3"' \
    'inference.num_frame = 10' \
    'inference.rescale = 1'
# high res pass
python inference.py -g inference_DTU -p 'inference.scan = "YOUR_SCAN, e.g., scan3"' \
    'inference.num_frame = 10' \
    'inference.rescale = 2'

Tanks and Temples:

# low res pass
python inference.py -g inference_TNT -p 'inference.scan = "YOUR_SCAN, e.g., Ignatius"' \
    'inference.num_frame = 15' \
    'inference.rescale = 1'
# high res pass
python inference.py -g inference_TNT -p 'inference.scan = "YOUR_SCAN, e.g., Ignatius"' \
    'inference.num_frame = 15' \
    'inference.rescale = 2'

Modify the config files or the gin parameters to change the output location and the loaded weights.

To submit parallel GPU jobs, use the script scripts/submit_depthmap.py. Modify submitter.gin and the datasets and splits in the script as needed, then run python scripts/submit_depthmap.py.

Multi Resolution Fusion

DTU Val/Test Set:

python multires.py -g inference_DTU -p 'multires.scan = "YOUR_SCAN, e.g., scan3"'

Tanks and Temples:

python multires.py -g inference_TNT -p 'multires.scan = "YOUR_SCAN, e.g., Ignatius"'

Point Cloud Fusion

DTU Val/Test Set:

python fusion.py -g inference_DTU -p 'fusion.scan = "YOUR_SCAN, e.g., scan3"'

Tanks and Temples:

python fusion.py -g inference_TNT -p 'fusion.scan = "YOUR_SCAN, e.g., Ignatius"'

Similarly, the script scripts/submit_fusion.py submits the two fusion steps.
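
For a single scan, the test steps above run in a fixed order: the low-resolution and high-resolution depth map passes, multi-resolution fusion, then point cloud fusion. The sketch below only assembles the command lines already shown in this section and runs them one after another. It is not the repository's submit script, and the helper names pipeline_commands and run_pipeline are hypothetical.

```python
import subprocess
import sys


def pipeline_commands(scan, config="inference_DTU", num_frame=10):
    """Return the per-scan commands from this README, in execution order."""
    commands = []
    # Depth map inference: low-res pass (rescale = 1), then high-res pass (rescale = 2).
    for rescale in (1, 2):
        commands.append([
            sys.executable, "inference.py", "-g", config, "-p",
            f'inference.scan = "{scan}"',
            f"inference.num_frame = {num_frame}",
            f"inference.rescale = {rescale}",
        ])
    # Multi-resolution fusion, then point cloud fusion.
    commands.append([sys.executable, "multires.py", "-g", config, "-p",
                     f'multires.scan = "{scan}"'])
    commands.append([sys.executable, "fusion.py", "-g", config, "-p",
                     f'fusion.scan = "{scan}"'])
    return commands


def run_pipeline(scan, config="inference_DTU", num_frame=10):
    """Run the commands sequentially, stopping at the first failure."""
    for cmd in pipeline_commands(scan, config, num_frame):
        subprocess.run(cmd, check=True)
```

Called from the repository root, run_pipeline("scan3") would cover a DTU scan, and run_pipeline("Ignatius", config="inference_TNT", num_frame=15) a Tanks and Temples scan, matching the parameters used in the commands above.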

Evaluation

Results on DTU test set

Acc.    Comp.   Overall
0.359   0.305   0.332
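
The Overall value is consistent with the arithmetic mean of accuracy and completeness, which is the usual convention for DTU results. A quick check with the numbers above, assuming that convention applies here:

```python
acc, comp = 0.359, 0.305  # values from the table above

# Overall is taken here as the mean of accuracy and completeness (assumed convention).
overall = (acc + comp) / 2
print(f"{overall:.3f}")  # prints 0.332
```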

Download the Points data from the official DTU website and follow the instructions for the Matlab code in the SampleSet data. Note the following changes in BaseEvalMain_web.m:

  • Create a Results folder; otherwise the Matlab code will raise an error.
  • Change light_string='l7' to light_string='l3', which means all lights on.
  • Change method_string='Tola' to method_string='cer-mvs', then put the resulting .ply files in the corresponding cer-mvs folder.
  • If you are evaluating individual scans, change UsedSets=GetUsedSets to the scans you are evaluating, e.g. UsedSets=[3].
  • Some scans, such as scan3, have no ObsMask/Plane file, so in PointCompareMain.m, change load([dataPath '\ObsMask\Plane' num2str(cSet)],'P') to:
if ~exist([dataPath '/ObsMask/Plane' num2str(cSet) '.mat'],'file')
    P = [0; 0; 0; 1]
else
    load([dataPath '/ObsMask/Plane' num2str(cSet)],'P')
end

Run

matlab -nodisplay -nosplash -nodesktop -r "run('BaseEvalMain_web.m');exit;"

After you have results for all scans, compute the summary (also change UsedSets and method_string in ComputeStat_web.m):

matlab -nodisplay -nosplash -nodesktop -r "run('ComputeStat_web.m'); exit;"

Results on Tanks and Temples

Mean   Family  Francis  Horse  Lighthouse  M60    Panther  Playground  Train
64.82  81.16   64.21    50.43  70.73       63.85  63.99    65.90       58.25

Mean   Auditorium  Ballroom  Courtroom  Museum  Palace  Temple
40.19  25.95       45.75     39.65      51.75   35.08   42.97
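
As a sanity check on the two rows above, the Mean column can be recomputed from the per-scene scores. This sketch only re-averages the numbers printed here; the variable names are illustrative.

```python
from statistics import mean

# Per-scene scores copied from the two rows above.
first_row = {"Family": 81.16, "Francis": 64.21, "Horse": 50.43, "Lighthouse": 70.73,
             "M60": 63.85, "Panther": 63.99, "Playground": 65.90, "Train": 58.25}
second_row = {"Auditorium": 25.95, "Ballroom": 45.75, "Courtroom": 39.65,
              "Museum": 51.75, "Palace": 35.08, "Temple": 42.97}

first_mean = mean(first_row.values())
second_mean = mean(second_row.values())
print(first_mean, second_mean)
```

Both averages agree with the reported 64.82 and 40.19 to within rounding.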

Download the official trainingdata, clone the GitHub repository, and convert the camera poses to .log files. (For the intermediate and advanced sets, the .log files are already included in the preprocessed dataset; for the training set, you can convert them yourself or download them from here.)

Run

python REPOSITORY_LOCATION/python_toolbox/evaluation/run.py --dataset-dir LOCATION_OF_trainingdata/SCAN(e.g. Ignatius) --traj-path LOCATION_OF_LOG_FILE --ply-path YOUR_POINT_CLOUD.ply --out-dir LOCATION_TO_SAVE_RESULTS
