[CVPR 23'] GeoMVSNet: Learning Multi-View Stereo with Geometry Perception

GeoMVSNet: Learning Multi-View Stereo With Geometry Perception (CVPR 2023)


     

🔨 Setup

1.1 Requirements

Use the following commands to build the conda environment.

conda create -n geomvsnet python=3.8
conda activate geomvsnet
pip install -r requirements.txt

1.2 Datasets

Download the following datasets and modify the corresponding local path in scripts/data_path.sh.

DTU Dataset

Training data. We use the same DTU training data as MVSNet and CasMVSNet; please refer to DTU training data and Depth raw for the downloads. Optionally, download Rectified raw if you want to train the model at raw image resolution. Unzip and organize them as:

dtu/
├── Cameras
├── Depths
├── Depths_raw
├── Rectified
└── Rectified_raw (optional)

Testing data. For convenience, we use the DTU testing data processed by CVP-MVSNet. Also unzip and organize it as:

dtu-test/
├── Cameras
├── Depths
└── Rectified

Please note that the images and lighting here are consistent with the original dataset.

BlendedMVS Dataset

Download the low-resolution version of the BlendedMVS dataset and unzip it as:

blendedmvs/
└── dataset_low_res
    ├── ...
    └── 5c34529873a8df509ae57b58

Tanks and Temples Dataset

Download the intermediate and advanced subsets of the Tanks and Temples dataset and unzip them. If you want to use the short-range version of the camera parameters for the Intermediate subset, unzip short_range_caemeras_for_mvsnet.zip and move cam_[] to the corresponding scenes.

tnt/
├── advanced
│   ├── ...
│   └── Temple
│       ├── cams
│       ├── images
│       ├── pair.txt
│       └── Temple.log
└── intermediate
    ├── ...
    └── Train
        ├── cams
        ├── cams_train
        ├── images
        ├── pair.txt
        └── Train.log
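
Before training or testing, it can help to confirm that the unzipped folders match the trees above. The following is a minimal sketch, not part of the repository: the helper name missing_subdirs and the data root /path/to/datasets are hypothetical, and the expected folder names are copied from the layouts in this section.

```python
from pathlib import Path

# Required sub-folders per dataset root, copied from the directory trees above.
# "Rectified_raw" is optional for DTU, so it is not listed.
EXPECTED_LAYOUT = {
    "dtu": ["Cameras", "Depths", "Depths_raw", "Rectified"],
    "dtu-test": ["Cameras", "Depths", "Rectified"],
    "blendedmvs": ["dataset_low_res"],
    "tnt": ["advanced", "intermediate"],
}


def missing_subdirs(root, expected):
    """Return the expected sub-folder names that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # Hypothetical local root; replace with the paths set in scripts/data_path.sh.
    data_root = Path("/path/to/datasets")
    for dataset, expected in EXPECTED_LAYOUT.items():
        missing = missing_subdirs(data_root / dataset, expected)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{dataset}: {status}")
```

The check only looks at the first level of each tree; per-scene folders such as cams and images are not verified.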

🚂 Training

You can train GeoMVSNet from scratch on the DTU and BlendedMVS datasets. After configuring and running training, the checkpoints are saved in checkpoints/[Dataset]/[THISNAME], and the folder contains the following outputs:

  • events.out.tfevents*: use TensorBoard to monitor the training process.
  • model_[epoch].ckpt: a checkpoint is saved at the interval set by --save_freq.
  • train-[TIME].log: the detailed training log; refer to the appropriate indicators to judge the quality of training.
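
To pick up the most recent checkpoint from that folder, the epoch number has to be compared numerically rather than as text. A minimal sketch, assuming the files follow the model_[epoch].ckpt pattern listed above with an integer epoch (any zero padding is an assumption); list_checkpoints and latest_checkpoint are hypothetical helper names, not functions from the repository.

```python
import re
from pathlib import Path

# Assumes checkpoints are named model_<epoch>.ckpt with an integer epoch,
# e.g. model_15.ckpt or model_000015.ckpt (zero padding is an assumption).
CKPT_PATTERN = re.compile(r"model_(\d+)\.ckpt$")


def list_checkpoints(ckpt_dir):
    """Return (epoch, path) pairs found in ckpt_dir, sorted by epoch."""
    found = []
    for path in Path(ckpt_dir).iterdir():
        match = CKPT_PATTERN.match(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return sorted(found, key=lambda pair: pair[0])


def latest_checkpoint(ckpt_dir):
    """Return the path of the highest-epoch checkpoint, or None if there is none."""
    checkpoints = list_checkpoints(ckpt_dir)
    return checkpoints[-1][1] if checkpoints else None
```

Sorting on the parsed integer avoids the lexical trap where model_9.ckpt would sort after model_10.ckpt.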

2.1 DTU

To train GeoMVSNet on the DTU dataset, refer to scripts/dtu/train_dtu.sh and set THISNAME, CUDA_VISIBLE_DEVICES, batch_size, etc. to suit your setup. Then run:

bash scripts/dtu/train_dtu.sh

The default training strategy we provide is the distributed training mode. If you want to use the general training mode, you can refer to the following code.

general training script
CUDA_VISIBLE_DEVICES=0,1,2,3 python3 train.py ${@} \
    --which_dataset="dtu" --epochs=16 --logdir=$LOG_DIR \
    --trainpath=$DTU_TRAIN_ROOT --testpath=$DTU_TRAIN_ROOT \
    --trainlist="datasets/lists/dtu/train.txt" --testlist="datasets/lists/dtu/test.txt" \
    \
    --data_scale="mid" --n_views="5" --batch_size=16 --lr=0.025 --robust_train \
    --lrepochs="1,3,5,7,9,11,13,15:1.5"

Note that batch_size and lr need to be adjusted for each of the two training strategies to achieve the best training results.
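
The --lrepochs value in the general training script packs a list of epochs and a factor into a single string. The sketch below shows how such a string is commonly interpreted, assuming the MVSNet-pytorch convention of dividing the learning rate by the factor at each listed epoch. How train.py actually applies it, including any warm-up, is not covered here, and parse_lrepochs and lr_at_epoch are hypothetical helper names.

```python
def parse_lrepochs(spec):
    """Split 'e1,e2,...:factor' into (milestone epochs, division factor)."""
    epochs_part, factor_part = spec.split(":")
    milestones = [int(e) for e in epochs_part.split(",")]
    return milestones, float(factor_part)


def lr_at_epoch(base_lr, spec, epoch):
    """Learning rate at `epoch`, divided by the factor once per milestone reached.

    Assumption: a milestone takes effect from that epoch onward.
    """
    milestones, factor = parse_lrepochs(spec)
    passed = sum(1 for m in milestones if epoch >= m)
    return base_lr / (factor ** passed)


if __name__ == "__main__":
    # Values taken from the general training script above.
    spec = "1,3,5,7,9,11,13,15:1.5"
    for epoch in range(16):
        print(epoch, lr_at_epoch(0.025, spec, epoch))
```

Under this reading, the learning rate has been divided by 1.5 eight times by the last epoch, so changing lr without revisiting --lrepochs changes the whole schedule.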

2.2 BlendedMVS

To train GeoMVSNet on the BlendedMVS dataset, refer to scripts/blend/train_blend.sh and likewise set THISNAME, CUDA_VISIBLE_DEVICES, batch_size, etc. to suit your setup. Then run:

bash scripts/blend/train_blend.sh

By default, we use 7 viewpoints as input for the BlendedMVS training. Similarly, you can choose to use the distributed training mode or the general one as mentioned in 2.1.

⚗️ Testing

3.1 DTU

For DTU testing, we use the model trained on the DTU training dataset. You can simply download our DTU pretrained model and put it into checkpoints/dtu/geomvsnet/. Then perform depth map estimation, point cloud fusion, and result evaluation with the following steps.

  1. Run bash scripts/dtu/test_dtu.sh for depth map estimation. The results will be stored in outputs/dtu/[THISNAME]/, each scan folder holding depth_est and confidence, etc.

    • Use outputs/visual.ipynb for depth map visualization.
  2. Run bash scripts/dtu/fusion_dtu.sh for point cloud fusion. We provide 3 different fusion methods, and we recommend the open3d option by default. After fusion, [FUSION_METHOD]_fusion_plys appears under the experiment output folder and holds the point cloud of each testing scan.

    (Optional) To use the "Gipuma" fusion method:
    1. Clone the edited fusibile repo.
    2. Refer to fusibile configuration blog (Chinese) for building details.
    3. Create a new python2.7 conda env.
      conda create -n fusibile python=2.7
      conda activate fusibile
      conda install scipy matplotlib
      conda install tensorflow==1.14.0
      conda install -c https://conda.anaconda.org/menpo opencv
    4. Use the fusibile conda environment for gipuma fusion method.
  3. Download the ObsMask and Points of DTU GT point clouds from the official website and organize them as:

    dtu-evaluation/
    ├── ObsMask
    └── Points
    
  4. Set up MATLAB in command-line mode and run bash scripts/dtu/matlab_quan_dtu.sh. You can adjust the num_at_once setting according to your machine's CPU and memory limits. After quantitative evaluation, you will get [FUSION_METHOD]_quantitative/ and [THISNAME].log, which store the quantitative results.

3.2 Tanks and Temples

For testing on Tanks and Temples benchmark, you can use any of the following configurations:

  • Train only on the DTU training dataset.
  • Train only on the BlendedMVS dataset.
  • Pretrain on the DTU training dataset and finetune on the BlendedMVS dataset. (Recommended)

After training, follow these steps:

  1. Run bash scripts/tnt/test_tnt.sh for depth map estimation. The results will be stored in outputs/[TRAINING_DATASET]/[THISNAME]/.
    • Use outputs/visual.ipynb for depth map visualization.
  2. Run bash scripts/tnt/fusion_tnt.sh for point cloud fusion. We provide the popular dynamic fusion strategy, and you can tune the fusion threshold in fusions/tnt/dypcd.py.
  3. Follow the Upload Instructions on the T&T official website to make online submissions.

3.3 Custom Data (TODO)

GeoMVSNet can reconstruct custom data. For now, organize your data following MVSNet and follow the same steps as above for depth estimation and point cloud fusion.

💡 Results

Our results on DTU and Tanks and Temples Dataset are listed in the tables.

DTU Dataset   Acc. ↓   Comp. ↓   Overall ↓
GeoMVSNet     0.3309   0.2593    0.2951

T&T (Intermediate)   Mean ↑   Family   Francis   Horse   Lighthouse   M60     Panther   Playground   Train
GeoMVSNet            65.89    81.64    67.53     55.78   68.02        65.49   67.19     63.27        58.22

T&T (Advanced)   Mean ↑   Auditorium   Ballroom   Courtroom   Museum   Palace   Temple
GeoMVSNet        41.52    30.23        46.53      39.98       53.05    35.98    43.34
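
The summary columns in the tables above can be re-derived from the other columns, which is a quick sanity check when comparing your own runs against these numbers. A minimal sketch, assuming the usual conventions that DTU Overall is the average of Acc. and Comp. and that the T&T Mean is the unweighted average of the per-scene scores; the function names are hypothetical.

```python
def dtu_overall(acc, comp):
    """Average of accuracy and completeness (lower is better for all three)."""
    return (acc + comp) / 2.0


def mean_score(per_scene):
    """Unweighted mean of per-scene scores."""
    return sum(per_scene.values()) / len(per_scene)


# Per-scene values copied from the tables above.
TNT_INTERMEDIATE = {
    "Family": 81.64, "Francis": 67.53, "Horse": 55.78, "Lighthouse": 68.02,
    "M60": 65.49, "Panther": 67.19, "Playground": 63.27, "Train": 58.22,
}
TNT_ADVANCED = {
    "Auditorium": 30.23, "Ballroom": 46.53, "Courtroom": 39.98,
    "Museum": 53.05, "Palace": 35.98, "Temple": 43.34,
}

if __name__ == "__main__":
    print(round(dtu_overall(0.3309, 0.2593), 4))
    print(round(mean_score(TNT_INTERMEDIATE), 2))
    print(round(mean_score(TNT_ADVANCED), 2))
```

The rounded results agree with the Overall and Mean columns reported in the tables.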

You can also download our Point Cloud and Estimated Depth for academic use.

🌟 About Reproducing Paper Results

In our experiments, we found that reproducing MVS networks is relatively difficult. We therefore summarize some of the problems we encountered below, in the hope that they are helpful to you.

Q1. GPU Architecture Matters.

There are two commonly used NVIDIA GPU series: GeForce RTX (e.g. 4090, 3090Ti, 2080Ti) and Tesla (e.g. V100, T4). We find that there is generally no performance degradation when training and testing on the same series of GPUs. In contrast, if you train on a V100 and test on a 3090Ti, for example, the depth maps look visually identical, but the individual pixel values are not exactly the same. We conjecture that the two series or architectures differ in numerical computation and processing precision.

Our pretrained model is trained on NVIDIA V100 GPUs.

Q2. PyTorch Version Matters.

The CUDA version determines which PyTorch versions you can install, and different PyTorch versions affect the accuracy of network training and testing. One of the reasons we found is that the implementation and parameter handling of F.grid_sample() vary across PyTorch versions.
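
One documented change of this kind is that, since PyTorch 1.3, F.grid_sample() defaults to align_corners=False, whereas earlier releases behaved like align_corners=True. Whether this is the specific difference behind the authors' observation is an assumption. The sketch below is plain Python rather than a call into PyTorch: the hypothetical helper unnormalize shows how the two settings map the same normalized grid coordinate to different pixel positions.

```python
def unnormalize(coord, size, align_corners):
    """Map a normalized grid coordinate in [-1, 1] to a pixel position along one axis."""
    if align_corners:
        # -1 and 1 refer to the centers of the first and last pixels.
        return (coord + 1.0) / 2.0 * (size - 1)
    # -1 and 1 refer to the outer edges of the first and last pixels.
    return ((coord + 1.0) * size - 1.0) / 2.0


if __name__ == "__main__":
    width = 4  # illustrative image width, not a value from the repository
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(x, unnormalize(x, width, True), unnormalize(x, width, False))
```

Only the center coordinate maps to the same position under both settings, so warped features sampled near image borders can shift by up to half a pixel. Passing align_corners explicitly in your own calls makes the behavior independent of the installed version's default.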

Q3. Training Hyperparameters Matter.

In the era of neural networks, hyperparameters really matter. We tuned some of the network hyperparameters, but our settings may not match your configuration. Most fundamentally, because GPU memory differs between machines, you need to adjust batch_size and lr together. The learning rate schedule also matters.
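
A common starting point when the batch size has to change is the linear scaling heuristic, which scales the learning rate in proportion to the batch size. This is a general rule of thumb and not something the authors prescribe; the reference values batch_size=16 and lr=0.025 are taken from the general training script in 2.1, and scale_lr is a hypothetical helper name.

```python
def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Scale the learning rate linearly with the batch size (heuristic only)."""
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


if __name__ == "__main__":
    # Reference pair from the general training script: batch_size=16, lr=0.025.
    for batch_size in (4, 8, 16, 32):
        print(batch_size, scale_lr(0.025, 16, batch_size))
```

Treat the result as an initial guess to tune from, since the best value also depends on the training mode and the learning rate schedule.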

Q4. Testing Epoch Matters.

By default, our model trains for 16 epochs. But how do you select the best checkpoint for testing? One solution is to use PyTorch Lightning. For simplicity, you can decide which checkpoint to use based on the .log file we provide.

Q5. Fusion Hyperparameters Matter.

For both the DTU and T&T datasets, the point cloud fusion hyperparameters greatly affect the final performance. We provide different fusion strategies and easy access to their parameters. You may need to get to know the temperament of your own model.

Qx. Others. You can raise an issue if you run into other problems.


⚖️ Citation

@InProceedings{zhe2023geomvsnet,
  title={GeoMVSNet: Learning Multi-View Stereo With Geometry Perception},
  author={Zhang, Zhe and Peng, Rui and Hu, Yuxi and Wang, Ronggang},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={21508--21518},
  year={2023}
}

💌 Acknowledgements

This repository is partly based on MVSNet, MVSNet-pytorch, CVP-MVSNet, cascade-stereo, and MVSTER.

We appreciate their contributions to the MVS community.
