
CasMVSNet_pl

Unofficial implementation of Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching using pytorch-lightning

Official implementation: CasMVSNet

Reference MVSNet implementation: MVSNet_pl

Update

  1. Implement the group-wise correlation from Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume. It achieves almost the same result as the original variance-based cost volume, but with fewer parameters and lower memory consumption, so it is highly recommended (see the sketch after this list). In contrast, the inverse depth sampling in that paper turns out to have no effect in my experiments, maybe because DTU is an indoor dataset and inverse depth benefits outdoor datasets more. To activate, set --num_groups 8 in training.
  2. 2020/03/06: Add Tanks and temples evaluation!
  3. 2020/03/07: Add BlendedMVS evaluation!
  4. 2020/03/31: Add BlendedMVS training!
  5. 2020/04/30: Add point cloud to mesh guideline!
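
To illustrate the idea behind group-wise correlation (a minimal sketch with hypothetical tensor shapes, not the repo's actual module): instead of taking the per-channel variance across views, the channels are split into G groups and the reference-source dot products are averaged inside each group, so the cost volume has G channels instead of C and the 3D regularization network that consumes it needs far fewer parameters.

import torch

def variance_cost_volume(feat_volumes):
    """Baseline variance-based cost.
    feat_volumes: (B, V, C, D, H, W) reference + warped source features."""
    return feat_volumes.var(dim=1, unbiased=False)      # (B, C, D, H, W)

def groupwise_correlation_volume(ref_feat, warped_feats, num_groups=8):
    """Group-wise correlation cost.
    ref_feat:     (B, C, H, W)       reference view features
    warped_feats: (B, V, C, D, H, W) source features warped to each depth."""
    B, V, C, D, H, W = warped_feats.shape
    G = num_groups
    ref = ref_feat.reshape(B, 1, G, C//G, 1, H, W)      # broadcast over V, D
    src = warped_feats.reshape(B, V, G, C//G, D, H, W)
    corr = (ref*src).mean(dim=3)                        # (B, V, G, D, H, W)
    return corr.mean(dim=1)                             # (B, G, D, H, W)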

Installation

Hardware

  • OS: Ubuntu 16.04 or 18.04
  • NVIDIA GPU with CUDA>=10.0 (tested with 1 RTX 2080Ti)

Software

  • Python==3.7 (installation via anaconda is recommended; use conda create -n casmvsnet_pl python=3.7 to create a conda environment and activate it with conda activate casmvsnet_pl)
  • Python libraries
    • Install core requirements by pip install -r requirements.txt
    • Install Inplace-ABN by pip install inplace-abn

Training

Please see each subsection for training on different datasets. Available training datasets: DTU and BlendedMVS.

DTU dataset

Data download

Download the preprocessed DTU training data and Depth_raw from the original MVSNet repo and unzip them. For a description of how the data was created, please refer to the original paper.

Training model

Run (example)

python train.py \
   --dataset_name dtu \
   --root_dir $DTU_DIR \
   --num_epochs 16 --batch_size 2 \
   --depth_interval 2.65 --n_depths 8 32 48 --interval_ratios 1.0 2.0 4.0 \
   --optimizer adam --lr 1e-3 --lr_scheduler cosine \
   --exp_name exp

Note that the model consumes a huge amount of GPU memory, so the batch size is generally small.

See opt.py for all configurations.
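
As a quick sanity check of how these flags interact (my reading of the example values above, not an official formula): each stage places n_depths hypotheses spaced depth_interval x interval_ratio apart, so the coarsest stage should roughly cover the scene's whole depth range.

depth_interval = 2.65              # base hypothesis spacing (mm for DTU)
n_depths = [8, 32, 48]             # hypotheses per stage, fine -> coarse
interval_ratios = [1.0, 2.0, 4.0]  # spacing multipliers, fine -> coarse

for n, r in zip(n_depths, interval_ratios):
    spacing = depth_interval * r
    print(f"{n:2d} hypotheses x {spacing:5.2f}mm = {n*spacing:6.1f}mm span")
# the coarsest stage spans 48 x 10.60mm = 508.8mm, which roughly covers
# DTU's 425..935mm depth range (~510mm)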

Example training log

log1 log2

Metrics

The metrics are collected on the DTU val set.

|  | resolution | n_views | abs_err | acc_1mm | acc_2mm | acc_4mm | GPU mem in GB (train*/val) |
|---|---|---|---|---|---|---|---|
| Paper | 1152x864 | 5 | N/A | N/A | 82.6% | 88.8% | 10.0 / 5.3 |
| This repo (same as paper) | 640x512 | 3 | 4.524mm | 72.33% | 84.35% | 90.52% | 8.5 / 2.1 |
| This repo (gwc**) | 640x512 | 3 | 4.242mm | 73.99% | 85.85% | 91.57% | 6.5 / 2.1 |

*Training memory is measured on batch size=2 and resolution=640x512.

**Gwc with num_groups=8 and parameters --depth_interval 2.0 --interval_ratios 1.0 2.5 5.5 --num_epochs 50; see update 1. This implementation aims at maintaining the concept of cascade cost volume while building new operations to further increase accuracy or to decrease inference time/GPU memory.

Pretrained model and log

Download the pretrained model and training log in release. The above metrics of This repo (same as paper) correspond to this training, but the model is saved at the 10th epoch (lowest val_loss, though not the best in the other metrics).


BlendedMVS

Run

python train.py \
   --dataset_name blendedmvs \
   --root_dir $BLENDEDMVS_LOW_RES_DIR \
   --num_epochs 16 --batch_size 2 \
   --depth_interval 192.0 --n_depths 8 32 48 --interval_ratios 1.0 2.0 4.0 \
   --optimizer adam --lr 1e-3 --lr_scheduler cosine \
   --exp_name exp

The --depth_interval 192.0 is the product of the coarsest n_depth and the coarsest --interval_ratio: 192.0 = 48 x 4.0.

Some modifications w.r.t original paper

Since BlendedMVS contains outdoor and indoor scenes with a large variety of depth ranges (some from 0.1 to 2 and some from 10 to 200; note that these numbers are not absolute distances in mm, they are in some unknown unit), it is difficult to evaluate absolute accuracy (e.g. an error of 2 might be good for a scene with depth range 10 to 200, but terrible for one with depth range 0.1 to 2). Therefore, I decided to scale the depth ranges to roughly the same scale (about 100 to 1000). It is done here. In this way, the depth ranges of all scenes in BlendedMVS become approximately the same as DTU's (425 to 935), so we can continue to use the same metrics (acc_1mm, etc.) to evaluate the predicted depth maps.

Another advantage of this scaling trick is that a model pretrained on DTU transfers better to BlendedMVS, since the depth ranges are now roughly the same; without scaling, the model yields very bad results if the original depth range is, for example, 0.1 to 2.
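
For concreteness, here is a minimal sketch of such a per-scene rescaling (the choice of factor is hypothetical; the actual code is linked above). The key point is that multiplying the depth bounds and the camera translations by the same constant leaves all pixel projections unchanged:

import numpy as np

def rescale_scene(depth_min, depth_max, extrinsics, target_min=425.0):
    """extrinsics: (V, 4, 4) world-to-camera matrices of the scene's V views.
    Scaling the world by s turns R @ X + t into R @ (s*X) + s*t, so depths
    scale by s while the projected pixel coordinates (x/z, y/z) stay put."""
    scale = target_min / depth_min       # hypothetical choice of scale factor
    extrinsics = extrinsics.copy()
    extrinsics[:, :3, 3] *= scale        # scale the translation parts
    return depth_min*scale, depth_max*scale, extrinsics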

Pretrained model and log

Download the pretrained model and training log in release.


Some code tricks

Since MVS models consume a lot of GPU memory, it is indispensable to apply some code tricks to reduce consumption. I tried the following:

  • Replace BatchNorm+Relu with Inplace-ABN: Reduce the memory by ~15%!
  • del the tensor when it is never accessed later: Only helps a little.
  • Use a = a+b in training and a += b in testing: reduces memory by about 300MB, presumably because the in-place version avoids allocating a new tensor (it cannot be used in training, where autograd may still need the operands). See the sketch after this list.
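
The last two tricks in code form (a toy sketch, not the repo's code):

import torch

a = torch.randn(2, 32, 48, 160, 128)  # a large intermediate tensor
b = torch.randn_like(a)

training = True
if training:
    a = a + b   # out-of-place: autograd may still need the old `a`
else:
    a += b      # in-place: reuses `a`'s storage, no new allocation

del b           # release `b` as soon as it is never accessed again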

Testing

For depth prediction example, see test.ipynb.

For point cloud fusion from depth prediction, please go to evaluations to see the general depth fusion method description, then go to dataset subdirectories for detailed results (qualitative and quantitative).
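
For orientation, the core of such a fusion pipeline is usually a geometric consistency check between views; below is a rough sketch of that check (a hypothetical helper under simplified assumptions, not the code in evaluations):

import numpy as np

def check_geometric_consistency(depth_ref, K_ref, depth_src, K_src, T_ref2src,
                                rel_thresh=0.01):
    """depth_*: (H, W) depth maps; K_*: (3, 3) intrinsics;
    T_ref2src: (4, 4) relative pose from reference to source camera."""
    H, W = depth_ref.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
    pix = np.stack([u, v, np.ones_like(u)], 0).reshape(3, -1).astype(float)
    # back-project reference pixels to 3D, then move them to the source frame
    cam_ref = np.linalg.inv(K_ref) @ (pix * depth_ref.reshape(1, -1))
    cam_src = T_ref2src[:3, :3] @ cam_ref + T_ref2src[:3, 3:]
    proj = K_src @ cam_src
    z = proj[2].reshape(H, W)                  # depth as seen from the source
    safe_z = np.where(z != 0, z, 1.0)
    u_s = proj[0].reshape(H, W) / safe_z
    v_s = proj[1].reshape(H, W) / safe_z
    valid = (z > 0) & (u_s >= 0) & (u_s < W) & (v_s >= 0) & (v_s < H)
    u_i = np.rint(u_s).clip(0, W-1).astype(int)
    v_i = np.rint(v_s).clip(0, H-1).astype(int)
    # keep pixels whose reprojected depth agrees with the source's own
    # prediction at the hit location, within a relative threshold
    d_src = depth_src[v_i, u_i]
    return valid & (np.abs(z - d_src) <= rel_thresh * np.maximum(d_src, 1e-8))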

A video showing the point cloud for scan9 in DTU from different angles, and me (click to watch on YouTube): teaser

Point cloud to mesh

You can follow this great post to convert the point cloud into a mesh file. Poisson reconstruction turns out to be a good choice. Here's what I get after tuning some parameters (the parameters are scene-dependent, so you need to experiment yourself):
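
If you prefer a scriptable route, here is a minimal sketch using Open3D (an assumed toolchain; the linked post may use different software, and fused.ply is a hypothetical input from the fusion step):

import open3d as o3d

pcd = o3d.io.read_point_cloud("fused.ply")      # hypothetical fused point cloud
pcd.estimate_normals()                          # Poisson requires normals
pcd.orient_normals_consistent_tangent_plane(20) # and they should be oriented
mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
    pcd, depth=9)                               # `depth` sets octree resolution
o3d.io.write_triangle_mesh("mesh.ply", mesh)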
