NeurVPS: Neural Vanishing Point Scanning via Conic Convolution
This repository contains the official PyTorch implementation of the paper: Yichao Zhou, Haozhi Qi, Jingwei Huang, Yi Ma. "NeurVPS: Neural Vanishing Point Scanning via Conic Convolution". NeurIPS 2019.
Introduction
NeurVPS is an end-to-end trainable deep network with geometry-inspired convolutional operators for detecting vanishing points in images. With the power of data-driven approaches and geometrical priors, NeurVPS is able to outperform the previous state-of-the-art vanishing point detection methods such as LSD/J-Linkage and Contour (TMM17).
Main Results
Qualitative Measures
SceneCity Urban 3D (SU3) | Natural Scene (TMM17) | ScanNet |
---|---|---|
Random sampled results can be found in the supplementary material of the paper.
Quantitative Measures
SceneCity Urban 3D (SU3) | Natural Scene (TMM17) | ScanNet |
---|---|---|
Here, the x-axis represents the angle accuracy of the detected vanishing points and the y-axis represents the percentage of the results whose error is less than that. Our conic convolutional networks outperform all the baseline methods and previous state-of-the-art vanishing point detection approaches, while naive CNN implementations might under-perform those traditional methods, especially in the high-accuracy regions.
Code Structure
Below is a quick overview of the function of each file.
########################### Data ###########################
data/ # default folder for placing the data
su3/ # folder for SU3 dataset
tmm17/ # folder for TMM17 dataset
scannet-vp/ # folder for ScanNet dataset
logs/ # default folder for storing the output during training
########################### Code ###########################
config/ # neural network hyper-parameters and configurations
su3.yaml # default parameters for SU3 dataset
tmm17.yaml # default parameters for TMM17 dataset
scannet.yaml # default parameters for scannet dataset
dataset/ # all scripts related to data generation
su3.py # script for pre-processing the SU3 dataset to npz
misc/ # misc scripts that are not important
find-radius.py # script for generating figure grids
neurvps/ # neurvps module so you can "import neurvps" in other scripts
models/ # neural network architectures
cpp/ # CUDA kernel for deformable convolution
deformable.py # python wrapper for deformable convolution layers
conic.py # conic convolution layers
hourglass_pose.py # backbone network
vanishing_net.py # main network
datasets.py # reading the training data
trainer.py # trainer
config.py # global variables for configuration
utils.py # misc functions
train.py # script for training the neural network
eval.py # script for evaluating a dataset from a checkpoint
Reproducing Results
Installation
For the ease of reproducibility, you are suggested to install miniconda (or anaconda if you prefer) before following executing the following commands.
git clone https://github.com/zhou13/neurvps
cd neurvps
conda create -y -n neurvps
source activate neurvps
# Replace cudatoolkit=10.1 with your CUDA version: https://pytorch.org/get-started/
conda install -y pytorch cudatoolkit=10.1 -c pytorch
conda install -y tensorboardx gdown -c conda-forge
conda install -y pyyaml docopt matplotlib scikit-image opencv tqdm
mkdir data logs
Downloading the Processed Datasets
Make sure curl
is installed on your system and execute
cd data
gdown 1yRwLv28ozRvjsf9wGwAqzya1xFZ5wYET -O su3.tar.xz
gdown 1rpQNbZQEUff2j2rxr3mBl6xohGFl6sLv -O tmm17.tar.xz
gdown 1y_O9PxZhJ_Ml297FgoWMBLvjC1BvTs9A -O scannet.tar.xz
tar xf su3.tar.xz
tar xf tmm17.tar.xz
tar xf scannet.tar.xz
rm *.tar.xz
cd ..
If gdown
does not work for you, you can download the pre-processed datasets
manually from our Google
Drive and proceed
accordingly.
Training
Execute the following commands to train the neural networks from scratch on 2 GPUs (GPU 0 and GPU 1, specified by -d 0,1
) with the default parameters:
python ./train.py -d 0,1 --identifier su3 config/su3.yaml
python ./train.py -d 0,1 --identifier tmm17 config/tmm17.yaml
python ./train.py -d 0,1 --identifier scannet config/scannet.yaml
The checkpoints and logs will be written to logs/
accordingly.
Note: For TMM17 dataset, due to its small size the model is more senstive to the initialization. You may need to train it multiple times to reach the same performance as the pre-trained model. For SU3, it has been reported that it is possible to achieve higher performance with 4-GPU training than the reported one in the paper, though the training process is more volatile.
Pre-trained Models
You can download our reference pre-trained models from Google Drive. Those pre-trained models should be able to reproduce the numbers in our paper.
Evaluation
Execute the following commands to compute and plot the angular accuracy (AA) curves with trained network checkpoints:
python eval.py -d 0 logs/YOUR_LOG/config.yaml logs/YOUR_LOG/checkpoint_best.pth.tar
FAQ
I do not understand your format of vanishing points.
Uncomment these lines or these lines to visualize vanishing points overlaid with 2D images.
What is the unit of focal length in the yaml and why do I need it?
A: The focal length in our implementation is in the unit of 2/w pixel (w is the image width. only a square image is supported). This follows the convention of the OpenGL projection matrix so that to make it resolution invariant. The focal length is used for uniform sampling of the position of vanishing points. If it is not known, you can set it to some common focal length for your categories of images, as we do in config/tmm17.yaml.
You can also check the function to_label
and to_pixel
, which use the focal length to convert the 3D line direction from and to a 2D vanishing point.
I have a question. How could I get help?
A: You can post an issue on Github, which might help other people that have the same question. You can also send me an email if you think that is more appropriate.
Acknowledgement
We thank Yikai Li from SJTU and Jiajun Wu from MIT for pointing out a bug in the data augmentation for the TMM17 Natural Scene dataset. This work is supported by a research grant from Sony Research.
Citing NeurVPS
If you find NeurVPS useful in your research, please consider citing:
@inproceedings{zhou2019neurvps,
author={Zhou, Yichao and Qi, Haozhi and Huang, Jingwei and Ma, Yi},
title={{NeurVPS}: Neural Vanishing Point Scanning via Conic Convolution},
booktitle={{NeurIPS}},
year={2019}
}