Pointwise Convolutional Neural Networks
This is the code release for the paper "Pointwise Convolutional Neural Networks", published at CVPR 2018.
Usage
The code was tested on Ubuntu 18.04 LTS with CUDA 9.2 and TensorFlow 1.9.
First, we need to compile the convolution operator as follows:
cd tf_ops/conv3p/
chmod 777 tf_conv3p_compile.sh
./tf_conv3p_compile.sh -a
The result is a dynamic library file named tf_conv3p.so. The Python training and evaluation code loads this library for pointwise convolution.
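As a rough illustration of that loading step, the sketch below resolves the library path and loads it with tf.load_op_library. It assumes the compiled file is left in tf_ops/conv3p/, the directory the compile script is run from. The helper names conv3p_library_path and load_conv3p are hypothetical and are not part of the released code.

```python
import os


def conv3p_library_path(repo_root="."):
    """Return the expected location of the compiled operator library.

    Assumption: the library is left in tf_ops/conv3p/, the directory the
    compile script is run from. Adjust if your build places it elsewhere.
    """
    return os.path.join(repo_root, "tf_ops", "conv3p", "tf_conv3p.so")


def load_conv3p(repo_root="."):
    """Load the custom operator module, failing early if it was not built."""
    path = conv3p_library_path(repo_root)
    if not os.path.isfile(path):
        raise IOError("%s not found; compile the operator first" % path)
    # Imported lazily so the path helper works without TensorFlow installed.
    import tensorflow as tf
    return tf.load_op_library(path)
```

Checking for the file before loading gives a clearer error message than the loader would if the compile step was skipped.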
By default, the library contains both a CPU and a GPU implementation of the convolution operator. The use_gpu flag in param.json can be set to true to enable the convolution on the GPU.
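If you prefer to toggle the flag from a script rather than by hand, something like the following might work. It is a minimal sketch that assumes use_gpu is a top-level key in param.json; the helper name set_use_gpu is hypothetical.

```python
import json


def set_use_gpu(param_path, enabled):
    """Set the use_gpu flag in a JSON parameter file and write it back.

    Assumption: use_gpu is a top-level key. Other settings are preserved.
    """
    with open(param_path) as f:
        params = json.load(f)
    params["use_gpu"] = bool(enabled)
    with open(param_path, "w") as f:
        json.dump(params, f, indent=2)
    return params
```

For example, set_use_gpu("param.json", True) would enable the GPU implementation before training.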
To train object classification, execute
python train_modelnet40_acsd.py [epoch]
To evaluate, execute
python eval_modelnet40_acsd.py [epoch]
If epoch is not passed to the above commands, it defaults to 0. During training, the network is saved after each epoch. To resume training from a previously saved network, pass its epoch number to the training command.
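The optional epoch argument could be read along these lines. This is only a sketch of the default-to-0 behaviour described here, not the actual parsing code in the training script; the function name parse_epoch is hypothetical.

```python
def parse_epoch(argv):
    """Return the epoch given as the first command-line argument, or 0.

    argv follows the sys.argv convention: argv[0] is the script name.
    """
    if len(argv) > 1:
        return int(argv[1])
    return 0
```

A script would call parse_epoch(sys.argv), so that running python train_modelnet40_acsd.py 5 yields epoch 5 and omitting the argument yields 0.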
A similar code structure is used for the scene segmentation task. For this task, we also provide a re-implementation of PointNet in PyTorch, based on the open-source implementation by fxia22.
Training Data
- ModelNet40: 450 MB.
- SceneNN Segmentation: 5.5 GB. 76 scenes, re-annotated with the 40 NYU-D v2 classes: 56 scenes for training and 20 scenes for testing.
- S3DIS Segmentation: 1.6 GB.
Troubleshooting
If you are using TensorFlow 1.4, you might want to try compiling with tf_conv3p_compile_tf14.sh instead. It fixes some include paths related to nsync_cv.h, and sets the flag _GLIBCXX_USE_CXX11_ABI=0 to make the library compatible with libraries compiled with GCC versions earlier than 5.1.
Performance
This is a custom convolution operator that we built with only minimal optimization, so you might find it slower than TensorFlow's built-in operators. Even so, the experiments were run on NVIDIA GTX 1070, GTX 1080, and Titan X (first generation) without major issues.
Depending on your setup, training for object recognition takes from a few hours to 1-2 days. Scene segmentation might take longer.
Dependencies
This code includes the following third-party libraries and data:
- Scaled exponential linear units (SeLU) for self-normalization in neural networks.
- ModelNet40 data from PointNet
- Some other utility code from PointNet
- h5py
Citation
If you find this useful for your work, please cite our paper:
@inproceedings{hua-pointwise-cvpr18,
title = {Pointwise Convolutional Neural Networks},
author = {Binh-Son Hua and Minh-Khoi Tran and Sai-Kit Yeung},
booktitle = {Computer Vision and Pattern Recognition (CVPR)},
year = {2018}
}
Future work
We made this simple operator with the hope that existing techniques in 2D image understanding tasks can be brought to 3D in a more straightforward manner. More research along this direction is encouraged.
Please contact the authors if you have any questions.