
Official source code of Fast Point Transformer, CVPR 2022

Fast Point Transformer

Project Page | Paper

This repository contains the official source code and data for our paper:

Fast Point Transformer
Chunghyun Park, Yoonwoo Jeong, Minsu Cho, and Jaesik Park
POSTECH GSAI & CSE
CVPR, New Orleans, 2022.

An Overview of the proposed pipeline

Overview

This work introduces Fast Point Transformer, which consists of a new lightweight self-attention layer. Our approach encodes continuous 3D coordinates, and its voxel hashing-based architecture boosts computational efficiency. The proposed method is demonstrated on 3D semantic segmentation and 3D detection. Its accuracy is competitive with the best voxel-based method, and our network achieves 129× faster inference than the state-of-the-art Point Transformer, with a reasonable accuracy trade-off, on 3D semantic segmentation on the S3DIS dataset.
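The voxel hashing idea can be illustrated with a minimal sketch (not the paper's implementation): continuous 3D coordinates are quantized to integer voxel indices, while each point keeps its continuous offset from the voxel center, which a positional encoding can consume. The hash constants below are a common spatial-hashing choice and are an assumption, not taken from this codebase.

```python
import numpy as np

def voxelize(points, voxel_size):
    """Quantize continuous 3D points to integer voxel indices and keep the
    continuous offset of each point from its voxel center. Illustrative
    sketch of voxel hashing + continuous coordinates, not the paper's code."""
    indices = np.floor(points / voxel_size).astype(np.int64)  # voxel index per point
    centers = (indices + 0.5) * voxel_size                    # voxel center coordinates
    offsets = points - centers                                # residual, |offset| <= voxel_size / 2
    # hash each (i, j, k) index to one key so points in the same voxel collide
    keys = indices[:, 0] * 73856093 ^ indices[:, 1] * 19349669 ^ indices[:, 2] * 83492791
    return indices, offsets, keys

points = np.array([[0.01, 0.02, 0.03], [0.03, 0.01, 0.02], [0.09, 0.02, 0.01]])
indices, offsets, keys = voxelize(points, voxel_size=0.04)  # 4cm voxels
# the first two points share a voxel key; the third falls in a different voxel
```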

Citation

If you find our code or paper useful, please consider citing our paper:

@inproceedings{park2022fast,
 title={Fast Point Transformer},
 author={Park, Chunghyun and Jeong, Yoonwoo and Cho, Minsu and Park, Jaesik},
 booktitle={Proceedings of the {IEEE/CVF} Conference on Computer Vision and Pattern Recognition (CVPR)},
 month={June},
 year={2022},
 pages={16949-16958}
}

Experiments

1. S3DIS Area 5 test

We denote MinkowskiNet42 trained with this repository as MinkowskiNet42†. We use a voxel size of 4cm for both MinkowskiNet42† and our Fast Point Transformer.

| Model | Latency (sec) | mAcc (%) | mIoU (%) | Reference |
|---|---|---|---|---|
| PointTransformer | 18.07 | 76.5 | 70.4 | Codes from the authors |
| MinkowskiNet42† | 0.08 | 74.1 | 67.2 | Checkpoint |
| &nbsp;&nbsp;+ rotation average | 0.66 | 75.1 | 69.0 | - |
| FastPointTransformer | 0.14 | 76.6 | 69.2 | Checkpoint |
| &nbsp;&nbsp;+ rotation average | 1.13 | 77.6 | 71.0 | - |
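For reference, the 129× speedup quoted in the overview is simply the latency ratio between PointTransformer and FastPointTransformer in the S3DIS table:

```python
# latency ratio between PointTransformer (18.07 s) and FastPointTransformer (0.14 s)
speedup = 18.07 / 0.14
print(round(speedup))  # 129
```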

2. ScanNetV2 validation

| Model | Voxel Size | mAcc (%) | mIoU (%) | Reference |
|---|---|---|---|---|
| MinkowskiNet42 | 2cm | 80.4 | 72.2 | Official GitHub |
| MinkowskiNet42† | 2cm | 81.4 | 72.1 | Checkpoint |
| FastPointTransformer | 2cm | 81.2 | 72.5 | Checkpoint |
| MinkowskiNet42† | 5cm | 76.3 | 67.0 | Checkpoint |
| FastPointTransformer | 5cm | 78.9 | 70.0 | Checkpoint |
| MinkowskiNet42† | 10cm | 70.8 | 60.7 | Checkpoint |
| FastPointTransformer | 10cm | 76.1 | 66.5 | Checkpoint |

Installation

This repository is developed and tested on

  • Ubuntu 18.04 and 20.04
  • Conda 4.11.0
  • CUDA 11.1 and 11.3
  • Python 3.8.13
  • PyTorch 1.7.1, 1.10.0, and 1.12.1
  • MinkowskiEngine 0.5.4

Environment Setup

You can install the environment by using the provided shell script:

~$ git clone --recursive [email protected]:POSTECH-CVLab/FastPointTransformer.git
~$ cd FastPointTransformer
~/FastPointTransformer$ bash setup.sh fpt
~/FastPointTransformer$ conda activate fpt

Training & Evaluation

First, you need to download the datasets (ScanNetV2 and S3DIS) and preprocess them:

(fpt) ~/FastPointTransformer$ python src/data/preprocess_scannet.py # you need to modify the data path
(fpt) ~/FastPointTransformer$ python src/data/preprocess_s3dis.py # you need to modify the data path

Then, place the provided metadata of each dataset (src/data/meta_data) alongside the preprocessed data, following the structure below:

${data_dir}
├── scannetv2
│   ├── meta_data
│   │   ├── scannetv2_train.txt
│   │   ├── scannetv2_val.txt
│   │   └── ...
│   └── scannet_processed
│       ├── train
│       │   ├── scene0000_00.ply
│       │   ├── scene0000_01.ply
│       │   └── ...
│       └── test
└── s3dis
    ├── meta_data
    │   ├── area1.txt
    │   ├── area2.txt
    │   └── ...
    └── s3dis_processed
        ├── Area_1
        │   ├── conferenceRoom_1.ply
        │   ├── conferenceRoom_2.ply
        │   └── ...
        ├── Area_2
        └── ...

Then, you can train and evaluate a model using the provided Python scripts (train.py and eval.py) with the configuration files in the config directory. For example, you can train and evaluate Fast Point Transformer with a 4cm voxel size on the S3DIS dataset via the following commands:

(fpt) ~/FastPointTransformer$ python train.py config/s3dis/train_fpt.gin
(fpt) ~/FastPointTransformer$ python eval.py config/s3dis/eval_fpt.gin {checkpoint_file} # use -r option for rotation averaging.
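Rotation averaging (the `-r` option) generally means running the network on several rotated copies of the input and averaging the per-point logits. A minimal sketch of that idea, with a stand-in `model` callable (an assumption for illustration, not the repository's eval.py logic):

```python
import numpy as np

def rotation_average(points, model, num_rotations=8):
    """Average per-point class logits over yaw rotations of the input.
    `model` is any callable mapping (N, 3) points to (N, C) logits.
    Illustrative sketch only, not the repository's implementation."""
    n_classes = model(points).shape[1]
    logits_sum = np.zeros((points.shape[0], n_classes))
    for k in range(num_rotations):
        theta = 2.0 * np.pi * k / num_rotations
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])  # rotate about the up (z) axis
        logits_sum += model(points @ rot.T)
    return logits_sum / num_rotations
```

Final labels would then be taken as `rotation_average(points, model).argmax(axis=1)`; the averaging trades extra forward passes (see the latency column in the S3DIS table) for slightly better accuracy.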

Consistency Score

You need to generate predictions via the following command:

(fpt) ~/FastPointTransformer$ python -m src.cscore.prepare {checkpoint_file} -m {model_name} -v {voxel_size} # This takes hours.

Then, you can calculate the consistency score (CScore) with:

(fpt) ~/FastPointTransformer$ python -m src.cscore.calculate {prediction_dir} # This takes seconds.
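Conceptually, the consistency score measures how often the model predicts the same label for the same point across different inputs covering it. A minimal sketch of that agreement ratio (illustrative only; see src/cscore for the repository's actual definition):

```python
import numpy as np

def consistency_score(preds_a, preds_b):
    """Fraction of points receiving the same predicted label in two runs
    over the same underlying points. Illustrative sketch only."""
    preds_a, preds_b = np.asarray(preds_a), np.asarray(preds_b)
    assert preds_a.shape == preds_b.shape
    return float((preds_a == preds_b).mean())

consistency_score([1, 2, 3, 3], [1, 2, 0, 3])  # 3 of 4 labels agree -> 0.75
```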

3D Object Detection using VoteNet

Please refer to this repository.

Acknowledgement

Our code is built on MinkowskiEngine. We also thank Hengshuang Zhao for providing the code of Point Transformer. If you use our model, please consider citing them as well.
