  • Language
    Python
  • License
    MIT License

Hierarchical Aggregation for 3D Instance Segmentation (ICCV 2021)

HAIS

by
Shaoyu Chen, Jiemin Fang, Qian Zhang, Wenyu Liu, Xinggang Wang†. (†: corresponding author)



Update

2022.4.28:

  • HAIS serves as a baseline for the STPLS3D dataset. The code for HAIS on STPLS3D is available in [this GitHub repo]. STPLS3D is a large-scale photogrammetry 3D point cloud dataset composed of high-quality, richly annotated point clouds from real-world and synthetic environments.

[Figure: STPLS3D leaderboard]

2021.9.30:

  • Code is released.
  • With better CUDA optimization, HAIS now takes only 339 ms on TITAN X, well below the latency reported in the paper (410 ms on TITAN X).

Introduction

  • HAIS is an efficient and concise bottom-up framework (NMS-free and single-forward) for point cloud instance segmentation. It uses hierarchical aggregation (point aggregation followed by set aggregation) to generate instances, and intra-instance prediction to filter outliers and score mask quality.
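
The two aggregation stages can be illustrated with a toy, pure-Python sketch. This is not the repository's CUDA implementation: the fixed radii, the `min_primary_size` threshold, the centroid-distance merge rule, and every function name are assumptions made for illustration. The points are assumed to have already been moved toward their instance centers, and the intra-instance prediction step is not covered.

```python
from collections import deque


def distance(a, b):
    """Euclidean distance between two equal-length coordinate tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def point_aggregation(points, labels, radius):
    """Group same-label points that are linked by hops of at most `radius`."""
    cluster_of = [-1] * len(points)
    clusters = []
    for seed in range(len(points)):
        if cluster_of[seed] != -1:
            continue
        cluster_id = len(clusters)
        cluster_of[seed] = cluster_id
        members, queue = [], deque([seed])
        while queue:
            i = queue.popleft()
            members.append(i)
            for j in range(len(points)):
                if (cluster_of[j] == -1 and labels[j] == labels[i]
                        and distance(points[i], points[j]) <= radius):
                    cluster_of[j] = cluster_id
                    queue.append(j)
        clusters.append(members)
    return clusters


def centroid(points, members):
    """Mean coordinate of the points indexed by `members`."""
    dims = len(points[0])
    return tuple(sum(points[i][d] for i in members) / len(members)
                 for d in range(dims))


def set_aggregation(points, labels, clusters, min_primary_size, merge_radius):
    """Merge small fragments into the nearest same-label primary cluster.

    Assumption of this sketch: fragments with no primary cluster within
    `merge_radius` are simply discarded.
    """
    primaries = [list(c) for c in clusters if len(c) >= min_primary_size]
    fragments = [c for c in clusters if len(c) < min_primary_size]
    # Centroids are taken before any merging so the merge order does not matter.
    centers = [centroid(points, p) for p in primaries]
    for fragment in fragments:
        fragment_center = centroid(points, fragment)
        best, best_distance = None, merge_radius
        for k, primary in enumerate(primaries):
            if labels[primary[0]] != labels[fragment[0]]:
                continue
            d = distance(fragment_center, centers[k])
            if d <= best_distance:
                best, best_distance = k, d
        if best is not None:
            primaries[best].extend(fragment)
    return primaries
```

In this sketch, point aggregation tends to over-segment (a stray point just beyond the radius becomes its own fragment), and set aggregation is what reattaches such fragments to a nearby primary cluster.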

[Figure: HAIS framework]

[Figure: leaderboard]

  • High speed. Thanks to the NMS-free and single-forward inference design, HAIS achieves the best inference speed among all existing methods. HAIS only takes 206 ms on RTX 3090 and 339 ms on TITAN X.
Method        Per-frame latency on TITAN X
ASIS          181913 ms
SGPN          158439 ms
3D-SIS        124490 ms
GSPN          12702 ms
3D-BoNet      9202 ms
GICN          8615 ms
OccuSeg       1904 ms
PointGroup    452 ms
HAIS          339 ms
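
To put these latencies in relative terms, the short script below computes how many times slower each method is than HAIS. The numbers are copied from the table above; the helper name `slowdown_vs` is only illustrative.

```python
# Per-frame latency on TITAN X in milliseconds, copied from the table above.
latency_ms = {
    "ASIS": 181913,
    "SGPN": 158439,
    "3D-SIS": 124490,
    "GSPN": 12702,
    "3D-BoNet": 9202,
    "GICN": 8615,
    "OccuSeg": 1904,
    "PointGroup": 452,
    "HAIS": 339,
}


def slowdown_vs(method, baseline="HAIS"):
    """Return latency(method) / latency(baseline)."""
    return latency_ms[method] / latency_ms[baseline]


if __name__ == "__main__":
    for method in latency_ms:
        print(f"{method:<12}{slowdown_vs(method):10.2f}x")
```

For example, PointGroup at 452 ms is 452 / 339, or about 1.33 times the HAIS latency.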

Installation

1) Environment

  • Python 3.x
  • PyTorch 1.1 or higher
  • CUDA 9.2 or higher
  • gcc-5.4 or higher
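
A small helper like the following could be used to compare installed versions against the minimums listed above. It is a sketch, not part of the repository: the function names are hypothetical and the installed version strings have to be supplied by the caller. Versions are compared as integer tuples because a plain string comparison would rank "1.10" below "1.2".

```python
# Minimum versions taken from the environment list above.
MINIMUMS = {"PyTorch": "1.1", "CUDA": "9.2", "gcc": "5.4"}


def parse_version(text):
    """Turn '1.10.0+cu113' into (1, 10, 0) using the leading digits of each part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """True if `installed` is at least `minimum`, comparing numerically."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


def unmet_requirements(installed):
    """Return the names in MINIMUMS that are missing or too old in `installed`."""
    return [name for name, minimum in MINIMUMS.items()
            if name not in installed or not meets_minimum(installed[name], minimum)]
```

Calling `unmet_requirements({"PyTorch": "1.10.0", "CUDA": "10.2", "gcc": "7.5.0"})` returns an empty list, meaning all three minimums are satisfied.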

Create a conda virtual environment and activate it.

conda create -n hais python=3.7
conda activate hais

2) Clone the repository.

git clone https://github.com/hustvl/HAIS.git --recursive

3) Install the requirements.

cd HAIS
pip install -r requirements.txt
conda install -c bioconda google-sparsehash 

4) Install spconv

  • Verify the version of spconv.

    spconv 1.0, compatible with CUDA < 11 and PyTorch < 1.5, is already cloned recursively into HAIS/lib/spconv in step 2) by default.

    For newer versions of CUDA and PyTorch, spconv 1.2 is suggested. Replace HAIS/lib/spconv with this fork of spconv.

git clone https://github.com/outsidercsy/spconv.git --recursive
  Note: In the provided spconv 1.0 and 1.2, spconv/spconv/functional.py is modified to make grad_output contiguous. Make sure you use the modified spconv rather than the original one; otherwise optimization bugs may occur.
  • Install the dependent libraries.
conda install libboost
conda install -c daleydeng gcc-5 # (optional, install gcc-5.4 in conda env)
  • Compile the spconv library.
cd HAIS/lib/spconv
python setup.py bdist_wheel
  • Install the generated .whl file.
cd HAIS/lib/spconv/dist
pip install {wheel_file_name}.whl
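
The spconv version rule from step 4) can be written down as a small function. This is a sketch under one stated assumption: spconv 1.2 is suggested whenever either CUDA or PyTorch falls outside the range given for spconv 1.0. The function names are hypothetical.

```python
def major_minor(version):
    """Return (major, minor) as integers from a string such as '10.2' or '1.4.0'."""
    parts = version.split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def suggest_spconv(cuda_version, torch_version):
    """Suggest '1.0' for CUDA < 11 and PyTorch < 1.5, otherwise '1.2'."""
    if major_minor(cuda_version) < (11, 0) and major_minor(torch_version) < (1, 5):
        return "1.0"
    return "1.2"
```

With PyTorch installed, `torch.version.cuda` and `torch.__version__` would supply the two arguments; note that `torch.version.cuda` is `None` on CPU-only builds, which this sketch does not handle.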

5) Compile the external C++ and CUDA ops.

cd HAIS/lib/hais_ops
export CPLUS_INCLUDE_PATH={conda_env_path}/hais/include:$CPLUS_INCLUDE_PATH
python setup.py build_ext develop

{conda_env_path} is the directory that contains the created conda environment, e.g., /anaconda3/envs.
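
If the hais environment is active, the include directory can also be derived from the running interpreter instead of being typed by hand. The sketch below assumes that `sys.prefix` points at the environment root (i.e. {conda_env_path}/hais), which is the usual case inside an activated conda environment; the function names are illustrative.

```python
import os
import sys


def conda_include_dir(env_prefix=None):
    """Return <env>/include for the given prefix (default: the running interpreter's)."""
    prefix = env_prefix or sys.prefix
    return os.path.join(prefix, "include")


def export_line(env_prefix=None):
    """Build the shell export line used before compiling hais_ops."""
    return "export CPLUS_INCLUDE_PATH={}:$CPLUS_INCLUDE_PATH".format(
        conda_include_dir(env_prefix))


if __name__ == "__main__":
    print(export_line())
```

Running the script inside the environment prints an export line that can be pasted into the shell before `python setup.py build_ext develop`.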

Data Preparation

1) Download the ScanNet v2 dataset.

2) Put the data in the corresponding folders.

  • Copy the files [scene_id]_vh_clean_2.ply, [scene_id]_vh_clean_2.labels.ply, [scene_id]_vh_clean_2.0.010000.segs.json and [scene_id].aggregation.json into the dataset/scannetv2/train and dataset/scannetv2/val folders according to the ScanNet v2 train/val split.

  • Copy the files [scene_id]_vh_clean_2.ply into the dataset/scannetv2/test folder according to the ScanNet v2 test split.

  • Put the file scannetv2-labels.combined.tsv in the dataset/scannetv2 folder.

The dataset files are organized as follows.

HAIS
├── dataset
│   ├── scannetv2
│   │   ├── train
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── val
│   │   │   ├── [scene_id]_vh_clean_2.ply & [scene_id]_vh_clean_2.labels.ply & [scene_id]_vh_clean_2.0.010000.segs.json & [scene_id].aggregation.json
│   │   ├── test
│   │   │   ├── [scene_id]_vh_clean_2.ply
│   │   ├── scannetv2-labels.combined.tsv
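
Before generating the input files, it can help to confirm that every scene has the files listed above. The following check is a sketch and not part of the repository; the function name is hypothetical and the scene IDs must be supplied by the caller, for example from the official split lists.

```python
import os

# File suffixes required per scene, as listed in the layout above.
TRAINVAL_SUFFIXES = [
    "_vh_clean_2.ply",
    "_vh_clean_2.labels.ply",
    "_vh_clean_2.0.010000.segs.json",
    ".aggregation.json",
]
TEST_SUFFIXES = ["_vh_clean_2.ply"]


def missing_files(split_dir, scene_ids, suffixes):
    """Return the expected paths under `split_dir` that do not exist as files."""
    missing = []
    for scene_id in scene_ids:
        for suffix in suffixes:
            path = os.path.join(split_dir, scene_id + suffix)
            if not os.path.isfile(path):
                missing.append(path)
    return missing
```

For instance, `missing_files("dataset/scannetv2/train", ["scene0000_00"], TRAINVAL_SUFFIXES)` returns an empty list only when all four files for that scene are in place.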

3) Generate input files [scene_id]_inst_nostuff.pth for instance segmentation.

cd HAIS/dataset/scannetv2
python prepare_data_inst.py --data_split train
python prepare_data_inst.py --data_split val
python prepare_data_inst.py --data_split test

Training

CUDA_VISIBLE_DEVICES=0 python train.py --config config/hais_run1_scannet.yaml 

Inference

1) To evaluate on the validation set:

  • Prepare the .txt instance ground-truth files as follows.
cd dataset/scannetv2
python prepare_data_inst_gttxt.py
  • Set split and eval in the config file to val and True.

  • Run the inference and evaluation code.

CUDA_VISIBLE_DEVICES=0 python test.py --config config/hais_run1_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$

Pretrained model: Google Drive / Baidu Cloud [code: sh4t]. mAP/mAP50/mAP25 is 44.1/64.4/75.7.

2) To evaluate on the test set:

  • Set (split, eval, save_instance) in the config file to (test, False, True).
  • Run the inference code. Prediction results are saved in HAIS/exp by default.
CUDA_VISIBLE_DEVICES=0 python test.py --config config/hais_run1_scannet.yaml --pretrain $PATH_TO_PRETRAIN_MODEL$
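
Switching between the two evaluation modes means editing the same config keys each time, which could be scripted. The sketch below rewrites simple `key: value` lines in YAML text; it assumes the keys appear on their own lines in that form, and it rewrites every occurrence of a key regardless of section. The function name and the two presets are illustrative, with values taken from the steps above.

```python
import re

# Settings from the two evaluation procedures above.
VAL_SETTINGS = {"split": "val", "eval": "True"}
TEST_SETTINGS = {"split": "test", "eval": "False", "save_instance": "True"}

# indent, key, separator, old value, optional trailing comment
LINE_PATTERN = re.compile(r"^(\s*)([A-Za-z_]\w*)(\s*:\s*)([^#\n]*?)(\s*(#.*)?)$")


def set_config_values(yaml_text, updates):
    """Replace the value of each key in `updates`, keeping indentation and comments."""
    lines = []
    for line in yaml_text.splitlines():
        match = LINE_PATTERN.match(line)
        if match and match.group(2) in updates:
            indent, key, sep, _old, tail = match.group(1, 2, 3, 4, 5)
            line = f"{indent}{key}{sep}{updates[key]}{tail}"
        lines.append(line)
    return "\n".join(lines)
```

A possible use is to read config/hais_run1_scannet.yaml, apply `TEST_SETTINGS`, and write the result to a separate file so the original config stays untouched.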

Visualization

We provide visualization tools based on Open3D (tested on Open3D 0.8.0).

pip install open3d==0.8.0
python visualize_open3d.py --data_path {} --prediction_path {} --data_split {} --room_name {} --task {}

Please refer to visualize_open3d.py for more details.

Acknowledgement

The code is based on PointGroup and spconv. We also thank the STPLS3D authors for extending HAIS.

Contact

If you have any questions or suggestions about this repo, please feel free to contact me ([email protected]).

Citing HAIS

If you find HAIS useful in your research or applications, please consider giving us a star 🌟 and citing HAIS with the following BibTeX entry.

@InProceedings{Chen_HAIS_2021_ICCV,
    author    = {Chen, Shaoyu and Fang, Jiemin and Zhang, Qian and Liu, Wenyu and Wang, Xinggang},
    title     = {Hierarchical Aggregation for 3D Instance Segmentation},
    booktitle = {ICCV},
    year      = {2021},
}
