PLA: Language-Driven Open-Vocabulary 3D Scene Understanding
CVPR 2023
TL;DR: PLA leverages powerful VL foundation models to construct hierarchical 3D-text pairs for 3D open-world learning.
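To make the TL;DR concrete, here is a minimal, assumed sketch of what hierarchical 3D-text pairing can look like: a scene-level caption is associated with all points, while a view-level caption is associated only with the points falling inside that camera's frustum. The function names, the simple angular frustum test, and the data are illustrative placeholders, not the paper's actual caption pipeline.

```python
import numpy as np

def view_mask(points, cam_pos, cam_dir, fov_cos=0.5):
    """Mask of points whose direction from the camera lies within the field of view.

    This is a toy visibility test (angle threshold against the viewing
    direction), standing in for a real projection into the image plane.
    """
    d = points - cam_pos
    d = d / np.linalg.norm(d, axis=1, keepdims=True)
    return d @ cam_dir > fov_cos

def build_pairs(points, scene_caption, views):
    """Build hierarchical (point-mask, caption) pairs.

    views: list of (cam_pos, cam_dir, caption) tuples.
    Scene level: one caption covering every point.
    View level: one caption per camera, covering only visible points.
    """
    pairs = [(np.ones(len(points), dtype=bool), scene_caption)]  # scene level
    for pos, direction, cap in views:                            # view level
        pairs.append((view_mask(points, pos, direction), cap))
    return pairs

# Toy data: three points, one camera looking along +x.
pts = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [-1.0, 0.0, 0.0]])
pairs = build_pairs(
    pts,
    "a room with a piano",
    [(np.zeros(3), np.array([1.0, 0.0, 0.0]), "a piano near a desk")],
)
```

In this sketch the scene-level pair covers all three points, while the view-level pair covers only the point in front of the camera; the real method builds such pairs from captions produced by VL foundation models on posed images.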
Open-vocabulary demo categories: working space, piano, vending machine.
TODO
- Release caption processing code
Getting Started
Installation
Please refer to INSTALL.md for the installation.
Dataset Preparation
Please refer to DATASET.md for dataset preparation.
Training & Inference
Please refer to MODEL.md for training and inference scripts and pretrained models.
Citation
If you find this project useful in your research, please consider citing:
@inproceedings{ding2022language,
title={PLA: Language-Driven Open-Vocabulary 3D Scene Understanding},
author={Ding, Runyu and Yang, Jihan and Xue, Chuhui and Zhang, Wenqing and Bai, Song and Qi, Xiaojuan},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year={2023}
}
Acknowledgement
The code is partly borrowed from OpenPCDet, PointGroup, and SoftGroup.