[CVPR23 Highlight] Implementation for Panoptic Lifting

Panoptic Lifting for 3D Scene Understanding


arXiv | Video | Project Page

This repository contains the implementation for the paper:

Panoptic Lifting for 3D Scene Understanding with Neural Fields by Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulò, Norman Müller, Matthias Nießner, Angela Dai and Peter Kontschieder.

Given posed RGB images, Panoptic Lifting optimizes a panoptic radiance field that can be queried for color, depth, semantics, and instances at any point in space. Our method lifts noisy and view-inconsistent machine-generated 2D segmentation masks into a consistent 3D panoptic radiance field, without requiring additional tracking supervision or 3D bounding boxes.
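
As a mental model of the queried quantities, here is a toy sketch in PyTorch. The class and head names are invented for illustration; the actual model (see model/) uses a TensoRF-based representation rather than a plain MLP.

import torch

# Toy stand-in for a panoptic radiance field with separate heads for the
# queried quantities; depth falls out of volume-rendering the density field.
class ToyPanopticField(torch.nn.Module):
    def __init__(self, num_classes=21, num_instances=25):
        super().__init__()
        self.backbone = torch.nn.Linear(3, 64)
        self.color = torch.nn.Linear(64, 3)
        self.density = torch.nn.Linear(64, 1)
        self.semantics = torch.nn.Linear(64, num_classes)
        self.instances = torch.nn.Linear(64, num_instances)

    def forward(self, points):  # points: (N, 3) world-space coordinates
        h = torch.relu(self.backbone(points))
        return (torch.sigmoid(self.color(h)),  # RGB in [0, 1]
                torch.relu(self.density(h)),   # non-negative volume density
                self.semantics(h),             # per-class logits
                self.instances(h))             # per-surrogate-id logits

rgb, sigma, sem_logits, inst_logits = ToyPanopticField()(torch.rand(1024, 3))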

Dependencies

Install requirements from the project root directory:

pip install -r requirements.txt

If errors about missing packages show up, install those packages manually.
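
A quick way to sanity-check the environment afterwards is to import the core libraries the code relies on (pytorch-lightning for training, hydra for configs; see the Structure section below). These package names are an assumption based on that structure; requirements.txt is authoritative.

import torch
import pytorch_lightning
import hydra

# Print versions to confirm the core dependencies resolved correctly.
print(torch.__version__, pytorch_lightning.__version__, hydra.__version__)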

Structure

The overall code structure is as follows:

Folder                 Description
config/                hydra default configs
data/                  processed scenes for different datasets
dataset/               pytorch Dataset implementations
docs/                  project webpage files
inference/             rendering and evaluation code for trained models
model/                 implementations of radiance field representations, their renderers, and losses
pretrained-examples/   pretrained models for scenes from ScanNet, Replica, HyperSim and self-captured (in-the-wild) data
resources/             mappings for ScanNet, 3D-FRONT, COCO etc., plus misc. mesh and blender files
runs/                  model training logs and checkpoints go here, in addition to wandb
trainer/               pytorch-lightning module for training
util/                  misc. utilities for coloring, cameras, metrics, transforms, logging etc.

Pre-trained Models and Data

Download the pretrained models from here and the corresponding processed scene data from here. Extract both zips in the project root directory so that the trained models end up in the pretrained-examples/ directory and the data in the data/ directory. More pretrained models and data from the ScanNet dataset are also provided.
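
After extraction, the layout should look roughly as follows (a sketch assembled from the paths used elsewhere in this README; the exact set of scenes in the zips may differ):

pretrained-examples/
    hypersim_ai001008/
        checkpoints/
            epoch=30-step=590148.ckpt
data/
    replica/
        room_0/
    hypersim/
        ai_001_008/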

Running inference

To run inference, use the following command:

python inference/render_panopli.py <PATH_TO_CHECKPOINT> <IF_TEST_MODE>

This will render the semantics, surrogate-ids, and visualizations to the runs/<experiment> folder. When <IF_TEST_MODE> is True, the test set is rendered (this is the input to the evaluation script later). When it is False, a custom trajectory stored in data/<dataset_name>/<scene_name>/trajectories/trajectory_blender.pkl is rendered.

Example:

python inference/render_panopli.py pretrained-examples/hypersim_ai001008/checkpoints/epoch=30-step=590148.ckpt False
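
If you want to peek inside a custom trajectory before rendering it, a generic inspection snippet like the following works. The pickle's exact contents are an assumption (typically a sequence of camera poses); the loading code in dataset/ is authoritative.

import pickle
from pathlib import Path

# Scene path taken from the evaluation example below; swap in your own scene.
path = Path("data/replica/room_0/trajectories/trajectory_blender.pkl")
with path.open("rb") as f:
    trajectory = pickle.load(f)
print(type(trajectory))  # inspect before assuming a structure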

Evaluation

Use the inference/evaluate.py script to calculate metrics on the folder generated by the inference/render_panopli.py script (make sure you rendered the test set, since labels are not available for novel trajectories).

Example:

python inference/evaluate.py --root_path data/replica/room_0 --exp_path runs/room_0_test_01171740_PanopLi_replicaroom0_easy-longshoreman
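
Among the reported metrics is panoptic quality (the License section below notes the bundled Panoptic Quality code). As a reference, the standard PQ of Kirillov et al. can be sketched as follows; this is the textbook definition, not the repository's evaluation code:

# PQ = (sum of IoUs over matched prediction/ground-truth segment pairs)
#      / (TP + 0.5 * FP + 0.5 * FN), with matches requiring IoU > 0.5.
def panoptic_quality(matched_ious, num_fp, num_fn):
    tp = len(matched_ious)
    denom = tp + 0.5 * num_fp + 0.5 * num_fn
    return sum(matched_ious) / denom if denom > 0 else 0.0

print(panoptic_quality([0.9, 0.8, 0.75], num_fp=1, num_fn=2))  # ≈ 0.544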

Training

To launch training, use the following command from the project root:

python trainer/train_panopli_tensorf.py experiment=<EXPERIMENT_NAME> dataset_root=<PATH_TO_SCENE> wandb_main=True <HYPERPARAMETER_OVERRIDES>

Some example trainings:

ScanNet

python trainer/train_panopli_tensorf.py experiment=scannet042302 wandb_main=True batch_size=4096 dataset_root="data/scannet/scene0423_02/"

Replica

python trainer/train_panopli_tensorf.py experiment=replicaroom0 wandb_main=True batch_size=4096 dataset_root="data/replica/room_0/" lambda_segment=0.75

HyperSim

python trainer/train_panopli_tensorf.py experiment=hypersim001008 wandb_main=True dataset_root="data/hypersim/ai_001_008/" lambda_dist_reg=0 val_check_interval=1 instance_optimization_epoch=4 batch_size=2048 max_epoch=34 late_semantic_optimization=4 segment_optimization_epoch=24 bbox_aabb_reset_epochs=[2,4,8] decay_step=[16,32,48] grid_upscale_epochs=[2,4,8,16,20] lambda_segment=0.5

Self Captured

python trainer/train_panopli_tensorf.py experiment=itw_office0213meeting_andram wandb_main=True batch_size=8192

Data Generation

Preprocessing scripts for data generation are provided in dataset/preprocessing/ for the HyperSim, Replica and ScanNet datasets as well as for in-the-wild captures. For generating training labels, use our test-time augmented version of Mask2Former from here.

ScanNet: For processing ScanNet folders, you will need the scene folder containing the .sens file along with the label zips.

Replica: Use the data provided by the authors of SemanticNeRF and place it in the data/replica/raw/from_semantic_nerf directory.

HyperSim: These scripts require the scene data in the raw folder of the data/hypersim/ directory. For example, to process the HyperSim scene ai_001_008, you need the raw data in the data/hypersim/raw/ai_001_008 directory. Raw HyperSim data for a scene typically contains the _detail and images directories.

Self Captured Data: The preprocessing scripts expect data/itw/raw/<scene_name> to contain a color directory and a transforms.json file with pose information (see InstantNGP for how to generate this).
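
In the InstantNGP convention, transforms.json holds camera intrinsics at the top level and a frames list with per-image camera-to-world poses. A minimal check, with a hypothetical scene folder name:

import json
from pathlib import Path

scene = Path("data/itw/raw/my_scene")  # hypothetical scene name
meta = json.loads((scene / "transforms.json").read_text())
print(len(meta["frames"]), "posed frames")
for frame in meta["frames"][:3]:
    # Each frame references an image and a 4x4 camera-to-world matrix.
    print(frame["file_path"], len(frame["transform_matrix"]))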

License

The majority of Panoptic Lifting is licensed under CC-BY-NC; however, portions of the project are available under separate license terms: TensoRF and spherical_camera are licensed under the MIT license, and Panoptic Quality is licensed under the Apache license.

Citation

If you wish to cite us, please use the following BibTeX entry:

@misc{2212.09802,
    Author = {Yawar Siddiqui and Lorenzo Porzi and Samuel Rota Bulò and Norman Müller and Matthias Nießner and Angela Dai and Peter Kontschieder},
    Title = {Panoptic Lifting for 3D Scene Understanding with Neural Fields},
    Year = {2022},
    Eprint = {arXiv:2212.09802},
}
