
ShAPO🎩: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization

License: CC BY-NC 4.0

This repository is the PyTorch implementation of our paper:

ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization
Muhammad Zubair Irshad, Sergey Zakharov, Rares Ambrus, Thomas Kollar, Zsolt Kira, Adrien Gaidon
European Conference on Computer Vision (ECCV), 2022

[Project Page] [arXiv] [PDF] [Video] [Poster]

Explore CenterSnap in Colab

Previous ICRA'22 work:

CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation
Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira
International Conference on Robotics and Automation (ICRA), 2022

[Project Page] [arXiv] [PDF] [Video] [Poster]

Citation

If you find this repository useful, please consider citing:

@inproceedings{irshad2022shapo,
  title={ShAPO: Implicit Representations for Multi-Object Shape, Appearance and Pose Optimization},
  author={Muhammad Zubair Irshad and Sergey Zakharov and Rares Ambrus and Thomas Kollar and Zsolt Kira and Adrien Gaidon},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2022},
  url={https://arxiv.org/abs/2207.13691},
}

@inproceedings{irshad2022centersnap,
  title={CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation},
  author={Muhammad Zubair Irshad and Thomas Kollar and Michael Laskey and Kevin Stone and Zsolt Kira},
  booktitle={IEEE International Conference on Robotics and Automation (ICRA)},
  year={2022},
  url={https://arxiv.org/abs/2203.01929},
}

Contents

  • 🤝 Google Colab
  • 💻 Environment
  • 📊 Dataset
  • ✨ Training and Inference
  • 📝 FAQ
  • Acknowledgments
  • Related Work
  • Licenses

๐Ÿค Google Colab

If you want to experiment with ShAPO, we have written a Colab. It's quite comprehensive and easy to set up. It goes through the following experiments / ShAPO properties:

  • Single-shot inference
    • Visualize peak and depth output
    • Decode shape with predicted textures
    • Project 3D pointclouds and 3D bounding boxes onto the 2D image (see the projection sketch after this list)
  • Shape, Appearance and Pose Optimization
    • Core optimization loop
    • Visualizing the optimized 3D output (i.e. textured asset creation)
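For reference, the pointcloud and bounding-box projection in the last item boils down to the standard pinhole camera model; a minimal numpy sketch (the intrinsics values below are placeholders, not the repo's calibration):

import numpy as np

def project_points(points_cam, K):
    # Nx3 camera-frame points -> Nx2 pixel coordinates via the pinhole model
    uv = (K @ points_cam.T).T      # homogeneous image coordinates, Nx3
    return uv[:, :2] / uv[:, 2:3]  # perspective divide by depth

# Illustrative intrinsics matrix (fx, fy, cx, cy values are placeholders)
K = np.array([[591.0,   0.0, 322.5],
              [  0.0, 590.2, 244.1],
              [  0.0,   0.0,   1.0]])
points = np.random.rand(100, 3) + np.array([0.0, 0.0, 1.0])  # points in front of the camera
pixels = project_points(points, K)
print(pixels.shape)  # (100, 2)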

💻 Environment

Create a Python 3.8 virtual environment and install the requirements:

cd $ShAPO_Repo
conda create -y --prefix ./env python=3.8
conda activate ./env/
./env/bin/python -m pip install --upgrade pip
./env/bin/python -m pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html

The code was built and tested with CUDA 10.2.
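To verify that the environment picked up a GPU-enabled PyTorch build, a quick check from inside the environment:

import torch

print(torch.__version__)          # installed PyTorch version
print(torch.version.cuda)         # CUDA toolkit the wheels were built against (expect 10.2 here)
print(torch.cuda.is_available())  # True if a compatible GPU driver is visible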

📊 Dataset

Download camera_train, camera_val, real_train, real_test, ground-truth annotations, camera_composed_depth, mesh models and eval_results provided by NOCS, as well as the NOCS preprocessed data.
Also download the sdf_rgb_pretrained_weights. Unzip and organize these files in $ShAPO_Repo/data as follows:

data
├── CAMERA
│   ├── train
│   └── val
├── Real
│   ├── train
│   └── test
├── camera_full_depths
│   ├── train
│   └── val
├── gts
│   ├── val
│   └── real_test
├── results
│   ├── camera
│   ├── mrcnn_results
│   ├── nocs_results
│   └── real
├── sdf_rgb_pretrained
│   ├── LatentCodes
│   ├── Reconstructions
│   ├── ModelParameters
│   ├── OptimizerParameters
│   └── rgb_net_weights
└── obj_models
    ├── train
    ├── val
    ├── real_train
    ├── real_test
    ├── camera_train.pkl
    ├── camera_val.pkl
    ├── real_train.pkl
    ├── real_test.pkl
    └── mug_meta.pkl
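Before moving on, it can save time to verify that everything unpacked where the loaders expect it; a small sketch (the checked subset of paths is ours, mirroring the tree above):

from pathlib import Path

# Sanity-check the dataset layout before training (a minimal sketch, not part of the repo)
data = Path("data")
expected = [
    "CAMERA/train", "CAMERA/val",
    "Real/train", "Real/test",
    "camera_full_depths/train", "camera_full_depths/val",
    "gts/val", "gts/real_test",
    "results/camera", "results/mrcnn_results", "results/nocs_results", "results/real",
    "sdf_rgb_pretrained/LatentCodes", "sdf_rgb_pretrained/rgb_net_weights",
    "obj_models/real_test",
]
missing = [p for p in expected if not (data / p).exists()]
print("layout OK" if not missing else f"missing: {missing}")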

Create image lists

./runner.sh prepare_data/generate_training_data.py --data_dir /home/ubuntu/shapo/data/nocs_data/

Now run the distributed script to collect the data locally; this takes a few hours. The data will be saved under data/NOCS_data.

Note: The script uses multiple GPUs and runs 8 workers per GPU on a 16GB GPU. Change the worker_per_gpu variable depending on your GPU memory.

python prepare_data/distributed_generate_data.py --data_dir /home/ubuntu/shapoplusplus/data/nocs_data --type camera_train

For --type, choose from 'camera_train', 'camera_val', 'real_train' and 'real_val'.
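To generate all four splits back to back, a small wrapper around the command above (sequential by choice; the data_dir is the same placeholder path as in the command):

import subprocess

DATA_DIR = "/home/ubuntu/shapoplusplus/data/nocs_data"  # placeholder; adjust to your setup
for split in ["camera_train", "camera_val", "real_train", "real_val"]:
    subprocess.run(
        ["python", "prepare_data/distributed_generate_data.py",
         "--data_dir", DATA_DIR, "--type", split],
        check=True,  # abort if a split fails rather than silently continuing
    )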

✨ Training and Inference

ShAPO is a two-stage approach. First, a single-shot network predicts 3D shape, pose and size codes along with segmentation masks in a per-pixel manner. Second, the shape, pose and size codes are jointly optimized at test time to fit a single-view RGB-D observation of a new instance.
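Schematically, stage one turns one RGB-D frame into per-object predictions that stage two then refines; a sketch of that data flow (all names below are illustrative, not the repo's API):

from dataclasses import dataclass
import numpy as np

@dataclass
class ObjectPrediction:
    # Per-object outputs of the single-shot stage (illustrative field names)
    mask: np.ndarray             # per-pixel segmentation mask
    shape_code: np.ndarray       # latent code decoded by the SDF MLP
    appearance_code: np.ndarray  # latent code for the texture network
    pose: np.ndarray             # 6D pose (rotation + translation)
    scale: float                 # object size

def single_shot_stage(rgb, depth, network):
    # Stage 1: one forward pass yields detections for every object in view;
    # stage 2 then optimizes each prediction's codes and pose against the observation.
    return network(rgb, depth)   # -> list[ObjectPrediction]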

  1. Train on NOCS Synthetic (requires 13GB GPU memory):
./runner.sh net_train.py @configs/net_config.txt

Note that runner.sh is equivalent to using python to run the script; additionally, it sets up the PYTHONPATH and the ShAPO environment path automatically. Also note that this part of the code is similar to CenterSnap: we predict implicit shapes as an SDF MLP instead of pointclouds, and additionally predict an appearance embedding and object masks in this stage.

  2. Finetune on NOCS Real Train (good results can be obtained after finetuning on the Real train set for only a few epochs, i.e. 1-5):
./runner.sh net_train.py @configs/net_config_real_resume.txt --checkpoint /path/to/best/checkpoint

  3. Inference on a NOCS Real Test subset

Download a small Real test subset from here, our pretrained shape and texture decoder checkpoints from here, and the ShAPO checkpoints pretrained on the real dataset from here. Unzip and organize these files in $ShAPO_Repo/data as follows:

test_data
├── Real
│   └── test
├── ckpts
└── sdf_rgb_pretrained
    ├── LatentCodes
    ├── Reconstructions
    ├── ModelParameters
    ├── OptimizerParameters
    └── rgb_net_weights

Now run the inference script to visualize the single-shot predictions as follows:

./runner.sh inference/inference_real.py @configs/net_config.txt --test_data_dir path_to_nocs_test_subset --checkpoint checkpoint_path_here

You should see the visualizations saved in results/ShAPO_real. Change --output_path in *config.txt to save them to a different folder.

  4. Optimization

This is the core optimization script: it updates the latent shape and appearance codes along with the 6D pose and size to better fit the unseen single-view RGB-D observation. For a quick run of the core optimization loop along with visualization, see this notebook here.

./runner.sh opt/optimize.py @configs/net_config.txt --data_dir /path/to/test_data_dir/ --checkpoint checkpoint_path_here
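Under the hood this step is gradient descent on the latent codes and pose against the observed RGB-D frame; a schematic PyTorch sketch (the decoder call and the single L1 loss term are illustrative simplifications, not the repo's implementation):

import torch

def optimize_codes(shape_code, appearance_code, pose, observation, decoder, steps=200):
    # Refine latent shape/appearance codes and the 6D pose for one unseen
    # RGB-D observation (all arguments are torch tensors / a callable decoder)
    params = [p.requires_grad_(True) for p in (shape_code, appearance_code, pose)]
    optimizer = torch.optim.Adam(params, lr=1e-3)
    for _ in range(steps):
        optimizer.zero_grad()
        rendered = decoder(shape_code, appearance_code, pose)      # render current estimate
        loss = torch.nn.functional.l1_loss(rendered, observation)  # compare to observation
        loss.backward()
        optimizer.step()
    return shape_code, appearance_code, pose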

📝 FAQ

Please see the FAQs from CenterSnap here.

Acknowledgments

  • This code is built upon the implementation from CenterSnap.

Related Work

Licenses

  • This repository is released under the CC BY-NC 4.0 license.
