An extension of Open3D to address 3D Machine Learning tasks

Open3D-ML

Installation | Get started | Structure | Tasks & Algorithms | Model Zoo | Datasets | How-tos | Contribute

Open3D-ML is an extension of Open3D for 3D machine learning tasks. It builds on top of the Open3D core library and extends it with machine learning tools for 3D data processing. This repo focuses on applications such as semantic point cloud segmentation and provides pretrained models that can be applied to common tasks as well as pipelines for training.

Open3D-ML works with TensorFlow and PyTorch to integrate easily into existing projects and also provides general functionality independent of ML frameworks such as data visualization.

Installation

Users

Open3D-ML is integrated in the Open3D v0.11+ Python distribution and is compatible with the following versions of ML frameworks:

  • PyTorch 1.8.2
  • TensorFlow 2.5.2
  • CUDA 10.1, 11.* (On GNU/Linux x86_64, optional)

You can install Open3D with

# make sure you have the latest pip version
pip install --upgrade pip
# install open3d
pip install open3d

To install a compatible version of PyTorch or TensorFlow you can use the respective requirements files:

# To install a compatible version of TensorFlow
pip install -r requirements-tensorflow.txt
# To install a compatible version of PyTorch
pip install -r requirements-torch.txt
# To install a compatible version of PyTorch with CUDA on Linux
pip install -r requirements-torch-cuda.txt

To test the installation use

# with PyTorch
$ python -c "import open3d.ml.torch as ml3d"
# or with TensorFlow
$ python -c "import open3d.ml.tf as ml3d"

If you need to use different versions of the ML frameworks or CUDA, we recommend building Open3D from source.
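Before running the import checks above, it can help to see which of the relevant packages are installed at all. The following is a minimal sketch using only the standard library. The helper name `installed_packages` is hypothetical and not part of Open3D-ML, and it only reports whether a package can be found, not whether its version is compatible.

```python
import importlib.util


def installed_packages(names=("open3d", "torch", "tensorflow")):
    """Return {package name: True/False} without importing the packages."""
    found = {}
    for name in names:
        try:
            # For a top-level name, find_spec only locates the package.
            found[name] = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Treat a broken or half-initialised package as missing.
            found[name] = False
    return found


if __name__ == "__main__":
    for name, ok in installed_packages().items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

A package reported as found can still fail the `python -c "import ..."` checks, for example when the installed framework version does not match the Open3D build.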

Getting started

Reading a dataset

The dataset namespace contains classes for reading common datasets. Here we read the SemanticKITTI dataset and visualize it.

import open3d.ml.torch as ml3d  # or open3d.ml.tf as ml3d

# construct a dataset by specifying dataset_path
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/path/to/SemanticKITTI/')

# get the 'all' split that combines training, validation and test set
all_split = dataset.get_split('all')

# print the attributes of the first datum
print(all_split.get_attr(0))

# print the shape of the first point cloud
print(all_split.get_data(0)['point'].shape)

# show the first 100 frames using the visualizer
vis = ml3d.vis.Visualizer()
vis.visualize_dataset(dataset, 'all', indices=range(100))

Loading a config file

Configs of models, datasets, and pipelines are stored in ml3d/configs. Users can also construct their own yaml files to keep record of their customized configurations. Here is an example of reading a config file and constructing modules from it.

import open3d.ml as _ml3d
import open3d.ml.torch as ml3d # or open3d.ml.tf as ml3d

framework = "torch" # or tf
cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

# fetch the classes by the name
Pipeline = _ml3d.utils.get_module("pipeline", cfg.pipeline.name, framework)
Model = _ml3d.utils.get_module("model", cfg.model.name, framework)
Dataset = _ml3d.utils.get_module("dataset", cfg.dataset.name)

# use the arguments in the config file to construct the instances
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = Dataset(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
model = Model(**cfg.model)
pipeline = Pipeline(model, dataset, **cfg.pipeline)
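The `pop` call in the example above matters because the dataset path is passed positionally while the rest of the section is expanded as keyword arguments. Leaving the key in place would supply `dataset_path` twice. A minimal sketch of the same pattern, using a plain dict and a hypothetical `ToyDataset` class (neither is part of Open3D-ML):

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset class; not part of Open3D-ML."""

    def __init__(self, dataset_path, use_cache=False, **kwargs):
        self.dataset_path = dataset_path
        self.use_cache = use_cache
        self.extra = kwargs  # any remaining config entries


# Stand-in for the `dataset` section of a loaded config.
dataset_cfg = {"name": "ToyDataset", "use_cache": True}
dataset_cfg["dataset_path"] = "/path/to/your/dataset"

# pop() removes the key, so it is not passed a second time via **dataset_cfg.
path = dataset_cfg.pop("dataset_path", None)
dataset = ToyDataset(path, **dataset_cfg)

print(dataset.dataset_path, dataset.use_cache, dataset.extra)
```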

Semantic Segmentation

Running a pretrained model for semantic segmentation

Building on the previous example we can instantiate a pipeline with a pretrained model for semantic segmentation and run it on a point cloud of our dataset. See the model zoo for obtaining the weights of the pretrained model.

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

cfg_file = "ml3d/configs/randlanet_semantickitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.RandLANet(**cfg.model)
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = ml3d.datasets.SemanticKITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.SemanticSegmentation(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the weights.
ckpt_folder = "./logs/"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "randlanet_semantickitti_202201071330utc.pth"
randlanet_url = "https://storage.googleapis.com/open3d-releases/model-zoo/randlanet_semantickitti_202201071330utc.pth"
if not os.path.exists(ckpt_path):
    cmd = "wget {} -O {}".format(randlanet_url, ckpt_path)
    os.system(cmd)

# load the parameters.
pipeline.load_ckpt(ckpt_path=ckpt_path)

test_split = dataset.get_split("test")
data = test_split.get_data(0)

# run inference on a single example.
# returns dict with 'predict_labels' and 'predict_scores'.
result = pipeline.run_inference(data)

# evaluate performance on the test set; this will write logs to './logs'.
pipeline.run_test()
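The dict returned by `run_inference` can be summarized in a few lines. The sketch below counts how many points were assigned to each class. It assumes that `predict_labels` is a flat sequence of integer class ids (a list or a 1-D array), and the helper name `label_histogram` is hypothetical.

```python
from collections import Counter


def label_histogram(result):
    """Count how many points were assigned to each predicted label."""
    # int() lets NumPy integer scalars and plain ints share the same key.
    return Counter(int(label) for label in result["predict_labels"])


# Stand-in result dict; a real one would come from pipeline.run_inference(data).
example = {"predict_labels": [0, 1, 1, 2, 1]}
print(label_histogram(example).most_common())
```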

Users can also use predefined scripts to load pretrained weights and run testing.

Training a model for semantic segmentation

As with inference, pipelines provide an interface for training a model on a dataset.

import open3d.ml.torch as ml3d  # or open3d.ml.tf as ml3d

# use a cache for storing the results of the preprocessing (default path is './logs/cache')
dataset = ml3d.datasets.SemanticKITTI(dataset_path='/path/to/SemanticKITTI/', use_cache=True)

# create the model with random initialization.
model = ml3d.models.RandLANet()

pipeline = ml3d.pipelines.SemanticSegmentation(model=model, dataset=dataset, max_epoch=100)

# prints training progress in the console.
pipeline.run_train()

For more examples see the examples/ and scripts/ directories. You can also enable saving training summaries in the config file and visualize ground truth and results with TensorBoard. See this tutorial for details.

3D Object Detection

Running a pretrained model for 3D object detection

The 3D object detection model is used in much the same way as a semantic segmentation model. We can instantiate a pipeline with a pretrained model for object detection and run it on a point cloud of our dataset. See the model zoo for obtaining the weights of the pretrained model.

import os
import open3d.ml as _ml3d
import open3d.ml.torch as ml3d

cfg_file = "ml3d/configs/pointpillars_kitti.yml"
cfg = _ml3d.utils.Config.load_from_file(cfg_file)

model = ml3d.models.PointPillars(**cfg.model)
cfg.dataset['dataset_path'] = "/path/to/your/dataset"
dataset = ml3d.datasets.KITTI(cfg.dataset.pop('dataset_path', None), **cfg.dataset)
pipeline = ml3d.pipelines.ObjectDetection(model, dataset=dataset, device="gpu", **cfg.pipeline)

# download the weights.
ckpt_folder = "./logs/"
os.makedirs(ckpt_folder, exist_ok=True)
ckpt_path = ckpt_folder + "pointpillars_kitti_202012221652utc.pth"
pointpillar_url = "https://storage.googleapis.com/open3d-releases/model-zoo/pointpillars_kitti_202012221652utc.pth"
if not os.path.exists(ckpt_path):
    cmd = "wget {} -O {}".format(pointpillar_url, ckpt_path)
    os.system(cmd)

# load the parameters.
pipeline.load_ckpt(ckpt_path=ckpt_path)

test_split = dataset.get_split("test")
data = test_split.get_data(0)

# run inference on a single example.
# returns dict with 'predict_labels' and 'predict_scores'.
result = pipeline.run_inference(data)

# evaluate performance on the test set; this will write logs to './logs'.
pipeline.run_test()

Users can also use predefined scripts to load pretrained weights and run testing.

Training a model for 3D object detection

As with inference, pipelines provide an interface for training a model on a dataset.

import open3d.ml.torch as ml3d  # or open3d.ml.tf as ml3d

# use a cache for storing the results of the preprocessing (default path is './logs/cache')
dataset = ml3d.datasets.KITTI(dataset_path='/path/to/KITTI/', use_cache=True)

# create the model with random initialization.
model = ml3d.models.PointPillars()

pipeline = ml3d.pipelines.ObjectDetection(model=model, dataset=dataset, max_epoch=100)

# prints training progress in the console.
pipeline.run_train()

Below is an example of visualization using KITTI. The example shows the use of bounding boxes for the KITTI dataset.

For more examples see the examples/ and scripts/ directories. You can also enable saving training summaries in the config file and visualize ground truth and results with TensorBoard. See this tutorial for details.

Using predefined scripts

scripts/run_pipeline.py provides an easy interface for training and evaluating a model on a dataset. It saves the trouble of defining a specific model and passing the exact configuration.

python scripts/run_pipeline.py {tf/torch} -c <path-to-config> --pipeline {SemanticSegmentation/ObjectDetection} --<extra args>

You can use the script for both semantic segmentation and object detection. You must specify either SemanticSegmentation or ObjectDetection in the --pipeline parameter. Note that extra arguments take priority over the same parameters in the configuration file, so instead of changing a parameter in the config file, you can pass it as a command-line argument when launching the script.

For example:

# Launch training for RandLANet on SemanticKITTI with torch.
python scripts/run_pipeline.py torch -c ml3d/configs/randlanet_semantickitti.yml --dataset.dataset_path <path-to-dataset> --pipeline SemanticSegmentation --dataset.use_cache True

# Launch testing for PointPillars on KITTI with torch.
python scripts/run_pipeline.py torch -c ml3d/configs/pointpillars_kitti.yml --split test --dataset.dataset_path <path-to-dataset> --pipeline ObjectDetection --dataset.use_cache True

For further help, run python scripts/run_pipeline.py --help.
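The priority of dotted command-line arguments such as `--dataset.use_cache True` over the config file can be pictured as a merge into a nested dict. The sketch below illustrates that idea only. It is not how scripts/run_pipeline.py is implemented, and the helper name `apply_overrides` is hypothetical.

```python
import ast


def apply_overrides(cfg, args):
    """Merge ['--a.b', 'value', ...] pairs into the nested dict `cfg` in place."""
    for flag, raw in zip(args[0::2], args[1::2]):
        keys = flag.lstrip("-").split(".")
        try:
            value = ast.literal_eval(raw)  # 'True' -> True, '100' -> 100
        except (ValueError, SyntaxError):
            value = raw  # keep plain strings such as paths
        node = cfg
        for key in keys[:-1]:
            node = node.setdefault(key, {})  # descend, creating sections as needed
        node[keys[-1]] = value  # the command-line value wins
    return cfg


cfg = {"dataset": {"use_cache": False}, "pipeline": {"max_epoch": 100}}
apply_overrides(cfg, ["--dataset.use_cache", "True",
                      "--dataset.dataset_path", "/data/kitti"])
print(cfg)
```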

Repository structure

The core part of Open3D-ML lives in the ml3d subfolder, which is integrated into Open3D in the ml namespace. In addition to the core part, the directories examples and scripts provide supporting scripts for getting started with setting up a training pipeline or running a network on a dataset.

├─ docs                   # Markdown and rst files for documentation
├─ examples               # Place for example scripts and notebooks
├─ ml3d                   # Package root dir that is integrated in open3d
     ├─ configs           # Model configuration files
     ├─ datasets          # Generic dataset code; will be integrated as open3d.ml.{tf,torch}.datasets
     ├─ metrics           # Metrics available for evaluating ML models
     ├─ utils             # Framework independent utilities; available as open3d.ml.{tf,torch}.utils
     ├─ vis               # ML specific visualization functions
     ├─ tf                # Directory for TensorFlow specific code; same structure as ml3d/torch.
     │                    # This will be available as open3d.ml.tf
     ├─ torch             # Directory for PyTorch specific code; available as open3d.ml.torch
          ├─ dataloaders  # Framework specific dataset code, e.g. wrappers that can make use of the
          │               # generic dataset code.
          ├─ models       # Code for models
          ├─ modules      # Smaller modules, e.g., metrics and losses
          ├─ pipelines    # Pipelines for tasks like semantic segmentation
          ├─ utils        # Utilities for <>
├─ scripts                # Demo scripts for training and dataset download scripts

Tasks and Algorithms

Semantic Segmentation

For the task of semantic segmentation, we measure the performance of different methods using the mean intersection-over-union (mIoU) over all classes. The table shows the available models and datasets for the segmentation task and the respective scores. Each score links to the respective weight file.

Model / Dataset          | SemanticKITTI | Toronto 3D | S3DIS | Semantic3D | Paris-Lille3D | ScanNet
RandLA-Net (tf)          | 53.7          | 73.7       | 70.9  | 76.0       | 70.0*         | -
RandLA-Net (torch)       | 52.8          | 74.0       | 70.9  | 76.0       | 70.0*         | -
KPConv (tf)              | 58.7          | 65.6       | 65.0  | -          | 76.7          | -
KPConv (torch)           | 58.0          | 65.6       | 60.0  | -          | 76.7          | -
SparseConvUnet (torch)   | -             | -          | -     | -          | -             | 68
SparseConvUnet (tf)      | -             | -          | -     | -          | -             | 68.2
PointTransformer (torch) | -             | -          | 69.2  | -          | -             | -
PointTransformer (tf)    | -             | -          | 69.2  | -          | -             | -

(*) Using weights from original author.
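For reference, mIoU can be computed from per-class true positives, false positives and false negatives. The following is a minimal pure-Python sketch, not the evaluation code used for the table. It assumes integer class ids in `range(num_classes)` and skips classes that appear in neither the ground truth nor the prediction, which is one common convention.

```python
def mean_iou(gt_labels, pred_labels, num_classes):
    """Mean intersection-over-union over all classes that occur at least once."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for gt, pred in zip(gt_labels, pred_labels):
        if gt == pred:
            tp[gt] += 1
        else:
            fp[pred] += 1  # predicted class gains a false positive
            fn[gt] += 1    # true class gains a false negative
    ious = []
    for c in range(num_classes):
        union = tp[c] + fp[c] + fn[c]
        if union > 0:  # skip classes absent from both label sets
            ious.append(tp[c] / union)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU = 1/2, class 1: IoU = 2/3.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

The scores in the table are percentages, so the value returned here would be multiplied by 100 for comparison.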

Object Detection

For the task of object detection, we measure the performance of different methods using the mean average precision (mAP) for bird's eye view (BEV) and 3D. The table shows the available models and datasets for the object detection task and the respective scores. Each score links to the respective weight file. The models were evaluated on the validation subset, according to KITTI's validation criteria. For KITTI, the models were trained on three classes (car, pedestrian and cyclist), and the reported values are the mean over the mAP of all classes for all difficulty levels. For the Waymo dataset, the models were trained on three classes (pedestrian, vehicle and cyclist).

Model / Dataset      | KITTI (BEV / 3D) @ 0.70 | Waymo (BEV / 3D) @ 0.50
PointPillars (tf)    | 61.6 / 55.2             | -
PointPillars (torch) | 61.2 / 52.8             | avg: 61.01 / 48.30, best: 61.47 / 57.55 [1]
PointRCNN (tf)       | 78.2 / 65.9             | -
PointRCNN (torch)    | 78.2 / 65.9             | -
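Reading the "@ 0.70" and "@ 0.50" in the table header as IoU thresholds, a detection counts as a match when its overlap with a ground-truth box reaches the threshold. The sketch below shows that test for the BEV case under a strong simplification: boxes are axis-aligned rectangles `(x_min, y_min, x_max, y_max)` in the ground plane, whereas real benchmark boxes are rotated. The function names are hypothetical.

```python
def bev_iou_axis_aligned(box_a, box_b):
    """IoU of two (x_min, y_min, x_max, y_max) rectangles in the ground plane."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def is_match(box_a, box_b, threshold=0.70):
    """True when the overlap reaches the IoU threshold."""
    return bev_iou_axis_aligned(box_a, box_b) >= threshold


# Two 2x2 boxes shifted by 1 along x: intersection 2, union 6.
print(bev_iou_axis_aligned((0, 0, 2, 2), (1, 0, 3, 2)))
print(is_match((0, 0, 2, 2), (1, 0, 3, 2)))
```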

Training PointRCNN

To use ground truth sampling data augmentation for training, we can generate the ground truth database as follows:

python scripts/collect_bboxes.py --dataset_path <path_to_data_root>

This will generate a database consisting of objects from the train split. This augmentation is recommended for datasets like KITTI, where objects are sparse.

The two stages of PointRCNN are trained separately. To train the proposal generation stage of PointRCNN with PyTorch, run the following command:

# Train RPN for 100 epochs.
python scripts/run_pipeline.py torch -c ml3d/configs/pointrcnn_kitti.yml --dataset.dataset_path <path-to-dataset> --mode RPN --epochs 100

Once the RPN is well trained, we can train the RCNN network with frozen RPN weights.

# Train RCNN for 100 epochs.
python scripts/run_pipeline.py torch -c ml3d/configs/pointrcnn_kitti.yml --dataset.dataset_path <path-to-dataset> --mode RCNN --model.ckpt_path <path_to_checkpoint> --epochs 100

Model Zoo

For a full list of all weight files see model_weights.txt and the MD5 checksum file model_weights.md5.
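A downloaded weight file can be checked against its published MD5 value with the standard library. The sketch below assumes you have already looked up the expected hex digest for your file in model_weights.md5. The layout of that file is not described here, so no parsing is attempted, and the helper names are hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops at the first empty read (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a file's digest with an expected value, ignoring case and whitespace."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `matches_checksum("./logs/randlanet_semantickitti_202201071330utc.pth", expected)` would return `False` for a truncated download.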

Datasets

The following is a list of datasets for which we provide dataset reader classes.

For downloading these datasets visit the respective webpages and have a look at the scripts in scripts/download_datasets.

How-tos

Contribute

There are many ways to contribute to this project. You can:

  • Implement a new model
  • Add code for reading a new dataset
  • Share parameters and weights for an existing model
  • Report problems and bugs

Please make your pull requests to the dev branch. Open3D is a community effort. We welcome and celebrate contributions from the community!

If you want to share weights for a model you trained please attach or link the weights file in the pull request. For bugs and problems, open an issue. Please also check out our communication channels to get in contact with the community.

Communication channels

  • Forum: discussion on the usage of Open3D.
  • Discord Chat: online chats, discussions, and collaboration with other users and developers.

Citation

Please cite our work (pdf) if you use Open3D.

@article{Zhou2018,
    author    = {Qian-Yi Zhou and Jaesik Park and Vladlen Koltun},
    title     = {{Open3D}: {A} Modern Library for {3D} Data Processing},
    journal   = {arXiv:1801.09847},
    year      = {2018},
}

Footnotes

  1. The avg. metrics are the average of three sets of training runs with 4, 8, 16 and 32 GPUs. Training was halted after 30 epochs. Model checkpoint is available for the best training run.
