Please note that this is currently a pre-release version. Several refactors are planned for the near future, including removing the bundled nerfstudio/ directory and adapting to PyTorch 2.0 & nerfstudio 0.3.x. After that, this version will no longer be maintained.
1. Installation: Set up the environment
Prerequisites
You must have an NVIDIA video card with CUDA installed on the system. This library has been tested with CUDA 11.3 and 11.7. You can find more information about installing CUDA here.
Create environment
Nerfstudio requires Python >= 3.7. We recommend using conda to manage dependencies. Make sure to install Conda before proceeding.
conda create --name nerfstudio -y python=3.8
conda activate nerfstudio
python -m pip install --upgrade pip
pip install --upgrade pip setuptools
Installation
This section will walk you through the installation process. Our system depends on the nerfstudio project.
- Install tiny-cuda-nn first.
- Install MARS locally with:
git clone git@github.com:OPEN-AIR-SUN/mars.git
cd mars/nerfstudio
pip install -e .[dev] # install nerfstudio and its dependencies
cd ..
pip install -e .
ns-install-cli # optional, only for tab completion
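Before moving on to training, you can optionally sanity-check the environment. The short sketch below is written for this README only (it is not a script in the repository) and assumes the environment created above, with PyTorch and the tiny-cuda-nn Python bindings installed:

# Optional environment sanity check (illustrative, not part of the repo).
import torch

# Confirm that PyTorch can see the GPU and report the CUDA version it was built against.
assert torch.cuda.is_available(), "CUDA is not visible to PyTorch"
print("CUDA device:", torch.cuda.get_device_name(0))
print("PyTorch built against CUDA:", torch.version.cuda)

# Confirm that the tiny-cuda-nn Python bindings import cleanly.
try:
    import tinycudann as tcnn  # installed together with tiny-cuda-nn
    print("tiny-cuda-nn bindings: OK")
except ImportError as err:
    print("tiny-cuda-nn bindings not found:", err)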
2. Training from Scratch
The following will train a MARS model.
Our repository provides dataparsers for the KITTI and vKITTI2 datasets. For your own data, you can either write a custom dataparser or convert your dataset to the format of the provided datasets.
From Datasets
Data Preparation
The data used in our experiments should contain both camera pose parameters and object tracklets. The camera parameters include the intrinsics and extrinsics. The object tracklets include the bounding box poses, types, ids, etc. For more information, you can refer to the KITTI-MOT or vKITTI2 datasets below.
KITTI
The KITTI-MOT dataset should look like this:
.(KITTI_MOT_ROOT)
├── panoptic_maps                  # (Optional) panoptic segmentation from the KITTI-STEP dataset
│   ├── colors
│   │   └── sequence_id.txt
│   ├── train
│   │   └── sequence_id
│   │       └── frame_id.png
└── training
    ├── calib
    │   └── sequence_id.txt
    ├── completion_02              # (Optional) depth completion
    │   └── sequence_id
    │       └── frame_id.png
    ├── completion_03
    │   └── sequence_id
    │       └── frame_id.png
    ├── image_02
    │   └── sequence_id
    │       └── frame_id.png
    ├── image_03
    │   └── sequence_id
    │       └── frame_id.png
    ├── label_02
    │   └── sequence_id.txt
    └── oxts
        └── sequence_id.txt
We download the KITTI-STEP annotations and generate the panoptic segmentation maps for the KITTI-MOT dataset. You can download the demo panoptic maps here and put them in the KITTI-MOT directory, or visit the official website of KITTI-STEP for more information.
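If training fails to locate your data, it is usually a layout issue. The sketch below is a hypothetical helper written for this README (not part of the codebase) that walks the KITTI-MOT layout described above for one sequence and reports anything missing:

# Hypothetical helper (illustrative only): verify that a KITTI-MOT sequence
# follows the directory layout described above before launching training.
from pathlib import Path

def check_kitti_mot_sequence(root: str, sequence_id: str = "0006") -> None:
    root = Path(root)
    required = [
        root / "training" / "calib" / f"{sequence_id}.txt",
        root / "training" / "image_02" / sequence_id,
        root / "training" / "image_03" / sequence_id,
        root / "training" / "label_02" / f"{sequence_id}.txt",
        root / "training" / "oxts" / f"{sequence_id}.txt",
    ]
    optional = [
        root / "training" / "completion_02" / sequence_id,  # depth completion
        root / "training" / "completion_03" / sequence_id,
        root / "panoptic_maps" / "train" / sequence_id,     # KITTI-STEP panoptic maps
    ]
    for path in required:
        if not path.exists():
            print(f"[missing, required] {path}")
    for path in optional:
        if not path.exists():
            print(f"[missing, optional] {path}")
    print("check finished")

check_kitti_mot_sequence("/data/kitti-MOT", "0006")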
To train a reconstruction model, you can use the following command:
ns-train nsg-kitti-car-depth-recon --data /data/kitti-MOT/training/image_02/0006
or if you want to use the Python script (please refer to the launch.json file in the .vscode directory):
python nerfstudio/nerfstudio/scripts/train.py nsg-kitti-car-depth-recon --data /data/kitti-MOT/training/image_02/0006
vKITTI2
The vKITTI2 dataset should look like this:
.(vKITTI2_ROOT)
└── sequence_id
    └── scene_name
        ├── bbox.txt
        ├── colors.txt
        ├── extrinsic.txt
        ├── info.txt
        ├── intrinsic.txt
        ├── pose.txt
        └── frames
            ├── depth
            │   ├── Camera_0
            │   │   └── frame_id.png
            │   └── Camera_1
            │       └── frame_id.png
            ├── instanceSegmentation
            │   ├── Camera_0
            │   │   └── frame_id.png
            │   └── Camera_1
            │       └── frame_id.png
            ├── classSegmentation
            │   ├── Camera_0
            │   │   └── frame_id.png
            │   └── Camera_1
            │       └── frame_id.png
            └── rgb
                ├── Camera_0
                │   └── frame_id.png
                └── Camera_1
                    └── frame_id.png
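To double-check the camera files in your copy of vKITTI2, a small sketch like the one below (written for this README, not part of the repository) can load the per-frame camera parameters. It assumes the standard vKITTI2 text convention of a one-line header followed by one row per (frame, camera); verify the column layout against your own files:

# Minimal sketch for inspecting vKITTI2 camera files (column layout assumed
# from the standard vKITTI2 text format; double-check against your data).
import numpy as np

scene = "/data/vkitti/Scene06/clone"

# intrinsic.txt rows (assumed): frame, cameraID, fx, fy, cx, cy
intrinsics = np.loadtxt(f"{scene}/intrinsic.txt", skiprows=1)
# extrinsic.txt rows (assumed): frame, cameraID, then a flattened 4x4 world-to-camera matrix
extrinsics = np.loadtxt(f"{scene}/extrinsic.txt", skiprows=1)

print("intrinsic rows:", intrinsics.shape)   # expect (num_frames * num_cameras, 6)
print("extrinsic rows:", extrinsics.shape)   # expect (num_frames * num_cameras, 18)

# Example: world-to-camera matrix for frame 0, Camera_0.
frame0_cam0 = extrinsics[(extrinsics[:, 0] == 0) & (extrinsics[:, 1] == 0)][0]
w2c = frame0_cam0[2:].reshape(4, 4)
print("frame 0 / Camera_0 extrinsic:\n", w2c)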
To train a reconstruction model, you can use the following command:
ns-train nsg-vkitti-car-depth-recon --data /data/vkitti/Scene06/clone
or if you want to use the Python script:
python nerfstudio/nerfstudio/scripts/train.py nsg-vkitti-car-depth-recon --data /data/vkitti/Scene06/clone
Your Own Data
For your own data, you can refer to the data structures above and write your own dataparser, or convert your dataset to one of the formats above.
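If you take the custom-dataparser route, the rough skeleton below shows the general shape of a nerfstudio dataparser. It is a sketch only: the class and field names such as MyDrivingDataParser are placeholders, the poses and intrinsics are dummy values, and the exact DataparserOutputs fields can differ slightly between nerfstudio versions. The dataparsers shipped in this repository are the authoritative reference.

# Rough skeleton of a custom nerfstudio dataparser (illustrative only).
from dataclasses import dataclass, field
from pathlib import Path
from typing import Type

import torch

from nerfstudio.cameras.cameras import Cameras, CameraType
from nerfstudio.data.dataparsers.base_dataparser import (
    DataParser, DataParserConfig, DataparserOutputs,
)
from nerfstudio.data.scene_box import SceneBox


@dataclass
class MyDrivingDataParserConfig(DataParserConfig):
    """Placeholder config for your own driving data."""
    _target: Type = field(default_factory=lambda: MyDrivingDataParser)
    data: Path = Path("/data/my_sequence")


class MyDrivingDataParser(DataParser):
    config: MyDrivingDataParserConfig

    def _generate_dataparser_outputs(self, split: str = "train") -> DataparserOutputs:
        # Collect image paths and per-frame camera-to-world poses (N, 3, 4).
        image_filenames = sorted((self.config.data / "rgb").glob("*.png"))
        poses = torch.eye(4)[None, :3, :].repeat(len(image_filenames), 1, 1)  # replace with real poses

        cameras = Cameras(
            camera_to_worlds=poses,
            fx=720.0, fy=720.0, cx=640.0, cy=360.0,  # replace with real intrinsics
            height=720, width=1280,                  # replace with real image size
            camera_type=CameraType.PERSPECTIVE,
        )
        scene_box = SceneBox(aabb=torch.tensor([[-1.0, -1.0, -1.0], [1.0, 1.0, 1.0]]))
        # Object tracklets (bounding-box poses, types, ids) would additionally be
        # attached as metadata, following the dataparsers provided in this repo.
        return DataparserOutputs(
            image_filenames=image_filenames,
            cameras=cameras,
            scene_box=scene_box,
        )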
From Pre-Trained Model
Our model uses nerfstudio as the training framework, and we provide checkpoints for both the reconstruction and novel view synthesis tasks. Our pre-trained models are uploaded to Google Drive; refer to the table below to download them.
Dataset | Scene | First Frame | Last Frame | Setting | PSNR | SSIM | Download |
---|---|---|---|---|---|---|---|
KITTI-MOT | 0006 | 5 | 260 | Reconstruction | 29.06 | 0.885 | model |
KITTI-MOT | 0006 | 5 | 260 | Novel View Synthesis 75% | 24.23 | 0.845 | model |
KITTI-MOT | 0006 | 5 | 260 | Novel View Synthesis 50% | 24.00 | 0.801 | model |
KITTI-MOT | 0006 | 5 | 260 | Novel View Synthesis 25% | 23.23 | 0.756 | model |
Virtual KITTI 2 | Scene06 | 0 | 237 | Novel View Synthesis 75% | 29.79 | 0.917 | model |
Virtual KITTI 2 | Scene06 | 0 | 237 | Novel View Synthesis 50% | 29.63 | 0.916 | model |
Virtual KITTI 2 | Scene06 | 0 | 237 | Novel View Synthesis 25% | 27.01 | 0.887 | model |
You can use the following command to train a model from a pre-trained model:
ns-train nsg-kitti-car-depth-recon --data /data/kitti-MOT/training/image_02/0006 --load-dir outputs/experiment_name/method_name/timestamp/nerfstudio
Model Configs
Our modular framework supports combining different architectures for each node by modifying model configurations. Here's an example of using Nerfacto for background and our category-level object model:
model=SceneGraphModelConfig(
    background_model=NerfactoModelConfig(),
    object_model_template=CarNeRFModelConfig(_target=CarNeRF),
    object_representation="class-wise",
    object_ray_sample_strategy="remove-bg",
)
If you choose to use the category-level object model, please make sure that use_car_latents=True is set and that the latent codes exist. We provide latent codes for some sequences of the KITTI-MOT and vKITTI2 datasets here.
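As a sketch of how this fits into the model config above: use_car_latents comes from the instruction here, while car_latents_path is only a placeholder name for the location of the downloaded latent codes; check nsg/cicai_configs.py for the exact field names used by the provided methods.

model=SceneGraphModelConfig(
    background_model=NerfactoModelConfig(),
    object_model_template=CarNeRFModelConfig(_target=CarNeRF),
    object_representation="class-wise",
    object_ray_sample_strategy="remove-bg",
    use_car_latents=True,
    car_latents_path=Path("path/to/downloaded/latents"),  # placeholder field name; see nsg/cicai_configs.py
)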
For more information, please refer to the provided configurations in nsg/cicai_configs.py. We use wandb for logging by default; you can also specify other viewers (tensorboard and the nerfstudio viewer are supported) with the --vis option. Please refer to the nerfstudio documentation for details.
Render
If you want to render with our pre-trained model, visit here to download our checkpoints and config. To run the render script, make sure that your config matches the config.yml that you load in.
You can use the following command to render:
python scripts/cicai_render.py --load-config outputs/nvs75fullseq/nsg-vkitti-car-depth-nvs/2023-06-21_135412/config.yml --output-path renders/
Citation
You can find our paper here. If you use this library or find the repo useful for your research, please consider citing:
@article{wu2023mars,
  author  = {Wu, Zirui and Liu, Tianyu and Luo, Liyi and Zhong, Zhide and Chen, Jianteng and Xiao, Hongmin and Hou, Chao and Lou, Haozhe and Chen, Yuantao and Yang, Runyi and Huang, Yuxin and Ye, Xiaoyu and Yan, Zike and Shi, Yongliang and Liao, Yiyi and Zhao, Hao},
  title   = {MARS: An Instance-aware, Modular and Realistic Simulator for Autonomous Driving},
  journal = {CICAI},
  year    = {2023},
}
Acknowledgement
Part of our code is borrowed from Nerfstudio. This project is sponsored by the Tsinghua-Toyota Joint Research Fund (20223930097) and Baidu Inc. through the Apollo-AIR Joint Research Center.
Notice
This open-source version will be actively maintained and regularly updated. For more features, please contact us about a commercial version.