  • Language: Python
  • License: Apache License 2.0

Repository Details

Official Repo for Ground-aware Monocular 3D Object Detection for Autonomous Driving / YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection

Visual 3D Detection Package:

This repo aims to provide flexible and reproducible visual 3D detection on the KITTI dataset. We expect scripts to be run from the repository root, and we treat ./visualDet3D as a package that can be modified and tested directly rather than as an installed library. Several helper scripts are provided in the main directory.

We believe that visual tasks are interconnected, so we make this library extensible to more experiments. The package uses a registry to register datasets, models, processing functions and more, so new tasks and models can be added without interfering with existing ones.
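To illustrate the registry idea described above, here is a minimal sketch of a name-to-class registry. It is not the package's actual implementation: the names Registry, DETECTORS, register, build and ToyMonoDetector are hypothetical and chosen only for this example.

```python
from typing import Callable, Dict


class Registry:
    """Minimal name -> object registry (illustrative only, not the repo's class)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._entries: Dict[str, Callable] = {}

    def register(self, obj: Callable) -> Callable:
        """Use as a decorator; the key is the object's own name."""
        key = obj.__name__
        if key in self._entries:
            raise KeyError(f"{key!r} is already registered in {self.name}")
        self._entries[key] = obj
        return obj

    def build(self, name: str, **kwargs):
        """Look up a registered entry by name and call it with kwargs."""
        if name not in self._entries:
            raise KeyError(f"{name!r} is not registered in {self.name}")
        return self._entries[name](**kwargs)


# Hypothetical registry for detector models.
DETECTORS = Registry("detectors")


@DETECTORS.register
class ToyMonoDetector:
    """Placeholder model used only to demonstrate registration."""

    def __init__(self, num_classes: int = 3) -> None:
        self.num_classes = num_classes


# A config file can then refer to the model by its string name.
model = DETECTORS.build("ToyMonoDetector", num_classes=2)
```

With this pattern, adding a new model only requires defining and decorating it; existing entries and the code that builds them stay untouched.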

Related Paper:

This repo contains the official implementation of the 2021 RAL & ICRA paper Ground-aware Monocular 3D Object Detection for Autonomous Driving (see the arXiv page). Pretrained models can be found on the release page.

@ARTICLE{9327478,
  author={Y. {Liu} and Y. {Yuan} and M. {Liu}},
  journal={IEEE Robotics and Automation Letters}, 
  title={Ground-aware Monocular 3D Object Detection for Autonomous Driving}, 
  year={2021},
  doi={10.1109/LRA.2021.3052442}}

It is also the official implementation of the 2021 ICRA paper YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection. Pretrained models can be found on the release page.

@inproceedings{liu2021yolostereo3d,
  title={YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection},
  author={Yuxuan Liu and Lujia Wang and Ming Liu},
  booktitle={2021 International Conference on Robotics and Automation (ICRA)},
  year={2021},
  organization={IEEE}
}

We further incorporate an unofficial re-implementation of Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training (KM3D) as a reference on how to integrate with other frameworks. (Note that the code comes from the original official repo, and we DO NOT guarantee a complete re-implementation.)

Update (2021.07.02): We provide an unofficial re-implementation of Objects are Different: Flexible Monocular 3D Object Detection (MonoFlex) with little additional code, based on the KM3D structure. Much of the core code comes from the original official repo. We did not implement the edge merge operation or the corner loss, but we manage to maintain most of the performance using the proposed depth fusion methods (validation AP reaches 15%).

Update (2021.12.11): We provide an unofficial re-implementation of Digging Into Output Representation For Monocular 3D Object Detection (Digging_M3D), which introduces a simple but important numerical trick that significantly improves KITTI mAP scores and changes the KITTI leaderboard substantially. Details can be found in the paper. At the time this code was open-sourced, the paper had not been officially published; we will keep up with updates to the paper.

Key Features

  • SOTA Performance: State-of-the-art results on visual 3D detection.
  • Modular Design: Modular design for datasets, networks and running pipelines.
  • Support Various Tasks: Compatible with the training and testing of mono/stereo 3D detection and depth prediction.
  • Distributed & Single GPU: Supports training with a single GPU or multiple GPUs.
  • Installation-Free Setup: The setup process only builds operations and does not install anything, which keeps the environment clean.
  • Global Path-based IMDB: Data does not need to be placed inside the repo folder, which is convenient for managing data and code separately.

We provide start-up solutions for Mono3D, Stereo3D, Depth Predictions and more (until further publication). We also provide a comprehensive cookbook on making visualDet3D work with other open-source repos to speed up development.

Reference: this repo borrows code and ideas from retinanet, mmdetection, M3D-RPN, DORN, EdgeNets, det3

Setup

Environment setup.

pip3 install -r requirement.txt

or manually check dependencies.

# Build ops (deform convs and iou3d). The operations are not installed into the system environment.
./make.sh

Start Training

Please check the corresponding task: Mono3D, Stereo3D, Depth Predictions. More demos will be available through contributions and further paper submissions.

Config and Path setup.

Please modify the paths and other parameters in config/*.py. The config/*_example files are templates.

Notice: the *_example files are NOT used by the code, and *.py files under /config are ignored by .gitignore.

The content of the selected config file will be recorded in TensorBoard at the beginning of training.

Important paths to modify in the config:

  1. cfg.path.data_path: Path to the KITTI training data. We expect calib, image_2, image_3 and label_2 as subfolders (directly unzipping the downloaded zips is fine).
  2. cfg.path.test_path: Path to the KITTI testing data. We expect calib and image_2 as subfolders.
  3. cfg.path.visualDet3D_path: Path to the "visualDet3D" directory of the current repo.
  4. cfg.path.project_path: Path to the workdirs of the projects (will contain temp_outputs, log, checkpoints).
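As a rough sketch of the four paths listed above, the snippet below builds a stand-in config object and checks that the expected KITTI subfolders exist. It is an illustration under stated assumptions: the real templates in config/ may construct cfg differently, SimpleNamespace is used here only so the example is self-contained, the directory values are placeholders, and missing_subdirs and check_paths are hypothetical helpers that are not part of the repo.

```python
import os
from types import SimpleNamespace
from typing import Dict, Iterable, List

# Subfolders the README expects under each KITTI split.
TRAIN_SUBDIRS = ("calib", "image_2", "image_3", "label_2")
TEST_SUBDIRS = ("calib", "image_2")


def missing_subdirs(root: str, required: Iterable[str]) -> List[str]:
    """Return the required subfolder names that do not exist under root."""
    return [name for name in required
            if not os.path.isdir(os.path.join(root, name))]


def check_paths(cfg: SimpleNamespace) -> Dict[str, List[str]]:
    """Map each dataset path field to the subfolders it is missing."""
    return {
        "data_path": missing_subdirs(cfg.path.data_path, TRAIN_SUBDIRS),
        "test_path": missing_subdirs(cfg.path.test_path, TEST_SUBDIRS),
    }


# Stand-in for the cfg object; all values below are placeholder paths.
cfg = SimpleNamespace(path=SimpleNamespace(
    data_path="/data/kitti_obj/training",
    test_path="/data/kitti_obj/testing",
    visualDet3D_path="/path/to/repo/visualDet3D",
    project_path="/path/to/repo/workdirs",
))

for field, missing in check_paths(cfg).items():
    if missing:
        print(f"cfg.path.{field} is missing subfolders: {', '.join(missing)}")
```

Running a check like this before training can catch a mistyped dataset path early, before any data loading starts.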

Please check the comments in the templates and in the code to make full use of the repo.

Further Info and Bug Issues

  1. Open an issue on the repo if you run into trouble, find a bug, or have suggestions.
  2. Email to [email protected]
