YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud (ECCV 2018)

YOLO3D-YOLOv4-PyTorch

A PyTorch implementation, based on YOLOv4, of the paper YOLO3D: End-to-end real-time 3D Oriented Object Bounding Box Detection from LiDAR Point Cloud (ECCV 2018).


Demo

  • Inputs: Bird's-eye-view (BEV) maps encoded from the height, intensity and density of 3D LiDAR point clouds.
  • The input size: 608 x 608 x 3
  • Outputs: 7 degrees of freedom (7-DOF) of objects: (cx, cy, cz, l, w, h, θ)
    • cx, cy, cz: The center coordinates.
    • l, w, h: length, width, height of the bounding box.
    • θ: The heading angle in radians of the bounding box.
  • Objects: Cars, Pedestrians, Cyclists.
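The BEV encoding described above can be sketched in plain Python. This is only an illustration, not the repository's code: the function name `points_to_bev`, the metric ranges, and the density normalisation `min(1, log(N + 1) / log(64))` (a convention seen in Complex-YOLO-style projects) are all assumptions.

```python
import math

def points_to_bev(points, x_range=(0.0, 50.0), y_range=(-25.0, 25.0),
                  z_range=(-2.73, 1.27), size=608):
    """Encode (x, y, z, intensity) points into a size x size x 3 BEV grid.

    Channels per cell: [height of highest point, its intensity, point density].
    All ranges are hypothetical defaults, in metres.
    """
    bev = [[[0.0, 0.0, 0.0] for _ in range(size)] for _ in range(size)]
    counts = {}
    x_res = (x_range[1] - x_range[0]) / size
    y_res = (y_range[1] - y_range[0]) / size
    for x, y, z, intensity in points:
        inside = (x_range[0] <= x < x_range[1]
                  and y_range[0] <= y < y_range[1]
                  and z_range[0] <= z <= z_range[1])
        if not inside:
            continue  # drop points outside the region of interest
        row = min(size - 1, int((x - x_range[0]) / x_res))
        col = min(size - 1, int((y - y_range[0]) / y_res))
        height = (z - z_range[0]) / (z_range[1] - z_range[0])  # in [0, 1]
        counts[(row, col)] = counts.get((row, col), 0) + 1
        cell = bev[row][col]
        if height >= cell[0]:
            # Keep the height and intensity of the highest point in the cell.
            cell[0], cell[1] = height, intensity
    for (row, col), n in counts.items():
        # Log-normalised point count, capped at 1.0 (assumed convention).
        bev[row][col][2] = min(1.0, math.log(n + 1) / math.log(64))
    return bev

# Tiny 8 x 8 grid for a quick look; the model itself expects 608 x 608 x 3.
demo_bev = points_to_bev([(1.0, 0.0, 0.0, 0.5), (1.0, 0.0, 1.0, 0.9)], size=8)
```

A NumPy version would be far faster on full KITTI scans; the nested-list form is used here only to keep the sketch dependency-free.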

Features

2. Getting Started

2.1. Requirements

pip install -U -r requirements.txt

For mayavi and shapely libraries, please refer to the installation instructions from their official websites.

2.2. Data Preparation

Download the 3D KITTI detection dataset from here.

The downloaded data includes:

  • Velodyne point clouds (29 GB): input data to the YOLO3D model
  • Training labels of object data set (5 MB): input label to the YOLO3D model
  • Camera calibration matrices of object data set (16 MB): for visualization of predictions
  • Left color images of object data set (12 GB): for visualization of predictions

Please make sure the source code and dataset directories are organized as shown in the Folder structure section below.

2.3. YOLOv4 architecture

This work has been based on the paper YOLOv4: Optimal Speed and Accuracy of Object Detection.

Bag of Freebies (BoF) and Bag of Specials (BoS) techniques used in this implementation:

  • BoF, backbone:
    • [x] Dropblock
    • [x] Random rescale, rotation (global)
    • [x] Mosaic/Cutout augmentation
  • BoF, detector:
    • [x] Cross mini-Batch Normalization
    • [x] Dropblock
    • [x] Random training shapes
  • BoS, backbone:
    • [x] Mish activation
    • [x] Cross-stage partial connections (CSP)
    • [x] Multi-input weighted residual connections (MiWRC)
  • BoS, detector:
    • [x] Mish activation
    • [x] SPP-block
    • [x] SAM-block
    • [x] PAN path-aggregation block

2.4. How to run

2.4.1. Visualize the dataset (both BEV images from LiDAR and camera images)

cd src/data_process
  • To visualize BEV maps and camera images (with 3D boxes), run the command below. Change --output-width to show the images in a larger or smaller window:
python kitti_dataloader.py --output-width 608
  • To visualize the cutout augmentation, run:
python kitti_dataloader.py --show-train-data --cutout_prob 1. --cutout_nholes 1 --cutout_fill_value 1. --cutout_ratio 0.3 --output-width 608
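To make the cutout flags above concrete, here is a minimal sketch of what such an augmentation might do. It is not the repository's implementation: the function `cutout` is hypothetical, it works on a single-channel grid, and it assumes `ratio` bounds each hole's side length as a fraction of the image side.

```python
import random

def cutout(image, prob=1.0, n_holes=1, ratio=0.3, fill_value=1.0, rng=None):
    """Return a copy of a 2D grid (list of rows) with random rectangles filled."""
    rng = rng or random.Random()
    out = [row[:] for row in image]  # never modify the caller's image
    if rng.random() >= prob:
        return out  # augmentation skipped for this sample
    height, width = len(out), len(out[0])
    for _ in range(n_holes):
        # Assumed meaning of the ratio: maximum hole side relative to the image side.
        hole_h = rng.randint(1, max(1, int(height * ratio)))
        hole_w = rng.randint(1, max(1, int(width * ratio)))
        top = rng.randint(0, height - hole_h)
        left = rng.randint(0, width - hole_w)
        for r in range(top, top + hole_h):
            for c in range(left, left + hole_w):
                out[r][c] = fill_value
    return out

# One hole filled with 1.0 on a 10 x 10 grid of zeros, mirroring the flags above.
augmented = cutout([[0.0] * 10 for _ in range(10)], prob=1.0, n_holes=1,
                   ratio=0.3, fill_value=1.0, rng=random.Random(0))
```

With `--cutout_prob 1.` every sample is augmented, which is why that setting is convenient for visual inspection.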

2.4.2. Inference

Download the trained model from here, place it in ${ROOT}/checkpoints/, and run:

python test.py --gpu_idx 0 --pretrained_path ../checkpoints/yolo3d_yolov4.pth --cfgfile ./config/cfg/yolo3d_yolov4.cfg 

2.4.3. Evaluation

python evaluate.py --gpu_idx 0 --pretrained_path <PATH> --cfgfile <CFG> --img_size <SIZE> --conf-thresh <THRESH> --nms-thresh <THRESH> --iou-thresh <THRESH>

(The conf-thresh, nms-thresh, and iou-thresh parameters can be adjusted. Each defaults to 0.5.)
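The roles of the confidence and NMS thresholds can be illustrated with a small sketch. It makes a simplifying assumption: boxes are axis-aligned `(x1, y1, x2, y2)` rectangles, whereas this project predicts oriented boxes, so the real IoU computation is more involved. The names `iou` and `filter_detections` are hypothetical.

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def filter_detections(detections, conf_thresh=0.5, nms_thresh=0.5):
    """Drop low-confidence boxes, then apply greedy non-maximum suppression.

    detections: list of (box, score) pairs.
    """
    confident = [d for d in detections if d[1] >= conf_thresh]
    confident.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, score in confident:
        # Keep a box only if it does not overlap a higher-scoring kept box too much.
        if all(iou(box, kept_box) <= nms_thresh for kept_box, _ in kept):
            kept.append((box, score))
    return kept

detections = [((0, 0, 2, 2), 0.9), ((0.1, 0, 2.1, 2), 0.8),
              ((5, 5, 7, 7), 0.7), ((0, 0, 1, 1), 0.3)]
kept = filter_detections(detections)  # keeps the 0.9 and 0.7 boxes
```

The third threshold, iou-thresh, is typically used afterwards to decide whether a kept prediction matches a ground-truth box when computing precision and recall.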

2.4.4. Training

2.4.4.1. Single machine, single GPU
python train.py --gpu_idx 0 --batch_size <N> --num_workers <N>...
2.4.4.2. Multi-processing Distributed Data Parallel Training

Use the nccl backend for multi-processing distributed training on GPUs, since it currently provides the best distributed training performance.

  • Single machine (node), multiple GPUs
python train.py --dist-url 'tcp://127.0.0.1:29500' --dist-backend 'nccl' --multiprocessing-distributed --world-size 1 --rank 0
  • Two machines (two nodes), multiple GPUs

First machine

python train.py --dist-url 'tcp://IP_OF_NODE1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 0

Second machine (--dist-url must still point to the first machine, which hosts rank 0)

python train.py --dist-url 'tcp://IP_OF_NODE1:FREEPORT' --dist-backend 'nccl' --multiprocessing-distributed --world-size 2 --rank 1
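In the commands above, --world-size and --rank count nodes, while --multiprocessing-distributed launches one process per GPU. A common PyTorch pattern, assumed here rather than confirmed from train.py, converts these node-level values into per-process ranks as sketched below; `global_ranks` is a hypothetical helper.

```python
def global_ranks(node_rank, num_nodes, gpus_per_node):
    """Return (total process count, global ranks of the processes on this node)."""
    total_processes = num_nodes * gpus_per_node
    ranks = [node_rank * gpus_per_node + gpu for gpu in range(gpus_per_node)]
    return total_processes, ranks

# Two nodes with 4 GPUs each: the second node (--rank 1) hosts ranks 4..7.
total, ranks_on_second_node = global_ranks(node_rank=1, num_nodes=2, gpus_per_node=4)
```

Every process must rendezvous at the same address, the one belonging to global rank 0, which is why the URL passed to --dist-url has to be identical on all machines.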

To reproduce the results, run the bash script:

./train.sh

Tensorboard

  • To track training progress, go to the logs/ folder and launch TensorBoard:
cd logs/<saved_fn>/tensorboard/
tensorboard --logdir=./

Contact

If you think this work is useful, please give me a star!
If you find any errors or have any suggestions, please contact me (Email: [email protected]).
Thank you!

Citation

@article{YOLOv4,
  author = {Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
  title = {YOLOv4: Optimal Speed and Accuracy of Object Detection},
  year = {2020},
  journal = {arXiv},
}
@article{YOLO3D,
  author = {Waleed Ali and Sherif Abdelkarim and Mohamed Zahran and Mahmoud Zidan and Ahmad El Sallab},
  title = {YOLO3D: End-to-end real-time 3d oriented object bounding box detection from lidar point cloud},
  year = {2018},
  conference = {ECCV 2018},
}
@misc{YOLO3D-YOLOv4-PyTorch,
  author =       {Nguyen Mau Dung},
  title =        {{YOLO3D-YOLOv4-PyTorch: PyTorch Implementation of based on YOLOv4 of YOLO3D paper}},
  howpublished = {\url{https://github.com/maudzung/YOLO3D-YOLOv4-PyTorch}},
  year =         {2020}
}

Folder structure

${ROOT}
├── checkpoints/
│   └── yolo3d_yolov4.pth
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── test.txt
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/ <-- for visualization
│       │   ├── calib/
│       │   ├── label_2/
│       │   └── velodyne/
│       ├── testing/
│       │   ├── image_2/ <-- for visualization
│       │   ├── calib/
│       │   └── velodyne/
│       └── classes_names.txt
├── src/
│   ├── config/
│   │   ├── cfg/
│   │   │   ├── yolo3d_yolov4.cfg
│   │   │   └── yolo3d_yolov4_tiny.cfg
│   │   ├── train_config.py
│   │   └── kitti_config.py
│   ├── data_process/
│   │   ├── kitti_bev_utils.py
│   │   ├── kitti_dataloader.py
│   │   ├── kitti_dataset.py
│   │   ├── kitti_data_utils.py
│   │   └── transformation.py
│   ├── models/
│   │   ├── darknet2pytorch.py
│   │   ├── darknet_utils.py
│   │   ├── model_utils.py
│   │   └── yolo_layer.py
│   ├── utils/
│   │   ├── evaluation_utils.py
│   │   ├── iou_utils.py
│   │   ├── logger.py
│   │   ├── misc.py
│   │   ├── torch_utils.py
│   │   ├── train_utils.py
│   │   └── visualization_utils.py
│   ├── evaluate.py
│   ├── test.py
│   ├── test.sh
│   ├── train.py
│   └── train.sh
├── README.md
└── requirements.txt
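Before training, it can help to confirm that the dataset and checkpoint locations from the tree above exist. The following is a small convenience sketch, not part of the repository; `EXPECTED` and `missing_paths` are hypothetical names, and the list covers only the data-related entries of the tree.

```python
from pathlib import Path

# Paths relative to ${ROOT}, copied from the folder structure above.
EXPECTED = [
    "checkpoints",
    "dataset/kitti/ImageSets/train.txt",
    "dataset/kitti/ImageSets/val.txt",
    "dataset/kitti/ImageSets/test.txt",
    "dataset/kitti/training/image_2",
    "dataset/kitti/training/calib",
    "dataset/kitti/training/label_2",
    "dataset/kitti/training/velodyne",
    "dataset/kitti/testing/image_2",
    "dataset/kitti/testing/calib",
    "dataset/kitti/testing/velodyne",
    "dataset/kitti/classes_names.txt",
]

def missing_paths(root):
    """Return the expected relative paths that do not exist under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).exists()]

if __name__ == "__main__":
    # Run from ${ROOT}; prints one line per missing file or directory.
    for rel in missing_paths("."):
        print(f"missing: {rel}")
```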

Usage

python train.py --help

usage: train.py [-h] [--seed SEED] [--saved_fn FN] [--root-dir PATH]
                [-a ARCH] [--cfgfile PATH] [--pretrained_path PATH]
                [--use_giou_loss] [--img_size IMG_SIZE]
                [--hflip_prob HFLIP_PROB] [--cutout_prob CUTOUT_PROB]
                [--cutout_nholes CUTOUT_NHOLES] [--cutout_ratio CUTOUT_RATIO]
                [--cutout_fill_value CUTOUT_FILL_VALUE]
                [--multiscale_training] [--mosaic] [--random-padding]
                [--no-val] [--num_samples NUM_SAMPLES]
                [--num_workers NUM_WORKERS] [--batch_size BATCH_SIZE]
                [--print_freq N] [--tensorboard_freq N] [--checkpoint_freq N]
                [--start_epoch N] [--num_epochs N] [--lr_type LR_TYPE]
                [--lr LR] [--minimum_lr MIN_LR] [--momentum M] [-wd WD]
                [--optimizer_type OPTIMIZER] [--burn_in N]
                [--steps [STEPS [STEPS ...]]] [--world-size N] [--rank N]
                [--dist-url DIST_URL] [--dist-backend DIST_BACKEND]
                [--gpu_idx GPU_IDX] [--no_cuda]
                [--multiprocessing-distributed] [--evaluate]
                [--resume_path PATH] [--conf-thresh CONF_THRESH]
                [--nms-thresh NMS_THRESH] [--iou-thresh IOU_THRESH]

The Implementation of YOLO3D-YOLOv4 using PyTorch

optional arguments:
  -h, --help            show this help message and exit
  --seed SEED           re-produce the results with seed random
  --saved_fn FN         The name using for saving logs, models,...
  --root-dir PATH       The ROOT working directory
  -a ARCH, --arch ARCH  The name of the model architecture
  --cfgfile PATH        The path for cfgfile (only for darknet)
  --pretrained_path PATH
                        the path of the pretrained checkpoint
  --use_giou_loss       If true, use GIoU loss during training. If false, use
                        MSE loss for training
  --img_size IMG_SIZE   the size of input image
  --hflip_prob HFLIP_PROB
                        The probability of horizontal flip
  --cutout_prob CUTOUT_PROB
                        The probability of cutout augmentation
  --cutout_nholes CUTOUT_NHOLES
                        The number of cutout area
  --cutout_ratio CUTOUT_RATIO
                        The max ratio of the cutout area
  --cutout_fill_value CUTOUT_FILL_VALUE
                        The fill value in the cut out area, default 0. (black)
  --multiscale_training
                        If true, use scaling data for training
  --mosaic              If true, compose training samples as mosaics
  --random-padding      If true, random padding if using mosaic augmentation
  --no-val              If true, dont evaluate the model on the val set
  --num_samples NUM_SAMPLES
                        Take a subset of the dataset to run and debug
  --num_workers NUM_WORKERS
                        Number of threads for loading data
  --batch_size BATCH_SIZE
                        mini-batch size (default: 4), this is the totalbatch
                        size of all GPUs on the current node when usingData
                        Parallel or Distributed Data Parallel
  --print_freq N        print frequency (default: 50)
  --tensorboard_freq N  frequency of saving tensorboard (default: 20)
  --checkpoint_freq N   frequency of saving checkpoints (default: 2)
  --start_epoch N       the starting epoch
  --num_epochs N        number of total epochs to run
  --lr_type LR_TYPE     the type of learning rate scheduler (cosin or
                        multi_step)
  --lr LR               initial learning rate
  --minimum_lr MIN_LR   minimum learning rate during training
  --momentum M          momentum
  -wd WD, --weight_decay WD
                        weight decay (default: 1e-6)
  --optimizer_type OPTIMIZER
                        the type of optimizer, it can be sgd or adam
  --burn_in N           number of burn in step
  --steps [STEPS [STEPS ...]]
                        number of burn in step
  --world-size N        number of nodes for distributed training
  --rank N              node rank for distributed training
  --dist-url DIST_URL   url used to set up distributed training
  --dist-backend DIST_BACKEND
                        distributed backend
  --gpu_idx GPU_IDX     GPU index to use.
  --no_cuda             If true, cuda is not used.
  --multiprocessing-distributed
                        Use multi-processing distributed training to launch N
                        processes per node, which has N GPUs. This is the
                        fastest way to use PyTorch for either single node or
                        multi node data parallel training
  --evaluate            only evaluate the model, not training
  --resume_path PATH    the path of the resumed checkpoint
  --conf-thresh CONF_THRESH
                        for evaluation - the threshold for class conf
  --nms-thresh NMS_THRESH
                        for evaluation - the threshold for nms
  --iou-thresh IOU_THRESH
                        for evaluation - the threshold for IoU
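The options --lr, --minimum_lr, --burn_in and --lr_type interact to shape the learning-rate curve. The sketch below shows one plausible reading for the cosine scheduler: a linear warm-up over the burn-in steps followed by cosine decay down to the minimum. The exact formula in train.py may differ, and the function name and default values here are hypothetical.

```python
import math

def learning_rate(step, total_steps, lr=1e-3, minimum_lr=1e-7, burn_in=50):
    """Learning rate at a given step: linear warm-up, then cosine decay."""
    if step < burn_in:
        # Warm-up: ramp linearly from lr / burn_in up to lr (assumed shape).
        return lr * (step + 1) / burn_in
    progress = (step - burn_in) / max(1, total_steps - burn_in)
    # Cosine decay from lr (progress 0) to minimum_lr (progress 1).
    return minimum_lr + 0.5 * (lr - minimum_lr) * (1 + math.cos(math.pi * progress))

# Sample the curve at a few steps of a 1000-step run.
schedule = [learning_rate(step, total_steps=1000) for step in (0, 25, 50, 500, 1000)]
```

A multi_step scheduler would instead keep the rate constant and multiply it by a fixed factor at each milestone given by --steps.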
