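An implementation of DetNet: A Backbone network for Object Detection. Due to limited time, I initially trained and tested only on the PASCAL VOC dataset. On that dataset, FPN with a detnet59 backbone reaches slightly higher mAP than FPN101 (FPN with a ResNet-101 backbone), while training faster and using less GPU memory.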
Introduction
First, I spent about one week training detnet59 on the ImageNet dataset. The classification performance of detnet59 is slightly better than that of the original resnet50. I then used the pretrained detnet59 to train and test on PASCAL VOC.
The code is based on FPN_Pytorch, with the FPN101 backbone changed to detnet59.
Update 2019/01/01
Fixed bugs in demo.py, so it can now be run. Note that the default demo.py only supports the pascal_voc categories. To adapt it to your own dataset, change pascal_classes in demo.py. For more details, see the Usage section.
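As a rough illustration of that change, the sketch below builds a class list for a hypothetical three-class dataset. It assumes demo.py follows the usual Faster R-CNN convention of a background entry at index 0 followed by the object categories. The category names and the helper `build_class_list` are made up for this example, so check the exact form of `pascal_classes` in demo.py before copying anything.

```python
# Hypothetical replacement for the class list in demo.py.
# Assumption: index 0 is the background class, as in the Pascal VOC setup.
custom_categories = ["cat", "dog", "person"]  # made-up names for illustration


def build_class_list(categories):
    """Return the class list with '__background__' placed at index 0."""
    return ["__background__"] + list(categories)


pascal_classes = build_class_list(custom_categories)
num_classes = len(pascal_classes)  # object categories plus one for background
```

The order of the categories should match the order used when the model was trained, otherwise the predicted labels will be shifted.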
Update 2018/8/21
Trained and tested on COCO2017!
Update
Added soft_nms. It requires no re-training of existing models: you only need to enable soft_nms at test time to improve performance.
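To show why no re-training is needed, here is a minimal pure-Python sketch of the Gaussian variant of Soft-NMS. It is not the code used in this repository, which may use a different variant or different parameters. The function names and the defaults `sigma=0.5` and `score_thresh=0.001` are assumptions chosen for illustration. Instead of discarding boxes that overlap a higher-scoring box, it only lowers their scores, so it acts purely as a post-processing step on the detector output.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay the scores of overlapping boxes.

    Returns a list of (original_index, rescored_value) in selection order.
    """
    remaining = list(enumerate(scores))
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))
        # Decay every remaining score by its overlap with the selected box.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score = score * math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:  # drop boxes whose score became negligible
                decayed.append((idx, score))
        remaining = decayed
    return kept


# Two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
```

In this example the second box is kept with a reduced score rather than removed, which is the behaviour that can recover detections of nearby objects that hard NMS would suppress.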
Benchmarking
I benchmarked this code on PASCAL VOC 2007, VOC 07+12 and COCO2017. Below are the results:
0). ImageNet (tested on the validation set)
backbone | Top-1 error (%) |
---|---|
pytorch resnet50 | 23.9 |
detnet59 in this code | 23.8 |
detnet59 in the original paper | 23.5 |
1). PASCAL VOC 2007 (Train/Test: 07trainval/07test, scale=600, ROI Align)
model(FPN) | GPUs | Batch Size | lr | lr_decay | max_epoch | Speed/epoch | Memory/GPU | mAP |
---|---|---|---|---|---|---|---|---|
ResNet-101 | 1 GTX 1080 (Ti) | 2 | 1e-3 | 10 | 12 | 1.44hr | 6137MB | 75.7 |
DetNet59 | 1 GTX 1080 (Ti) | 2 | 1e-3 | 10 | 12 | 1.07hr | 5412MB | 75.9 |
2). PASCAL VOC 07+12 (Train/Test: 07+12trainval/07test, scale=600, ROI Align)
model(FPN) | GPUs | Batch Size | lr | lr_decay | max_epoch | Speed/epoch | Memory/GPU | mAP |
---|---|---|---|---|---|---|---|---|
ResNet-101 | 1 GTX 1080 (Ti) | 1 | 1e-3 | 10 | 12 | 3.96hr | 9011MB | 80.5 |
DetNet59 | 1 GTX 1080 (Ti) | 1 | 1e-3 | 10 | 12 | 2.33hr | 8015MB | 80.7 |
ResNet-101(using soft_nms when testing) | 1 GTX 1080 (Ti) | \ | \ | \ | \ | \ | \ | 81.2 |
DetNet59(using soft_nms when testing) | 1 GTX 1080 (Ti) | \ | \ | \ | \ | \ | \ | 81.6 |
3). COCO2017 (Train/Test: COCO2017train/COCO2017val, scale=800, max_size=1200, ROI Align)
model | #GPUs | batch size | lr | lr_decay | max_epoch | time/epoch | mem/GPU (MB) | mAP |
---|---|---|---|---|---|---|---|---|
DetNet59 | 2 | 4 | 4e-3 | 4 | 11 | \ | 9000 | 36.0 |
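To put the two VOC comparisons in relative terms, the short script below recomputes the training-time ratio, memory difference and mAP gain of DetNet59 over ResNet-101. The numbers are copied from the VOC tables above. The dictionary layout and the helper `compare` are only an illustrative sketch.

```python
# (hours per epoch, MB per GPU, mAP) copied from the two VOC tables above.
results = {
    "VOC07": {"ResNet-101": (1.44, 6137, 75.7), "DetNet59": (1.07, 5412, 75.9)},
    "VOC07+12": {"ResNet-101": (3.96, 9011, 80.5), "DetNet59": (2.33, 8015, 80.7)},
}


def compare(setting):
    """Return (epoch-time ratio, memory saved in MB, mAP gain) for DetNet59 vs ResNet-101."""
    r_time, r_mem, r_map = results[setting]["ResNet-101"]
    d_time, d_mem, d_map = results[setting]["DetNet59"]
    return r_time / d_time, r_mem - d_mem, round(d_map - r_map, 1)


for name in results:
    ratio, mem_saved, map_gain = compare(name)
    print(f"{name}: {ratio:.2f}x faster per epoch, "
          f"{mem_saved} MB less memory, +{map_gain} mAP")
```

These figures come from single runs on one GPU model, so small mAP differences should be read with that in mind.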
Preparation
First of all, clone the code
git clone https://github.com/guoruoqian/DetNet_Pytorch.git
Then, create a folder:
cd DetNet_Pytorch && mkdir data
prerequisites
- Python 2.7 or 3.6
- PyTorch 0.2.0 or higher (versions >= 0.4.0 are not supported)
- CUDA 8.0 or higher
- tensorboardX
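Because the supported PyTorch range is narrow, it can help to check the installed version before compiling anything. The sketch below is one possible check. The helper names `parse_version` and `is_supported_pytorch` are hypothetical and not part of this repository.

```python
import re


def parse_version(version):
    """Extract (major, minor, patch) from strings such as '0.3.1' or '0.2.0_3'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % version)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def is_supported_pytorch(version):
    """True when 0.2.0 <= version < 0.4.0, the range stated in the prerequisites."""
    return (0, 2, 0) <= parse_version(version) < (0, 4, 0)
```

With PyTorch installed, the check would be called as `is_supported_pytorch(torch.__version__)` after `import torch`.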
Data Preparation
- VOC2007: Please follow the instructions in py-faster-rcnn to prepare the VOC datasets, although any equivalent guide will do. After downloading the data, create symlinks to it in the folder data/.
- VOC 07 + 12: Please follow the instructions in YuwenXiong/py-R-FCN. I find these instructions more helpful for preparing the VOC datasets.
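The symlink step can be scripted as in the sketch below. The link name `VOCdevkit2007` follows the py-faster-rcnn convention and is assumed, not confirmed, to be what this repository expects. The helper `link_voc_devkit` and the example path are hypothetical.

```python
import os


def link_voc_devkit(devkit_dir, data_dir="data", link_name="VOCdevkit2007"):
    """Create data_dir/link_name as a symlink to an already extracted VOCdevkit folder."""
    devkit_dir = os.path.abspath(devkit_dir)
    if not os.path.isdir(devkit_dir):
        raise FileNotFoundError("VOCdevkit folder not found: " + devkit_dir)
    if not os.path.isdir(data_dir):
        os.makedirs(data_dir)
    link_path = os.path.join(data_dir, link_name)
    # Leave an existing link or folder untouched.
    if not os.path.lexists(link_path):
        os.symlink(devkit_dir, link_path)
    return link_path
```

Run from the repository root, a call such as `link_voc_devkit("/path/to/VOCdevkit")` would create `data/VOCdevkit2007`.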
Pretrained Model
You can download the detnet59 model which I trained on ImageNet from:
Download it and put it into data/pretrained_model/.
Compilation
As pointed out by ruotianluo/pytorch-faster-rcnn, choose the right -arch in the make.sh file to compile the CUDA code:
GPU model | Architecture |
---|---|
TitanX (Maxwell/Pascal) | sm_52 |
GTX 960M | sm_50 |
GTX 1080 (Ti) | sm_61 |
Grid K520 (AWS g2.2xlarge) | sm_30 |
Tesla K80 (AWS p2.xlarge) | sm_37 |
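When scripting the build for several machines, the table can be kept as a small lookup. The sketch below copies the values from the table. The helper `arch_for` is hypothetical, and writing the result into make.sh is left to the reader.

```python
# GPU model -> -arch value, copied from the table above.
CUDA_ARCH = {
    "TitanX (Maxwell/Pascal)": "sm_52",
    "GTX 960M": "sm_50",
    "GTX 1080 (Ti)": "sm_61",
    "Grid K520 (AWS g2.2xlarge)": "sm_30",
    "Tesla K80 (AWS p2.xlarge)": "sm_37",
}


def arch_for(gpu_name):
    """Return the -arch value for a GPU, matching on a case-insensitive substring."""
    needle = gpu_name.lower()
    for model, arch in CUDA_ARCH.items():
        if needle in model.lower():
            return arch
    raise KeyError("no -arch entry for %r" % gpu_name)
```

For a GPU that is not listed, the value has to be looked up from the card's compute capability instead.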
Install all the python dependencies using pip:
pip install -r requirements.txt
Compile the CUDA dependencies using the following commands:
cd lib
sh make.sh
This compiles all the modules you need, including NMS, ROI_Pooling, ROI_Align and ROI_Crop. The default version is compiled with Python 2.7, so please recompile it yourself if you are using a different Python version.
Usage
train voc2007:
CUDA_VISIBLE_DEVICES=3 python3 trainval_net.py exp_name --dataset pascal_voc --net detnet59 --bs 2 --nw 4 --lr 1e-3 --epochs 12 --save_dir weights --cuda --use_tfboard True
test voc2007:
CUDA_VISIBLE_DEVICES=3 python3 test_net.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights
run demo.py:
Before running the demo, you must create a directory named 'demo_images' and put images (VOC images) in it. You can download the pretrained models listed in the tables above.
CUDA_VISIBLE_DEVICES=0 python3 demo.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights --image_dir demo_images --result_dir vis_results
using soft_nms when testing:
CUDA_VISIBLE_DEVICES=3 python3 test_net.py exp_name --dataset pascal_voc --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 5010 --cuda --load_dir weights --soft_nms
Before training on voc07+12, you must set ASPECT_CROPPING to False in detnet59.yml, or you will encounter errors during training.
train voc07+12:
CUDA_VISIBLE_DEVICES=3 python3 trainval_net.py exp_name2 --dataset pascal_voc_0712 --net detnet59 --bs 1 --nw 4 --lr 1e-3 --epochs 12 --save_dir weights --cuda --use_tfboard True
train coco:
CUDA_VISIBLE_DEVICES=6,7 python3 trainval_net.py detnetv1.0 --dataset coco --net detnet59 --bs 4 --nw 4 --lr 4e-3 --epochs 12 --save_dir weights --cuda --lscale --mGPUs
test coco:
CUDA_VISIBLE_DEVICES=2 python3 test_net.py detnetv1.0 --dataset coco --net detnet59 --checksession 1 --checkepoch 7 --checkpoint 58632 --cuda --load_dir weights --ls
TODO
- Train and test on COCO (Done)