High-resolution Networks for FCOS
Introduction
This project contains the code of HRNet-FCOS, i.e., using High-resolution Networks (HRNets) as the backbones for the Fully Convolutional One-Stage Object Detection (FCOS) algorithm, which achieves much better object detection performance compared with the ResNet-FCOS counterparts while keeping a similar computation complexity. For more projects using HRNets, please go to our website.
Quick start
Installation
Please check INSTALL.md for installation instructions. You may also want to see the original README.md of FCOS.
Inference
The inference command line on coco minival split:
python tools/test_net.py \
--config-file configs/fcos/fcos_hrnet_w32_5l_2x.yaml \
MODEL.WEIGHT models/FCOS_hrnet_w32_5l_2x.pth \
TEST.IMS_PER_BATCH 8
Please note that:
- If your model's name is different, please replace
models/FCOS_hrnet_w32_5l_2x.pth
with your own. - If you enounter out-of-memory error, please try to reduce
TEST.IMS_PER_BATCH
to 1. - If you want to evaluate a different model, please change
--config-file
to its config file (in configs/fcos) andMODEL.WEIGHT
to its weights file.
For your convenience, we provide the following trained models.
FCOS Model | Training mem (GB) | Multi-scale training | SyncBN | Testing time / im | # params | GFLOPs | AP (minival) | Link |
---|---|---|---|---|---|---|---|---|
ResNet_50_5l_2x | 29.3 | No | No | 71ms | 32.0M | 190.0 | 37.1 | - |
HRNet_W18_5l_2x | 54.4 | No | No | 72ms | 17.5M | 180.3 | 37.7 | model |
HRNet_W18_5l_2x | 55.0 | Yes | Yes | 72ms | 17.5M | 180.3 | 39.4 | model |
ResNet_50_6l_2x | 58.2 | No | No | 98ms | 32.7M | 529.0 | 37.1 | - |
HRNet_W18_6l_2x | 88.1 | No | No | 106ms | 18.1M | 515.1 | 37.8 | model |
ResNet_101_5l_2x | 44.1 | Yes | No | 74ms | 51.0M | 261.2 | 41.4 | model |
HRNet_W32_5l_2x | 78.9 | Yes | No | 87ms | 37.3M | 273.3 | 41.9 | model |
HRNet_W32_5l_2x | 80.1 | Yes | Yes | 87ms | 37.3M | 273.3 | 42.5 | model |
ResNet_101_6l_2x | 71.0 | Yes | No | 121ms | 51.6M | 601.0 | 41.5 | model |
HRNet_W32_6l_2x | 108.6 | Yes | No | 125ms | 37.9M | 608.0 | 42.1 | model |
HRNet_W32_6l_2x | 109.9 | Yes | Yes | 125ms | 37.9M | 608.0 | 42.9 | model |
HRNet_W40_6l_3x | 128.0 | Yes | No | 142ms | 54.1M | 682.9 | 42.6 | model |
[1] 1x, 2x and 3x mean the model is trained for 90K, 180K and 270k iterations, respectively.
[2] 5l and 6l denote that we use feature pyramid with 5 levels and 6 levels, respectively.
[3] We provide model trained with Synchronous Batch Normalization (SyncBN).
[4] We report total training memory footprint on all GPUs instead of the memory footprint per GPU as in maskrcnn-benchmark.
[5] The inference speed of HRNet can get improved if the branches in the HRNet model can run in parallel.
[6] All results are obtained with a single model and without any test time data augmentation.
Training
The following command line will trains a fcos_hrnet_w32_5l_2x model on 8 GPUs with Synchronous Stochastic Gradient Descent (SGD):
python -m torch.distributed.launch \
--nproc_per_node=8 \
--master_port=$((RANDOM + 10000)) \
tools/train_net.py \
--config-file configs/fcos/fcos_hrnet_w32_5l_2x.yaml \
MODEL.WEIGHT hrnetv2_w32_imagenet_pretrained.pth \
MODEL.SYNCBN False \
DATALOADER.NUM_WORKERS 4 \
OUTPUT_DIR training_dir/fcos_hrnet_w32_5l_2x
Note that:
- If you want to use fewer GPUs, please change
--nproc_per_node
to the number of GPUs. No other settings need to be changed. The total batch size does not depends onnproc_per_node
. If you want to change the total batch size, please changeSOLVER.IMS_PER_BATCH
in configs/fcos/fcos_hrnet_w32_5l_2x.yaml. - If you want to use Synchronous Batch-Normalization (SyncBN), please change
MODEL.SYNCBN
toTrue
. Note that this will lead to ~2x slower training speed when training on mulitple machines. You also need to fix the image padding size when using SyncBN, see here. - The imagenet pre-trained model can be found here.
- The models will be saved into
OUTPUT_DIR
. - If you want to train FCOS on your own dataset, please follow this instruction #54.
Contributing to the project
Any pull requests or issues are welcome.
Citations
Please consider citing the following papers in your publications if the project helps your research.
@article{sun2019deep,
title={Deep High-Resolution Representation Learning for Human Pose Estimation},
author={Sun, Ke and Xiao, Bin and Liu, Dong and Wang, Jingdong},
journal={arXiv preprint arXiv:1902.09212},
year={2019}
}
@article{tian2019fcos,
title = {{FCOS}: Fully Convolutional One-Stage Object Detection},
author = {Tian, Zhi and Shen, Chunhua and Chen, Hao and He, Tong},
journal = {arXiv preprint arXiv:1904.01355},
year = {2019}
}
License
For academic use, this project is licensed under the 2-clause BSD License - see the LICENSE file for details. For commercial use, please contact the authors.