• Stars: 369
• Rank: 115,686 (Top 3 %)
• Language: Python
• License: Apache License 2.0
• Created over 3 years ago
• Updated about 1 year ago

Repository Details

[TIP 2022] CBNetV2: A Composite Backbone Network Architecture for Object Detection

CBNet: A Composite Backbone Network Architecture for Object Detection

By Tingting Liang*, Xiaojie Chu*, Yudong Liu*, Yongtao Wang, Zhi Tang, Wei Chu, Jingdong Chen, Haibin Ling.

This repo is the official implementation of CBNetV2. It is based on mmdetection and Swin Transformer for Object Detection.

Contact us at [email protected], [email protected], [email protected].

Introduction

CBNetV2 achieves strong single-model performance on COCO object detection (60.1 box AP and 52.3 mask AP on test-dev) without extra training data.

Partial Results and Models

More results and models can be found in the model zoo.

Faster R-CNN

| Backbone | Lr Schd | box mAP (minival) | #params | FLOPs | config | log | model |
| DB-ResNet50 | 1x | 40.8 | 69M | 284G | config | github | github |

Mask R-CNN

| Backbone | Lr Schd | box mAP (minival) | mask mAP (minival) | #params | FLOPs | config | log | model |
| DB-Swin-T | 3x | 50.2 | 44.5 | 76M | 357G | config | github | github |

Cascade Mask R-CNN (1600x1400)

| Backbone | Lr Schd | box mAP (minival/test-dev) | mask mAP (minival/test-dev) | #params | FLOPs | config | model |
| DB-Swin-S | 3x | 56.3/56.9 | 48.6/49.1 | 156M | 1016G | config | github |

Improved HTC (1600x1400)

We use ImageNet-22k pretrained checkpoints of Swin-B and Swin-L. Compared to regular HTC, our HTC uses a 4conv1fc bbox head (four convolutional layers followed by one fully connected layer).
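
As a rough illustration of what a 4conv1fc bbox head looks like in an MMDetection-style config, the fragment below is a minimal sketch. It assumes MMDetection's `ConvFCBBoxHead` with four shared convolutions and one shared fully connected layer; the channel sizes, RoI feature size, and class count are typical COCO defaults chosen for illustration and are not copied from this repo's configs.

```python
# Hypothetical config fragment (illustrative values, not taken from this repo).
bbox_head_4conv1fc = dict(
    type="ConvFCBBoxHead",   # generic conv + fc bbox head in MMDetection
    num_shared_convs=4,      # "4conv": four shared 3x3 conv layers
    num_shared_fcs=1,        # "1fc": one shared fully connected layer
    in_channels=256,         # assumed FPN output channels
    conv_out_channels=256,   # assumed conv width
    fc_out_channels=1024,    # assumed fc width
    roi_feat_size=7,         # assumed RoI feature map size
    num_classes=80,          # COCO has 80 object categories
)
```

In a real config this dict would sit inside the `roi_head` definition, once per cascade stage.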

| Backbone | Lr Schd | box mAP (minival/test-dev) | mask mAP (minival/test-dev) | #params | FLOPs | config | model |
| DB-Swin-B | 20e | 58.4/58.7 | 50.7/51.1 | 235M | 1348G | config | github |
| DB-Swin-L | 1x | 59.1/59.4 | 51.0/51.6 | 453M | 2162G | config (test only) | github |
| DB-Swin-L (TTA) | 1x | 59.6/60.1 | 51.8/52.3 | 453M | - | config (test only) | github |

TTA denotes test-time augmentation.
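
To make the idea of test-time augmentation concrete, here is a minimal sketch of one common ingredient: running the detector on a horizontally flipped copy of the image and mapping the predicted boxes back to the original frame. The function name `unflip_boxes` is hypothetical, and this is not necessarily the augmentation set used for the results above; merging the mapped boxes with the original predictions (for example via NMS) is left out.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def unflip_boxes(boxes: List[Box], image_width: float) -> List[Box]:
    """Map boxes predicted on a horizontally flipped image back to the original frame.

    A horizontal flip sends x to (image_width - x), so the left and right
    edges swap roles; the vertical coordinates are unchanged.
    """
    return [(image_width - x2, y1, image_width - x1, y2) for x1, y1, x2, y2 in boxes]
```

Applying the function twice returns the original boxes, which is a quick sanity check for the mapping.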

Usage

Installation

Please refer to get_started.md for installation and dataset preparation.

Inference

# single-gpu testing (w/o segm result)
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox 

# multi-gpu testing (w/ segm result)
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm
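
When evaluation is driven from a Python script rather than typed by hand, the two commands above can be assembled programmatically. The helper below is a small sketch: `build_test_command` is a hypothetical name, and the file names in the usage lines are placeholders. It only builds the argument list and does not launch anything.

```python
import shlex
from typing import List, Sequence


def build_test_command(config: str, checkpoint: str,
                       metrics: Sequence[str] = ("bbox",),
                       gpus: int = 1) -> List[str]:
    """Assemble the argument list for the test commands shown above."""
    if gpus < 1:
        raise ValueError("gpus must be >= 1")
    if gpus == 1:
        # Single-GPU path: python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE>
        cmd = ["python", "tools/test.py", config, checkpoint]
    else:
        # Multi-GPU path: tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM>
        cmd = ["tools/dist_test.sh", config, checkpoint, str(gpus)]
    return cmd + ["--eval", *metrics]


if __name__ == "__main__":
    # Placeholder file names; substitute a real config and checkpoint.
    cmd = build_test_command("my_config.py", "my_checkpoint.pth", ("bbox", "segm"), gpus=8)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.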

Training

To train a detector with pre-trained models, run:

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> 

For example, to train a Faster R-CNN model with a Dual-ResNet50 backbone and 8 GPUs, run:

# path of pre-training model (resnet50) is already in config
tools/dist_train.sh configs/cbnet/faster_rcnn_cbv2d1_r50_fpn_1x_coco.py 8 

As another example, to train a Mask R-CNN model with a Dual-Swin-T backbone and 8 GPUs, run:

tools/dist_train.sh configs/cbnet/mask_rcnn_cbv2_swin_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco.py 8 --cfg-options model.pretrained=<PRETRAIN_MODEL> 
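
The `--cfg-options model.pretrained=<PRETRAIN_MODEL>` argument overrides one nested entry of the config using a dotted key. The snippet below is a simplified sketch of that idea on a plain dictionary; it is not MMDetection's actual implementation, and `apply_dotted_override`, the toy config, and the checkpoint path are all hypothetical.

```python
from typing import Any, Dict


def apply_dotted_override(cfg: Dict[str, Any], dotted_key: str, value: Any) -> Dict[str, Any]:
    """Set cfg[a][b][c] = value for dotted_key 'a.b.c', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for name in parents:
        # Walk down one level, creating an empty dict if the key is missing.
        node = node.setdefault(name, {})
        if not isinstance(node, dict):
            raise TypeError(f"cannot descend into non-dict entry {name!r}")
    node[leaf] = value
    return cfg


# Toy config standing in for a loaded config file.
cfg = {"model": {"type": "MaskRCNN", "pretrained": None}}
apply_dotted_override(cfg, "model.pretrained", "checkpoints/pretrained.pth")
```

Only the addressed leaf changes; sibling entries such as `model.type` are left untouched.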

Apex (optional):

Following Swin Transformer for Object Detection, we use apex for mixed precision training by default. To install apex, run:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Documents and Tutorials

We list some documents and tutorials from MMDetection, which may be helpful to you.

Citation

If you use our code or models, please consider citing our paper, CBNet: A Composite Backbone Network Architecture for Object Detection.

@ARTICLE{9932281,
  author={Liang, Tingting and Chu, Xiaojie and Liu, Yudong and Wang, Yongtao and Tang, Zhi and Chu, Wei and Chen, Jingdong and Ling, Haibin},
  journal={IEEE Transactions on Image Processing}, 
  title={CBNet: A Composite Backbone Network Architecture for Object Detection}, 
  year={2022},
  volume={31},
  pages={6893-6906},
  doi={10.1109/TIP.2022.3216771}}

License

This project is free for academic research purposes only; commercial use requires authorization. For commercial licensing, please contact [email protected].

Other Links

Original CBNet: See CBNet: A Novel Composite Backbone Network Architecture for Object Detection.
