[CVPR 2021] Involution: Inverting the Inherence of Convolution for Visual Recognition, a brand new neural operator

involution

Official implementation of a neural operator as described in Involution: Inverting the Inherence of Convolution for Visual Recognition (CVPR'21)

By Duo Li, Jie Hu, Changhu Wang, Xiangtai Li, Qi She, Lei Zhu, Tong Zhang, and Qifeng Chen

TL;DR: involution is a general-purpose neural primitive that is versatile for a spectrum of deep learning models on different vision tasks. involution bridges convolution and self-attention in design, while being more efficient and effective than convolution and simpler in form than self-attention.

If you find our work useful in your research, please cite:

@InProceedings{Li_2021_CVPR,
    author = {Li, Duo and Hu, Jie and Wang, Changhu and Li, Xiangtai and She, Qi and Zhu, Lei and Zhang, Tong and Chen, Qifeng},
    title = {Involution: Inverting the Inherence of Convolution for Visual Recognition},
    booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month = {June},
    year = {2021}
}

Getting Started

This repository is built entirely upon the OpenMMLab toolkits. For each individual task, the config and model files follow the same directory organization as mmcls, mmdet, and mmseg respectively, so simply copy them to the corresponding locations to get started.
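Because the directory layouts match, the copy step can also be scripted. The following is a minimal sketch, not part of the repository: `mirror_tree` is a hypothetical helper, and it assumes that the two checkouts sit side by side and that overwriting files with the same name in the target is acceptable (the same behaviour as a plain `cp`). Note that it copies every file under the source tree, which may be more than a hand-picked list of directories.

```python
import shutil
from pathlib import Path


def mirror_tree(src_root, dst_root):
    """Copy every file under src_root to the same relative path under dst_root.

    Hypothetical helper: existing files with the same name are overwritten.
    Returns the list of target paths that were written.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for path in sorted(src_root.rglob("*")):
        if path.is_file():
            target = dst_root / path.relative_to(src_root)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(path, target)
            copied.append(target)
    return copied


# Example for the detection task (directory names assumed to be relative
# to the current working directory):
#   mirror_tree("det/mmdet", "mmdetection/mmdet")
#   mirror_tree("det/configs", "mmdetection/configs")
```

Returning the list of written paths makes it easy to review what changed in the toolkit checkout before running anything.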

For example, to evaluate the detectors:

git clone https://github.com/open-mmlab/mmdetection # and install

# copy model files
cp det/mmdet/models/backbones/* mmdetection/mmdet/models/backbones
cp det/mmdet/models/necks/* mmdetection/mmdet/models/necks
cp det/mmdet/models/dense_heads/* mmdetection/mmdet/models/dense_heads
cp det/mmdet/models/roi_heads/* mmdetection/mmdet/models/roi_heads
cp det/mmdet/models/roi_heads/mask_heads/* mmdetection/mmdet/models/roi_heads/mask_heads
cp det/mmdet/models/utils/* mmdetection/mmdet/models/utils
cp det/mmdet/datasets/* mmdetection/mmdet/datasets

# copy config files
cp det/configs/_base_/models/* mmdetection/configs/_base_/models
cp det/configs/_base_/schedules/* mmdetection/configs/_base_/schedules
cp -r det/configs/involution mmdetection/configs

# evaluate checkpoints
cd mmdetection
bash tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

For more detailed guidance, please refer to the original mmcls, mmdet, and mmseg tutorials.
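When several checkpoints are evaluated in a row, it can help to assemble the test command programmatically. The sketch below only builds the argument list for the `tools/dist_test.sh` call shown above. The function name `build_dist_test_command` and the example paths are hypothetical, and the optional `--out` and `--eval` flags are the ones listed in the command template.

```python
import shlex


def build_dist_test_command(config_file, checkpoint_file, gpu_num,
                            out=None, eval_metrics=None):
    """Build the argument list for tools/dist_test.sh (hypothetical helper)."""
    cmd = ["bash", "tools/dist_test.sh",
           str(config_file), str(checkpoint_file), str(gpu_num)]
    if out is not None:
        cmd += ["--out", str(out)]
    if eval_metrics:
        cmd += ["--eval", *eval_metrics]
    return cmd


# Placeholder paths: substitute a real config and checkpoint.
example = build_dist_test_command(
    "configs/involution/example_config.py",
    "checkpoints/example.pth",
    gpu_num=8,
    eval_metrics=["bbox"],
)
print(shlex.join(example))
```

The resulting list can be passed to `subprocess.run(example, cwd="mmdetection", check=True)` once the toolkit is installed.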

Currently, we provide a memory-efficient implementation of the involution operator based on CuPy. Please install this library in advance. A customized CUDA kernel would bring further hardware acceleration. Contributions from the community on this are welcome!
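For readers who want to see the arithmetic the operator performs, here is a dependency-free sketch in plain Python. It is not the repository's CuPy implementation and is far too slow for real use. It assumes stride 1 and zero padding, generates each pixel's kernel with two linear maps and a ReLU (any normalization layer is omitted), and shares that kernel across all channels of a group. The function and argument names are illustrative only.

```python
import random


def involution2d(x, w_reduce, w_span, kernel_size, groups):
    """Naive involution over a feature map x[c][h][w] (stride 1, zero padding).

    w_reduce is a (C_mid x C) matrix and w_span a (K*K*G x C_mid) matrix.
    Together they generate one K x K kernel per pixel and per group from the
    feature vector at that pixel.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    pad = kernel_size // 2
    group_channels = channels // groups
    out = [[[0.0] * width for _ in range(height)] for _ in range(channels)]
    for i in range(height):
        for j in range(width):
            pixel = [x[c][i][j] for c in range(channels)]
            # Kernel generation: linear map, ReLU, linear map.
            hidden = [max(0.0, sum(w * p for w, p in zip(row, pixel)))
                      for row in w_reduce]
            flat = [sum(w * h for w, h in zip(row, hidden)) for row in w_span]
            for c in range(channels):
                g = c // group_channels  # channels in a group share a kernel
                acc = 0.0
                for u in range(kernel_size):
                    for v in range(kernel_size):
                        ii, jj = i + u - pad, j + v - pad
                        if 0 <= ii < height and 0 <= jj < width:
                            k = flat[(g * kernel_size + u) * kernel_size + v]
                            acc += k * x[c][ii][jj]
                out[c][i][j] = acc
    return out


# Tiny demo with random weights: 4 channels, 2 groups, 3x3 kernels.
random.seed(0)
C, H, W, K, G, C_MID = 4, 5, 5, 3, 2, 2
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
w_reduce = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C_MID)]
w_span = [[random.uniform(-1, 1) for _ in range(C_MID)] for _ in range(K * K * G)]
y = involution2d(x, w_reduce, w_span, K, G)
print(len(y), len(y[0]), len(y[0][0]))  # output keeps the input shape: 4 5 5
```

The contrast with convolution is visible in the loops: the kernel depends on the pixel position `(i, j)` but not on the individual channel within a group, whereas a convolution kernel is fixed across positions and distinct per channel.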

Model Zoo

The changes in parameters/FLOPs (↓) and performance (↑) relative to the convolution baselines are given in parentheses. Some of these checkpoints were obtained in our reimplementation runs, so their performance may differ slightly from that reported in our paper. Models are trained with 64 GPUs on ImageNet, 8 GPUs on COCO, and 4 GPUs on Cityscapes.
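To make the parenthesized figures concrete, the short sketch below shows how a relative reduction is computed and how a baseline value can be estimated back from a table entry. The helper names are illustrative, and the implied baseline is only an estimate because the table values are rounded. The accuracy deltas in the detection and segmentation tables carry no percent sign and appear to be absolute differences, so they are not covered here.

```python
def relative_reduction(baseline, value):
    """Percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (baseline - value) / baseline


def implied_baseline(value, reduction_pct):
    """Baseline implied by a value and its reported percentage reduction."""
    return value / (1.0 - reduction_pct / 100.0)


# RedNet-50 is listed with 15.54M parameters, a 39.5% reduction.
estimate = implied_baseline(15.54, 39.5)
print(f"implied baseline: about {estimate:.1f}M parameters")
print(f"round trip: {relative_reduction(estimate, 15.54):.1f}% reduction")
```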

Image Classification on ImageNet

Model       Params(M)       FLOPs(G)       Top-1 (%)  Top-5 (%)  Config  Download
RedNet-26   9.23(32.8%↓)    1.73(29.2%↓)   75.96      93.19      config  model | log
RedNet-38   12.39(36.7%↓)   2.22(31.3%↓)   77.48      93.57      config  model | log
RedNet-50   15.54(39.5%↓)   2.71(34.1%↓)   78.35      94.13      config  model | log
RedNet-101  25.65(42.6%↓)   4.74(40.5%↓)   78.92      94.35      config  model | log
RedNet-152  33.99(43.5%↓)   6.79(41.4%↓)   79.12      94.38      config  model | log

Before finetuning on the following downstream tasks, download the ImageNet pre-trained RedNet-50 weights and set the pretrained argument in det/configs/_base_/models/*.py or seg/configs/_base_/models/*.py to your local path.
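As an illustration, the edited entry in one of those base model configs might look like the fragment below. This is a sketch under assumptions: the path is a placeholder for wherever the checkpoint was saved, and every other key of the original `model` dictionary is meant to stay exactly as it is in the file.

```python
# Hypothetical local path to the downloaded RedNet-50 checkpoint.
PRETRAINED_PATH = "/path/to/rednet50.pth"

model = dict(
    pretrained=PRETRAINED_PATH,
    # ... the remaining keys of the original config stay unchanged ...
)
```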

Object Detection and Instance Segmentation on COCO

Faster R-CNN

Backbone       Neck         Head         Style    Lr schd  Params(M)     FLOPs(G)       box AP      Config  Download
RedNet-50-FPN  convolution  convolution  pytorch  1x       31.6(23.9%↓)  177.9(14.1%↓)  39.5(1.8↑)  config  model | log
RedNet-50-FPN  involution   convolution  pytorch  1x       29.5(28.9%↓)  135.0(34.8%↓)  40.2(2.5↑)  config  model | log
RedNet-50-FPN  involution   involution   pytorch  1x       29.0(30.1%↓)  91.5(55.8%↓)   39.2(1.5↑)  config  model | log

Mask R-CNN

Backbone       Neck         Head         Style    Lr schd  Params(M)     FLOPs(G)       box AP      mask AP     Config  Download
RedNet-50-FPN  convolution  convolution  pytorch  1x       34.2(22.6%↓)  224.2(11.5%↓)  39.9(1.5↑)  35.7(0.6↑)  config  model | log
RedNet-50-FPN  involution   convolution  pytorch  1x       32.2(27.1%↓)  181.3(28.5%↓)  40.8(2.4↑)  36.4(1.3↑)  config  model | log
RedNet-50-FPN  involution   involution   pytorch  1x       29.5(33.3%↓)  104.6(58.7%↓)  39.6(1.2↑)  35.1(0.0↑)  config  model | log

RetinaNet

Backbone       Neck         Style    Lr schd  Params(M)     FLOPs(G)       box AP      Config  Download
RedNet-50-FPN  convolution  pytorch  1x       27.8(26.3%↓)  210.1(12.2%↓)  38.2(1.6↑)  config  model | log
RedNet-50-FPN  involution   pytorch  1x       26.3(30.2%↓)  199.9(16.5%↓)  38.2(1.6↑)  config  model | log

Semantic Segmentation on Cityscapes

Method   Backbone   Neck         Crop Size  Lr schd  Params(M)     FLOPs(G)       mIoU        Config  Download
FPN      RedNet-50  convolution  512x1024   80000    18.5(35.1%↓)  293.9(19.0%↓)  78.0(3.6↑)  config  model | log
FPN      RedNet-50  involution   512x1024   80000    16.4(42.5%↓)  205.2(43.4%↓)  79.1(4.7↑)  config  model | log
UPerNet  RedNet-50  convolution  512x1024   80000    56.4(15.1%↓)  1825.6(3.6%↓)  80.6(2.4↑)  config  model | log
