  • Stars
    287
  • Rank 144,232 (Top 3 %)
  • Language
    Python
  • License
    MIT License
  • Created about 4 years ago
  • Updated almost 2 years ago

Repository Details

[ICCV 2021] Code for approximated exponential maximum pooling

Refining activation downsampling with SoftPool


Update 10/2021:

We have extended this work in our paper: AdaPool: Exponential Adaptive Pooling for Information-Retaining Downsampling. Info, code and resources are available at alexandrosstergiou/adaPool

Abstract

Convolutional Neural Networks (CNNs) use pooling to decrease the size of activation maps. This process is crucial to increase the receptive fields and to reduce computational requirements of subsequent convolutions. An important feature of the pooling operation is the minimization of information loss, with respect to the initial activation maps, without a significant impact on the computation and memory overhead. To meet these requirements, we propose SoftPool: a fast and efficient method for exponentially weighted activation downsampling. Through experiments across a range of architectures and pooling methods, we demonstrate that SoftPool can retain more information in the reduced activation maps. This refined downsampling leads to improvements in a CNN's classification accuracy. Experiments with pooling layer substitutions on ImageNet1K show an increase in accuracy over both original architectures and other pooling methods. We also test SoftPool on video datasets for action recognition. Again, through the direct replacement of pooling layers, we observe consistent performance improvements while computational loads and memory requirements remain limited.


To appear in IEEE International Conference on Computer Vision (ICCV) 2021

[arXiv preprint]    [CVF open access]    [video presentation]

Image based pooling. Images are sub-sampled in both height and width by half.

Original
Soft Pool
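
The halving of height and width described above can be illustrated with a small pure-Python sketch. It assumes the usual reading of exponentially weighted pooling: inside each non-overlapping 2x2 window, every activation is weighted by its softmax weight and the weighted values are summed. The function name `soft_pool2d_reference` is hypothetical and this is not the repository's CUDA implementation.

```python
import math


def soft_pool2d_reference(image, kernel=2):
    """Exponentially weighted pooling over non-overlapping kernel x kernel windows.

    `image` is a list of equally long rows of floats. Rows and columns that do
    not fill a complete window are dropped (an assumption of this sketch).
    """
    out_h = len(image) // kernel
    out_w = len(image[0]) // kernel
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [
                image[i * kernel + di][j * kernel + dj]
                for di in range(kernel)
                for dj in range(kernel)
            ]
            # Subtract the window maximum before exponentiating for numerical
            # stability; the normalised weights are unchanged by this shift.
            peak = max(window)
            weights = [math.exp(a - peak) for a in window]
            total = sum(weights)
            row.append(sum(w * a for w, a in zip(weights, window)) / total)
        pooled.append(row)
    return pooled


image = [
    [0.0, 1.0, 2.0, 2.0],
    [1.0, 0.0, 2.0, 2.0],
    [5.0, 5.0, 0.0, 3.0],
    [5.0, 5.0, 3.0, 0.0],
]
pooled = soft_pool2d_reference(image)
print(len(pooled), len(pooled[0]))  # 2 2: height and width are both halved
```

Each output value lies between the window's average and its maximum, since larger activations receive larger weights.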

Video based pooling. Videos are sub-sampled in time, height and width by half.

Original
Soft Pool

Dependencies

All parts of the code assume that torch is of version 1.4 or higher. There might be instability issues on previous versions.
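
One way to guard against an older torch is to compare the version string before building. The sketch below is only an illustration: `parse_version` and `meets_minimum` are hypothetical helpers, and they work on a plain string so the check can be tried without torch installed.

```python
def parse_version(version):
    """Return (major, minor) from a string such as '1.13.1+cu117'."""
    core = version.split("+")[0]  # drop a local build tag like '+cu117'
    major, minor = core.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version, minimum=(1, 4)):
    """True if `version` is at least `minimum` (default: torch 1.4)."""
    return parse_version(version) >= minimum


print(meets_minimum("1.3.1"))         # False
print(meets_minimum("1.13.1+cu117"))  # True
```

In practice the argument would be `torch.__version__`.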

! Disclaimer: The structure of this repository is heavily influenced by Ziteng Gao's LIP repo https://github.com/sebgao/LIP

Installation

You can build the repo through the following commands:

$ git clone https://github.com/alexandrosstergiou/SoftPool.git
$ cd SoftPool/pytorch
$ make install
--- (optional) ---
$ make test

Usage

You can load any of the 1D, 2D or 3D variants after the installation with:

import softpool_cuda
from SoftPool import soft_pool1d, SoftPool1d
from SoftPool import soft_pool2d, SoftPool2d
from SoftPool import soft_pool3d, SoftPool3d
  • soft_poolxd: the functional interface for SoftPool.
  • SoftPoolxd: the class-based version, which creates an object that can be referenced later in the code.
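
The difference between the two interface styles can be sketched in pure Python for the 1D case. Everything here is hypothetical: `soft_pool1d_reference` and `SoftPool1dReference` are stand-ins written for this example, and the `kernel_size` and `stride` parameters are assumptions of the sketch rather than a statement of the library's signatures.

```python
import math


def soft_pool1d_reference(values, kernel_size=2, stride=None):
    """Functional form: pool a list of floats in a single call."""
    stride = kernel_size if stride is None else stride
    pooled = []
    for start in range(0, len(values) - kernel_size + 1, stride):
        window = values[start:start + kernel_size]
        peak = max(window)  # shift by the maximum for numerical stability
        weights = [math.exp(a - peak) for a in window]
        pooled.append(sum(w * a for w, a in zip(weights, window)) / sum(weights))
    return pooled


class SoftPool1dReference:
    """Class-based form: store the settings once, then call the object."""

    def __init__(self, kernel_size=2, stride=None):
        self.kernel_size = kernel_size
        self.stride = stride

    def __call__(self, values):
        return soft_pool1d_reference(values, self.kernel_size, self.stride)


signal = [1.0, 1.0, 0.0, 4.0, 2.0, 2.0]
pool = SoftPool1dReference(kernel_size=2)
print(soft_pool1d_reference(signal, kernel_size=2) == pool(signal))  # True
```

The functional form suits one-off calls, while the object form is convenient when the same pooling settings are reused as a layer inside a model definition.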

ImageNet models

ImageNet weights can be downloaded from the following links:

Network link
ResNet-18 link
ResNet-34 link
ResNet-50 link
ResNet-101 link
ResNet-152 link
DenseNet-121 link
DenseNet-161 link
DenseNet-169 link
ResNeXt-50_32x4d link
ResNeXt-101_32x4d link
wide-ResNet50 link

Citation

@inproceedings{stergiou2021refining,
  title={Refining activation downsampling with SoftPool},
  author={Stergiou, Alexandros and Poppe, Ronald and Kalliatakis, Grigorios},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2021},
  pages={10357-10366},
  organization={IEEE}
}

Licence

MIT

Additional resources

Ren Tianhe's pytorch-pooling repo is a great project for an overview of different pooling strategies.

More Repositories

1

adaPool

[T-IP 2023] Code for exponential adaptive pooling for PyTorch
Cuda
79
star
2

keras-DepthwiseConv3D

Keras w/ Tensorflow backend implementation for 3D channel-wise convolutions
Python
68
star
3

Squeeze-and-Recursion-Temporal-Gates

Code for: [Pattern Recognit. Lett. 2021] "Learn to cycle: Time-consistent feature discovery for action recognition" and [IJCNN 2021] "Multi-Temporal Convolutions for Human Action Recognition in Videos".
Python
67
star
4

Saliency-Tubes-Visual-Explanations-for-Spatio-Temporal-Convolutions

[ICIP 2019] Implementation of Saliency Tubes for 3D Convolutions in PyTorch and Keras to localise the spatio-temporal regions that 3D CNNs focus on.
Python
51
star
5

Inter4K

Official repository for downloading and using Inter4K video interpolation dataset
Python
25
star
6

progressive-action-prediction

[CVPR 2023] Code for action prediction from videos
Python
22
star
7

Class_Feature_Visualization_Pyramid

[ICCVW 2019] PyTorch code for Class Visualization Pyramid for interpreting spatio-temporal class-specific activations throughout the network
Python
21
star
8

Traffic-Sign-Recognition-basd-on-Synthesised-Training-Data

Using synthetic data in combination with Deep Learning, to determine if a system can be made that will be able to recognise and classify correctly real traffic signs.
Jupyter Notebook
19
star
9

Class-Agnostic-Feature-Visualisation

[ICIP 2021] PyTorch code for "The Mind's Eye: Visualizing Class-Agnostic Features of CNNs" for generation of kernel features.
Python
11
star
10

PlayItBack

[ICASSP 2023] PyTorch code for "Play It Back: Iterative Attention for Audio Recognition"
Python
9
star
11

Inception_v3_TV_Human_Interactions

Applying Transfer Learning on the Inception V3 model (weights trained on ImageNet) for the Oxford TV Human Interactions dataset. The network takes as input images extracted every 5 frames from videos.
Python
8
star
12

LAVIB

Official repository for downloading and using LAVIB
Python
8
star
13

TrajREC

[WACV 2024] Code for multitask trajectory anomaly detection
Python
7
star
14

dataset2database

Script for creating SQL databases from video files to reduce random access overhead and inodes
Python
7
star
15

Leaping-Into-Memories

[ICCV 2023] Code implementation for "Leaping Into Memories: Space-Time Deep Feature Synthesis"
Python
3
star
16

video2spectrogram

Python script for extracting audio from video files and creating Mel spectrograms
Python
2
star
17

Datasets-Visualisations-w-OpenCV

Creating grid visualisations of dataset examples with the OpenCV library
Python
1
star
18

macrawlon

A sweet little collection of handy functions for video file downloading. 📼
Python
1
star
19

Bootstrap-for-polling

📊 Use bootstrapping to normalise the data from opinion polls and make them similar to the population they represent.
Jupyter Notebook
1
star