  • Stars: 129
  • Rank 277,653 (Top 6 %)
  • Language: Python
  • License: MIT License
  • Created over 6 years ago
  • Updated over 3 years ago

Repository Details

Inflate DenseNet and ResNet as per I3D with ImageNet weight transfer

Inflated I3D models with ImageNet weight transfer in PyTorch

This repo contains several scripts that inflate 2D networks into 3D networks in PyTorch, following the technique described in the paper Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset by Joao Carreira and Andrew Zisserman.

It provides inflated versions of:

  • ResNet 50, ResNet 101, ResNet 152
  • DenseNet 121, DenseNet 161, DenseNet 169, DenseNet 201

The original (and official!) TensorFlow code inflates the Inception-v1 network and can be found here.

So far, this code supports the inflation of DenseNet and of the ResNet variants whose basic block is a Bottleneck block (ResNet 50 and deeper), along with the transfer of the 2D ImageNet weights.

The 3D network is obtained by going through the layers of the 2D network and inflating them one by one. The utilities for the inflation (which both inflate the layers and transfer the weights) are located in src/inflate.py.
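To make the per-layer inflation concrete, here is a minimal sketch of the replicate-and-scale scheme from the original I3D paper, applied to a single convolution. The helper name inflate_conv_replicate and its time_dim parameter are assumptions made for this illustration; they are not necessarily the names or signatures used in src/inflate.py.

```python
import torch
import torch.nn as nn


def inflate_conv_replicate(conv2d: nn.Conv2d, time_dim: int = 3) -> nn.Conv3d:
    """Hypothetical helper: build a Conv3d from a Conv2d (I3D-style inflation)."""
    time_pad = time_dim // 2  # keeps the number of frames unchanged for odd time_dim
    conv3d = nn.Conv3d(
        conv2d.in_channels,
        conv2d.out_channels,
        kernel_size=(time_dim, *conv2d.kernel_size),
        stride=(1, *conv2d.stride),
        padding=(time_pad, *conv2d.padding),
        dilation=(1, *conv2d.dilation),
        groups=conv2d.groups,
        bias=conv2d.bias is not None,
    )
    with torch.no_grad():
        # (out, in, kH, kW) -> (out, in, T, kH, kW): repeat along time, divide by T
        weight_3d = conv2d.weight.unsqueeze(2).repeat(1, 1, time_dim, 1, 1) / time_dim
        conv3d.weight.copy_(weight_3d)
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d


conv2d = nn.Conv2d(3, 8, kernel_size=3, padding=1)
conv3d = inflate_conv_replicate(conv2d, time_dim=3)
clip = torch.randn(1, 3, 5, 16, 16)  # (batch, channels, frames, height, width)
print(conv3d(clip).shape)  # torch.Size([1, 8, 5, 16, 16])
```

With this scheme and zero padding in time, only the frames whose temporal window lies fully inside the clip reproduce the 2D response on a replicated image; the first and last frames see padded zeros.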

Note that for the ResNet inflation, I use a centered initialization scheme as presented in Detect-and-Track: Efficient Pose Estimation in Videos. Instead of replicating the kernel along the time dimension and dividing the weights by its size (as described in the original I3D paper), I initialize the time-centered slice of the kernel to the 2D weights and the rest to 0. As a result, the 2D network applied to an image and the matching 3D network applied to a 3D input (obtained by replicating that image along the time dimension) produce the same outputs, up to numerical differences.
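The centered initialization can be sketched for a single convolution as follows. This is an illustration under assumptions, not the code from src/inflate.py: the helper name inflate_conv_centered and the choice of temporal padding time_dim // 2 are hypothetical.

```python
import torch
import torch.nn as nn


def inflate_conv_centered(conv2d: nn.Conv2d, time_dim: int = 3) -> nn.Conv3d:
    """Hypothetical helper: Conv3d whose center time slice holds the 2D weights."""
    conv3d = nn.Conv3d(
        conv2d.in_channels,
        conv2d.out_channels,
        kernel_size=(time_dim, *conv2d.kernel_size),
        stride=(1, *conv2d.stride),
        padding=(time_dim // 2, *conv2d.padding),
        dilation=(1, *conv2d.dilation),
        groups=conv2d.groups,
        bias=conv2d.bias is not None,
    )
    with torch.no_grad():
        weight_3d = torch.zeros_like(conv3d.weight)
        # Only the time-centered slice is non-zero, so each output frame
        # depends on the matching input frame alone.
        weight_3d[:, :, time_dim // 2] = conv2d.weight
        conv3d.weight.copy_(weight_3d)
        if conv2d.bias is not None:
            conv3d.bias.copy_(conv2d.bias)
    return conv3d


conv2d = nn.Conv2d(3, 8, kernel_size=3, padding=1)
conv3d = inflate_conv_centered(conv2d, time_dim=3)

image = torch.randn(2, 3, 16, 16)
video = image.unsqueeze(2).repeat(1, 1, 4, 1, 1)  # replicate the image over 4 frames

out_2d = conv2d(image)
out_3d = conv3d(video)
# Compare each output frame of the 3D layer with the 2D output.
max_diff = (out_3d - out_2d.unsqueeze(2)).abs().max().item()
print(f"max absolute difference: {max_diff:.2e}")
```

Because the off-center slices are zero, the zero padding in time does not affect the result, and every frame of the 3D output should match the 2D output up to floating-point error.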

Use it

To inflate a network and run it on a dummy dataset, comparing the final predictions of the original and inflated networks, run:

  • For ResNet 101, for instance: python inflate_resnet.py --resnet_nb 101 (available for ResNet [50|101|152])

  • For DenseNet 121: python inflate_densenet.py --densenet_nb 121 (available for DenseNet [121|161|169|201])

Profiling

Forward pass on a GeForce GTX TITAN Black (6 GB) GPU with batch size 2:

Network       Time (s)
ResNet 50     0.6
ResNet 101    0.8
ResNet 152    1.1
DenseNet 121  2.6

Forward pass on a GeForce GTX TITAN Black (6 GB) GPU with batch size 1:

Network       Time (s)
ResNet 50     0.1
ResNet 101    0.3
ResNet 152    0.5
DenseNet 121  1.3
DenseNet 161  1.8
DenseNet 169  1.5
DenseNet 201  1.7
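For readers who want to take comparable measurements on their own hardware, one possible way to time a forward pass is sketched below. It is not necessarily how the numbers above were obtained: the helper time_forward, the warm-up and run counts, the toy model and the input size are all assumptions made for this example.

```python
import time

import torch
import torch.nn as nn


def time_forward(model: nn.Module, inputs: torch.Tensor,
                 n_warmup: int = 2, n_runs: int = 5) -> float:
    """Hypothetical helper: average forward-pass time in seconds."""
    model.eval()
    on_gpu = inputs.is_cuda
    with torch.no_grad():
        for _ in range(n_warmup):  # warm-up runs are excluded from the timing
            model(inputs)
        if on_gpu:
            torch.cuda.synchronize()  # wait for queued GPU work before starting the clock
        start = time.perf_counter()
        for _ in range(n_runs):
            model(inputs)
        if on_gpu:
            torch.cuda.synchronize()  # wait for the timed GPU work to finish
        elapsed = time.perf_counter() - start
    return elapsed / n_runs


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Toy stand-in for an inflated network (assumption, not one of the repo's models).
toy_model = nn.Sequential(
    nn.Conv3d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool3d(1),
).to(device)
clip = torch.randn(2, 3, 8, 32, 32, device=device)  # batch size 2, 8 frames

seconds = time_forward(toy_model, clip)
print(f"{seconds:.4f} s per forward pass")
```

The explicit synchronization matters on GPU because CUDA kernels are launched asynchronously; without it the clock can stop before the computation has finished.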

Note

Another repo with networks pretrained on Kinetics is available here: 3D-ResNets-PyTorch. However, it does not transfer the ImageNet weights, which in my experience with Inception-v1 did improve the final results.
