  • Stars: 511
  • Rank 86,473 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created about 7 years ago
  • Updated about 4 years ago

Repository Details

Inflated I3D network with Inception backbone, weights transferred from TensorFlow

I3D models transferred from TensorFlow to PyTorch

This repo contains several scripts that transfer the weights of the TensorFlow implementation of I3D, from the paper Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset by Joao Carreira and Andrew Zisserman, to PyTorch.

The original (and official!) TensorFlow code can be found here.

The heart of the transfer is the i3d_tf_to_pt.py script.

Launch it with python i3d_tf_to_pt.py --rgb to generate the RGB weight checkpoint, pretrained from an ImageNet-inflated initialization.

To generate the flow weights, use python i3d_tf_to_pt.py --flow.

You can also generate both in one run by passing both flags: python i3d_tf_to_pt.py --rgb --flow.
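As a rough illustration of what "inflated initialization" means in the I3D paper, a 2D kernel is repeated along a new time axis and rescaled so that the summed response on a static video matches the 2D response. The sketch below uses plain nested lists and a hypothetical helper name (inflate_kernel); it is not code from this repo.

```python
def inflate_kernel(kernel_2d, time_dim):
    """Inflate a 2D kernel (list of rows) into a 3D kernel of depth time_dim.

    Each temporal slice is the 2D kernel divided by time_dim, so summing the
    slices over time gives back the original 2D kernel.
    """
    if time_dim < 1:
        raise ValueError("time_dim must be at least 1")
    return [
        [[w / time_dim for w in row] for row in kernel_2d]
        for _ in range(time_dim)
    ]


# Hypothetical 2x2 kernel, inflated to 4 time steps.
kernel_2d = [[1.0, 2.0], [3.0, 4.0]]
kernel_3d = inflate_kernel(kernel_2d, time_dim=4)

# Summing the temporal slices recovers the 2D weights.
recovered = [
    [sum(kernel_3d[t][i][j] for t in range(4)) for j in range(2)]
    for i in range(2)
]
print(recovered)  # [[1.0, 2.0], [3.0, 4.0]]
```

The rescaling by time_dim is what keeps the activations on a "boring" video (the same frame repeated) equal to those of the 2D network on that frame.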

Note that the master version requires PyTorch 0.3, as it relies on ConstantPad3d, which was added in that release.

If you want to use PyTorch 0.2, check out the branch pytorch-02, which contains a simplified model with even padding on all sides (and the corresponding PyTorch weight checkpoints). The difference is that TensorFlow's 'SAME' padding option can pad the two sides of a dimension unevenly, an effect that is reproduced on the master branch.

The simplified model on the pytorch-02 branch produces scores a bit closer to those of the original TensorFlow model on the demo sample, and is also a bit faster.
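To make the uneven padding concrete, here is a minimal sketch of how 'SAME' padding amounts are commonly computed for one dimension. The helper name same_padding_1d and the example sizes are assumptions for illustration, not taken from this repo.

```python
import math


def same_padding_1d(size, kernel, stride):
    """Return (pad_before, pad_after) for TensorFlow-style 'SAME' padding.

    The output length is ceil(size / stride); when the total padding is odd,
    the extra element goes on the trailing side.
    """
    out_size = math.ceil(size / stride)
    total = max((out_size - 1) * stride + kernel - size, 0)
    before = total // 2
    after = total - before
    return before, after


# Kernel 7, stride 2 on a 224-wide input: the two sides differ.
print(same_padding_1d(224, 7, 2))  # (2, 3)
# Kernel 3, stride 1: padding is symmetric.
print(same_padding_1d(224, 3, 1))  # (1, 1)
```

A fixed symmetric padding cannot express the (2, 3) case, which is presumably why an explicit padding layer such as ConstantPad3d, fed with per-side amounts, is needed on the master branch.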

Demo

There is a slight drift in the weights that impacts the predictions. However, it seems to affect the final predictions only marginally, so the converted weights should serve as a valid initialization for further finetuning.

This can be observed by evaluating the same sample as the original implementation.

For a demo, launch python i3d_pt_demo.py --rgb --flow. This script will print the scores produced by the PyTorch model.

PyTorch Flow + RGB predictions (probability, logit, class):

1.0          44.53513 playing cricket
1.432034e-09 24.17096 hurling (sport)
4.385328e-10 22.98754 catching or throwing baseball
1.675852e-10 22.02560 catching or throwing softball
1.113020e-10 21.61636 hitting baseball
9.361596e-12 19.14072 playing tennis

TensorFlow Flow + RGB predictions (probability, logit, class):

1.0         41.8137 playing cricket
1.49717e-09 21.4943 hurling sport
3.84311e-10 20.1341 catching or throwing baseball
1.54923e-10 19.2256 catching or throwing softball
1.13601e-10 18.9153 hitting baseball
8.80112e-11 18.6601 playing tennis
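The two numeric columns above can be cross-checked against each other. Assuming the first column is a softmax probability and the second the raw logit, and given that the top class has probability close to 1, each remaining probability should be roughly exp(logit - top_logit). A small sketch using rows copied from the PyTorch Flow + RGB listing:

```python
import math

# (probability, logit, label) rows copied from the PyTorch Flow + RGB listing.
rows = [
    (1.0, 44.53513, "playing cricket"),
    (1.432034e-09, 24.17096, "hurling (sport)"),
    (4.385328e-10, 22.98754, "catching or throwing baseball"),
]

top_logit = rows[0][1]
# Ratio of each class's unnormalized score to the top class's score.
ratios = [math.exp(logit - top_logit) for _, logit, _ in rows]

for (prob, _, label), ratio in zip(rows, ratios):
    print(f"{label}: reported {prob:.3e}, exp(logit gap) {ratio:.3e}")
```

This only approximates the full softmax (the listing shows a handful of classes, not all of them), but it is enough to see that the logit gap, not the absolute logit value, determines the reported probabilities.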

PyTorch RGB predictions:

[playing cricket]: 9.999987E-01
[playing kickball]: 4.187616E-07
[catching or throwing baseball]: 3.255321E-07
[catching or throwing softball]: 1.335190E-07
[shooting goal (soccer)]: 8.081449E-08

TensorFlow RGB predictions:

[playing cricket]: 0.999997
[playing kickball]: 1.33535e-06
[catching or throwing baseball]: 4.55313e-07
[shooting goal (soccer)]: 3.14343e-07
[catching or throwing softball]: 1.92433e-07

PyTorch Flow predictions:

[playing cricket]: 9.365287E-01
[hurling (sport)]: 5.201872E-02
[playing squash or racquetball]: 3.165054E-03
[playing tennis]: 2.550464E-03
[hitting baseball]: 1.729896E-03

TensorFlow Flow predictions:

[playing cricket]: 0.928604
[hurling (sport)]: 0.0406825
[playing tennis]: 0.00415417
[playing squash or racquetbal]: 0.00247407
[hitting baseball]: 0.00138002
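One way to quantify the drift on this sample is to compare the two flow listings class by class. The sketch below copies the values from the listings above; the variable names are illustrative only.

```python
pytorch_flow = {
    "playing cricket": 9.365287e-01,
    "hurling (sport)": 5.201872e-02,
    "playing squash or racquetball": 3.165054e-03,
    "playing tennis": 2.550464e-03,
    "hitting baseball": 1.729896e-03,
}
tensorflow_flow = {
    "playing cricket": 0.928604,
    "hurling (sport)": 0.0406825,
    "playing tennis": 0.00415417,
    # Spelled "racquetbal" in the listing above; keyed here to match PyTorch.
    "playing squash or racquetball": 0.00247407,
    "hitting baseball": 0.00138002,
}

# Absolute probability difference for every class present in both listings.
diffs = {
    label: abs(pytorch_flow[label] - tensorflow_flow[label])
    for label in pytorch_flow
}
worst = max(diffs, key=diffs.get)
same_top1 = (
    max(pytorch_flow, key=pytorch_flow.get)
    == max(tensorflow_flow, key=tensorflow_flow.get)
)

print(f"top-1 class agrees: {same_top1}")
print(f"largest gap: {worst} ({diffs[worst]:.4f})")
```

On these five classes the top-1 prediction is the same in both frameworks and the largest gap is about 0.011, although the order of the third and fourth classes is swapped.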

Time profiling

To time the forward and backward passes, you can install kernprof, an efficient line profiler, and then launch

kernprof -lv i3d_pt_profiling.py --frame_nb 16

This launches a basic PyTorch training script on a dummy dataset that consists of replicated images as spatio-temporal inputs.

On my GeForce GTX TITAN Black (6 GB), a forward+backward pass takes roughly 0.25-0.3 seconds.
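For readers unfamiliar with kernprof: when a script is run with kernprof -l, a profile decorator is made available as a builtin, and only decorated functions are profiled line by line. The sketch below shows that pattern with a no-op fallback so the file also runs under plain python. The training_step function is a hypothetical stand-in, not the contents of i3d_pt_profiling.py.

```python
import time

try:
    profile  # defined as a builtin when run via `kernprof -l script.py`
except NameError:
    def profile(func):
        """No-op fallback so the script also runs without kernprof."""
        return func


@profile
def training_step(batch):
    """Hypothetical stand-in for one forward+backward pass."""
    total = 0.0
    for value in batch:
        total += value * value  # placeholder computation
    return total


def time_steps(step_fn, batch, repeats=5):
    """Return the average wall-clock seconds per call of step_fn."""
    start = time.perf_counter()
    for _ in range(repeats):
        step_fn(batch)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    dummy_batch = [0.5] * 1000
    avg = time_steps(training_step, dummy_batch)
    print(f"average step time: {avg:.6f} s")
```

When timing real GPU work this way, keep in mind that CUDA calls are asynchronous, so wall-clock numbers are only meaningful if the device is synchronized before reading the clock.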

Some visualizations

Visualization of the weights and matching activations for the first convolutions

RGB

rgb_sample

Weights

rgb_weights

Activations

rgb_activations

Flow

flow_sample

Weights

flow_weights

Activations

flow_activations
