Object Level Visual Reasoning in Videos
This repository contains a PyTorch implementation of "Object Level Visual Reasoning in Videos" by F. Baradel, N. Neverova, C. Wolf, J. Mille, and G. Mori, published at ECCV 2018.
Links: Project page | Camera-ready | Complementary Mask Data
Code
We release the code for training and testing our model. We encourage you to follow the steps below:
- preprocessing the video dataset
  - rescaling an entire dataset (WxH=256x256 and fps=30)
- testing the dataloader
  - efficient video decoding on the fly
- training/testing the model
  - training procedure using precomputed masks
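As an illustration of the rescaling step above, here is a minimal sketch that resizes every video to 256x256 at 30 fps by calling the `ffmpeg` command-line tool. It is not the repository's own preprocessing script: the function names, the `*.mp4` pattern, and the directory layout are assumptions made for this example, and it requires `ffmpeg` to be available on your `PATH`.

```python
import subprocess
from pathlib import Path


def build_rescale_command(src, dst, width=256, height=256, fps=30):
    """Return an ffmpeg argument list that rescales `src` and resamples it to `fps`.

    Hypothetical helper for illustration; not part of this repository.
    """
    return [
        "ffmpeg",
        "-y",                              # overwrite dst if it already exists
        "-i", str(src),                    # input video
        "-vf", f"scale={width}:{height}",  # force WxH (aspect ratio is not preserved)
        "-r", str(fps),                    # output frame rate
        str(dst),
    ]


def rescale_dataset(src_dir, dst_dir, pattern="*.mp4"):
    """Rescale every video matching `pattern` under src_dir, mirroring the tree in dst_dir."""
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    for src in sorted(src_dir.rglob(pattern)):
        dst = dst_dir / src.relative_to(src_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if ffmpeg fails on a file
        subprocess.run(build_rescale_command(src, dst), check=True)
```

Calling `rescale_dataset("videos_raw", "videos_256x256_30")` (both directory names are placeholders) would write a rescaled copy of each video while leaving the originals untouched.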
Masks
Please visit the following website to download the mask predictions.
Requirements
- PyTorch 0.4.0
- numpy
- lintel - make sure that you have already installed this library (important for decoding videos on the fly)
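To confirm that the requirements above are installed before running anything, a small check like the following can help. It is only a sketch: `missing_packages` is a hypothetical helper, and the import names `torch`, `numpy`, and `lintel` are assumed to match the installed packages. It checks that each package can be found, not which version is installed.

```python
import importlib.util


def missing_packages(names=("torch", "numpy", "lintel")):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```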
Citation
If you find this paper or our implementation useful for your research or if you use the precomputed masks, please cite our paper.
@InProceedings{Baradel_2018_ECCV,
author = {Baradel, Fabien and Neverova, Natalia and Wolf, Christian and Mille, Julien and Mori, Greg},
title = {Object Level Visual Reasoning in Videos},
booktitle = {ECCV},
year = {2018}
}
Acknowledgements
This work was funded by grant Deepvision (ANR-15-CE23-0029, STPGP-479356-15), a joint French/Canadian call by ANR & NSERC.
Licence
MIT License