  • Stars: 513
  • Rank: 86,178 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created over 4 years ago
  • Updated 2 months ago

Repository Details

Extract video features from raw videos using multiple GPUs. We support RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.

Video Features

video_features allows you to extract features from video clips. It supports a variety of extractors and modalities, i.e. visual appearance, optical flow, and audio. See more details in Documentation.

Supported Models

Action Recognition

  • S3D
  • I3D
  • R(2+1)D

Sound Recognition

  • VGGish

Optical Flow

  • RAFT

Frame-wise Features

  • CLIP
  • TIMM models

Quick Start

Open the project in Colab, or run it locally with conda:

# clone the repo and change the working directory
git clone https://github.com/v-iashin/video_features.git
cd video_features

# install environment
conda env create -f conda_env_torch_zoo.yml

# load the environment
conda activate torch_zoo

# extract r(2+1)d features for the sample videos
python main.py \
    feature_type=r21d \
    device="cuda:0" \
    video_paths="[./sample/v_ZNVhz7ctTq0.mp4, ./sample/v_GGSY1Qvo990.mp4]"

# if you have many GPUs, just run this command from another terminal with another device
# device can also be "cpu"

If you are more comfortable with Docker, there is a Docker image with a pre-installed environment that supports all models. Check out the Docker support documentation page.

Multi-GPU and Multi-Node Setups

With video_features, it is easy to parallelize feature extraction among many GPUs. It is enough to start the script in another terminal with another GPU (or even the same one) pointing to the same output folder and input video paths. The script will check if the features already exist and skip them. It will also try to load the feature file to check if it is corrupted (i.e. not openable). This approach allows you to continue feature extraction if the previous script failed for some reason.

If you have access to a GPU cluster with shared disk space, you can scale extraction to as many GPUs as are available by creating several single-GPU jobs with the same command.
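The one-job-per-GPU pattern can also be scripted. The sketch below is a hypothetical launcher and is not part of video_features: the helper names `build_commands` and `launch` and the use of `subprocess` are assumptions. Only the `main.py` arguments (`feature_type`, `device`, `video_paths`) are taken from the Quick Start command. Every job receives the same video list, relying on the script to skip features that already exist.

```python
import subprocess


def build_commands(video_paths, devices, feature_type="r21d"):
    """Build one `python main.py ...` command per device (hypothetical helper).

    Every command receives the same video list; the extraction script is
    expected to skip videos whose features already exist.
    """
    paths_arg = "video_paths=[" + ", ".join(video_paths) + "]"
    return [
        ["python", "main.py", f"feature_type={feature_type}", f"device={device}", paths_arg]
        for device in devices
    ]


def launch(commands):
    """Start all jobs in parallel and wait for each one; returns the exit codes."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [proc.wait() for proc in procs]


if __name__ == "__main__":
    videos = ["./sample/v_ZNVhz7ctTq0.mp4", "./sample/v_GGSY1Qvo990.mp4"]
    # Print the commands first; call launch(...) from inside the repo folder to run them.
    for cmd in build_commands(videos, ["cuda:0", "cuda:1"]):
        print(" ".join(cmd))
```

Because the commands are passed as argument lists rather than through a shell, the quotes used in the Quick Start command are not needed here.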

Since the list of input files is shuffled each time the script is run, workers are unlikely to process the same video at the same time. On the rare occasion that a collision does happen, the script simply overwrites the previously extracted features.
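The shuffle-and-skip behaviour described in this section can be illustrated with a short sketch. This is not the project's actual code: the function names, the `extract_fn` callable, and the pickle file format are all assumptions made for illustration. It shows only the general idea of walking the inputs in random order, skipping outputs that already load cleanly, and overwriting on a collision.

```python
import pickle
import random
from pathlib import Path


def is_valid_feature_file(path):
    """Return True if the file exists and can be loaded (i.e. is not corrupted)."""
    try:
        with open(path, "rb") as f:
            pickle.load(f)
        return True
    except Exception:
        return False


def extract_all(video_paths, output_dir, extract_fn, seed=None):
    """Process videos in random order, skipping those with valid saved features.

    `extract_fn` is a hypothetical callable: video path -> picklable features.
    Returns the list of videos that this worker actually processed.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    order = list(video_paths)
    random.Random(seed).shuffle(order)  # each worker walks the list in its own order
    processed = []
    for video in order:
        out_path = output_dir / (Path(video).stem + ".pkl")
        if is_valid_feature_file(out_path):
            continue  # another worker (or an earlier run) already handled this video
        features = extract_fn(video)
        with open(out_path, "wb") as f:
            pickle.dump(features, f)  # on a collision this simply overwrites the file
        processed.append(video)
    return processed
```

Rerunning `extract_all` after a crash resumes where the previous run stopped, because finished videos are skipped and a half-written (unloadable) file is treated as missing.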

Used in

Please let me know if you found this repo useful for your projects or papers.

Acknowledgements

  • @Kamino666: added CLIP model as well as Windows and CPU support (and many other small things).
  • @borijang: for solving bugs with file names, I3D checkpoint loading enhancement and code style improvements.
  • @ohjho: added support of 37-layer R(2+1)d flavors.
