Animating Arbitrary Objects via Deep Motion Transfer

This repository contains the source code for the CVPR oral paper Animating Arbitrary Objects via Deep Motion Transfer by Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci and Nicu Sebe. We call the proposed deep framework Monkey-Net, as it enables motion transfer by considering MOviNg KEYpoints. Check also the project website.

A new version of the method can be found here.

Examples of motion transfer

The videos on the left are the driving videos. The first row on the right for each dataset shows the source images. The bottom row contains the animated sequences, with the motion transferred from the driving video and the object taken from the source image. We trained a separate network for each task. Note that for each task the background and the object appearance remain consistent in every generated video.

NEMO Face Dataset

Taichi Dataset

BAIR Robot Dataset

MGIF Dataset

Training and testing

Our framework can be used in several modes. In motion transfer mode, a static image is animated using a driving video. In image-to-video translation mode, given a static image, the framework predicts future frames.

Installation

We support Python 3. To install the dependencies, run:

pip install -r requirements.txt

YAML configs

There are several configuration files (config/dataset_name.yaml), one for each dataset. See config/actions.yaml for a description of each parameter.
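
For a quick look at what a config contains, you can load it with PyYAML. This is only an illustrative snippet (the moving-gif config is used as an example); config/actions.yaml remains the reference for what each parameter means.

import yaml

# Illustrative only: inspect a dataset config with PyYAML.
with open('config/moving-gif.yaml') as f:
    config = yaml.safe_load(f)

print(sorted(config))          # top-level parameter groups
print(config['train_params'])  # e.g. the number of epochs used for training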

Motion Transfer Demo

To run a demo, download a checkpoint and run the following command:

python demo.py --config  config/moving-gif.yaml --driving_video sup-mat/driving.png --source_image sup-mat/source.png --checkpoint path/to/checkpoint

The result will be stored in demo.gif.

Training

To train a model on a specific dataset, run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml

The code will create a folder in the log directory (each run creates a new time-stamped directory). Checkpoints will be saved to this folder. To check the loss values during training, see log.txt. You can also check the training-data reconstructions in the train-vis subfolder.

Reconstruction

To evaluate the reconstruction performance run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode reconstruction --checkpoint path/to/checkpoint

You will need to specify the path to the checkpoint. The reconstruction subfolder will be created in the checkpoint folder. You can find the generated videos there, along with loss-less versions in '.png' format in the png subfolder.

Motion transfer

In order to perform motion transfer run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode transfer --checkpoint path/to/checkpoint

You will need to specify the path to the checkpoint. The transfer subfolder will be created in the same folder as the checkpoint. You can find the generated videos there and their loss-less versions in the png subfolder.

There are 2 different ways of performing transfer: by using absolute keypoint locations or by using relative keypoint locations.

  1. Absolute Transfer: the transfer is performed using the absolute positions of the driving video and the appearance of the source image. In this way there are no specific requirements for the driving video and the source appearance. However, this usually leads to poor performance, since irrelevant details such as shape are transferred. Check the transfer parameters in shapes.yaml to enable this mode.

  2. Relative Transfer: from the driving video we first estimate the relative movement of each keypoint, then we add this movement to the absolute positions of the keypoints in the source image. These keypoints, together with the source image, are used for transfer. This usually leads to better performance, but it requires that the object in the first frame of the driving video and in the source image have the same pose (see the sketch below).

Approximately aligned pairs of videos are given in the data folder (e.g. data/taichi.csv).
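
The core idea behind relative transfer can be summarized in a few lines. The sketch below is purely illustrative (the function and variable names are hypothetical, not the repository's actual code): the source keypoints are shifted by the displacement of the driving keypoints with respect to the first driving frame.

import numpy as np

def relative_keypoint_transfer(kp_source, kp_driving, kp_driving_initial):
    # Shift the source keypoints by the displacement of the driving
    # keypoints relative to the first driving frame.
    return kp_source + (kp_driving - kp_driving_initial)

# Hypothetical usage with (num_keypoints, 2) arrays of (x, y) coordinates:
kp_source = np.array([[0.2, 0.3], [0.5, 0.5]])
kp_driving_initial = np.array([[0.1, 0.3], [0.4, 0.5]])
kp_driving = np.array([[0.15, 0.35], [0.45, 0.55]])
print(relative_keypoint_transfer(kp_source, kp_driving, kp_driving_initial))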

Image-to-video translation

In order to perform image-to-video translation run:

CUDA_VISIBLE_DEVICES=0 python run.py --config config/dataset_name.yaml --mode prediction --checkpoint path/to/checkpoint

The following steps will be performed:

  • Estimate the keypoints from the training set
  • Train an RNN to predict the keypoints (see the sketch after this list)
  • Run the predictor for each video in the dataset, starting from the first frame. Again, the prediction subfolder will be created in the same folder as the checkpoint. You can find the generated videos there, with loss-less versions in the png subfolder.
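
As a rough illustration of the keypoint-prediction step, the sketch below is not the repository's actual model; it only shows the general shape of an RNN that maps past keypoint coordinates to future ones (all names and sizes are hypothetical).

import torch
import torch.nn as nn

class KeypointPredictor(nn.Module):
    # Illustrative GRU that maps a sequence of flattened (x, y) keypoint
    # coordinates to the coordinates of the next step.
    def __init__(self, num_kp=10, hidden_size=256):
        super().__init__()
        self.gru = nn.GRU(input_size=num_kp * 2, hidden_size=hidden_size,
                          batch_first=True)
        self.head = nn.Linear(hidden_size, num_kp * 2)

    def forward(self, kp_sequence):
        # kp_sequence: (batch, time, num_kp * 2)
        hidden, _ = self.gru(kp_sequence)
        return self.head(hidden)  # predicted keypoints for every step

# Hypothetical usage: autoregressively predict 8 future frames of keypoints.
model = KeypointPredictor()
past = torch.randn(1, 12, 20)             # 12 observed frames, 10 keypoints
for _ in range(8):
    next_kp = model(past)[:, -1:, :]       # last prediction = next frame
    past = torch.cat([past, next_kp], dim=1)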

Datasets

  1. Shapes. This dataset is saved along with the repository. Training takes about 1 hour.

  2. Actions. This dataset is also saved along with the repository. Training takes about 4 hours.

  3. Nemo. The preprocessed version of this dataset can be downloaded. Training takes about 6 hours.

  4. Taichi. We used the same data as MoCoGAN. Training takes about 15 hours.

  5. Bair. The preprocessed version of this dataset can be downloaded. Training takes about 4 hours.

  6. MGif. The preprocessed version of this dataset can be downloaded. Check for details on this dataset. Training takes about 8 hours on 2 GPUs.

  7. Vox. The dataset can be downloaded and preprocessed using a script: cd data; ./get_vox.sh.

Training on your own dataset

  1. Resize all the videos to the same size, e.g. 128x128. The videos can be in '.gif' or '.mp4' format, but we recommend converting them to stacked '.png' images (see data/shapes), because this format is lossless. A preprocessing sketch is given after this list.

  2. Create a folder data/dataset_name with 2 subfolders, train and test; put the training videos in train and the testing videos in test.

  3. Create a config config/dataset_name.yaml (it is better to start from one of the existing configs: config/nemo.yaml for 64x64 videos, config/moving-gif.yaml for 128x128, config/vox.yaml for 256x256). In dataset_params, specify the root directory as root_dir: data/dataset_name. Also adjust the number of epochs in train_params.
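
Below is an illustrative preprocessing helper for step 1. It is not part of the repository, and the function name, arguments, and exact stacking axis are assumptions; check data/shapes for the layout the data loader actually expects.

import imageio
import numpy as np
from skimage.transform import resize

def video_to_stacked_png(video_path, out_path, frame_size=(128, 128)):
    # Read a '.gif' or '.mp4' video, resize every frame, and save the
    # frames stacked along one spatial axis as a single lossless '.png'.
    frames = [resize(frame, frame_size, anti_aliasing=True)
              for frame in imageio.get_reader(video_path)]
    stacked = np.concatenate(frames, axis=1)        # frames side by side
    imageio.imwrite(out_path, (stacked * 255).astype(np.uint8))

# Hypothetical usage:
# video_to_stacked_png('raw/clip.mp4', 'data/dataset_name/train/clip.png')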

Additional notes

Citation:

@InProceedings{Siarohin_2019_CVPR,
  author={Siarohin, Aliaksandr and Lathuilière, Stéphane and Tulyakov, Sergey and Ricci, Elisa and Sebe, Nicu},
  title={Animating Arbitrary Objects via Deep Motion Transfer},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2019}
}
