AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
This repository contains the official implementation of the following paper:
AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation
Zhen Li*, Zuo-Liang Zhu*, Ling-Hao Han, Qibin Hou, Chun-Le Guo, Ming-Ming Cheng
(* denotes equal contribution)
Nankai University
In CVPR 2023
[Paper] [Project Page] [Web demos] [Video]
AMT is a lightweight, fast, and accurate algorithm for frame interpolation. It aims to provide a practical solution for generating video from a few given frames (at least two).
- More examples can be found on our project page.
Web demos
Integrated into Hugging Face Spaces 🤗 using Gradio. Try out the web demo to interpolate between two or more images.
Change Log
- Apr 20, 2023: Our code is publicly available.
Method Overview
For technical details, please refer to the method.md file, or read the full report on arXiv.
Dependencies and Installation
- Clone Repo

  git clone https://github.com/MCG-NKU/AMT.git

- Create Conda Environment and Install Dependencies

  conda env create -f environment.yaml
  conda activate amt

- Download the pretrained models for the demos from Pretrained Models below and place them in the pretrained folder.
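Optionally, you can sanity-check the environment and a downloaded checkpoint with a short Python snippet. This is a minimal sketch, not part of the repository: it assumes you placed amt-s.pth in pretrained/, and the exact checkpoint layout may differ.

  import torch  # installed via environment.yaml

  # Load the checkpoint on CPU just to confirm the file is readable.
  ckpt = torch.load('pretrained/amt-s.pth', map_location='cpu')
  print(type(ckpt))
  print(list(ckpt.keys()) if isinstance(ckpt, dict) else 'non-dict checkpoint')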
Quick Demo
Note that the selected pretrained model ([CKPT_PATH]) needs to match the config file ([CFG]).
When creating a video demo, increasing $n$ will slow down the motion in the video. (With $m$ input frames, [N_ITER] $= n$ corresponds to $2^n \times (m-1) + 1$ output frames; see the sketch after the example commands below.)
python demos/demo_2x.py -c [CFG] -p [CKPT] -n [N_ITER] -i [INPUT] -o [OUT_PATH] -r [FRAME_RATE]
# e.g. for [INPUT]:
# -i can be a video / a regular expression / multiple images / a folder containing input frames
# -i demo.mp4 (video) / img_*.png (regular expression) / img0.png img1.png (images) / demo_input (folder)
# e.g. a simple usage
python demos/demo_2x.py -c cfgs/AMT-S.yaml -p pretrained/amt-s.pth -n 6 -i assets/quick_demo/img0.png assets/quick_demo/img1.png
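To make the frame-count relation above concrete, here is a minimal sketch (not the repository's actual implementation; interpolate_midpoint is a hypothetical stand-in for whatever model call produces the middle frame). Each iteration inserts one new frame between every adjacent pair, so $m$ frames become $2m - 1$, and $n$ iterations yield $2^n \times (m-1) + 1$ frames.

  # Sketch of the 2x recursion behind [N_ITER]; `interpolate_midpoint` is a
  # hypothetical placeholder for the model call that returns the middle frame.
  def expand_frames(frames, n_iter, interpolate_midpoint):
      for _ in range(n_iter):
          expanded = []
          for a, b in zip(frames[:-1], frames[1:]):
              expanded += [a, interpolate_midpoint(a, b)]
          expanded.append(frames[-1])
          frames = expanded
      return frames  # len(frames) == 2**n_iter * (m - 1) + 1 for m input frames

  # e.g. m = 2 input frames and -n 6 -> 2**6 * (2 - 1) + 1 = 65 output frames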
- Note: Please enable --save_images to save the output images (saving will be slower when there are many output frames).
- Supported input types: a video / a regular expression / multiple images / a folder containing input frames.
- Results are saved in the [OUT_PATH] folder (default: results/2x).
Pretrained Models
Model | Download | Config file | Trained on | Arbitrary/Fixed |
---|---|---|---|---|
AMT-S | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-S] | Vimeo90k | Fixed |
AMT-L | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-L] | Vimeo90k | Fixed |
AMT-G | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-G] | Vimeo90k | Fixed |
AMT-S | [Google Drive] [Baidu Cloud] [Hugging Face] | [cfgs/AMT-S_gopro] | GoPro | Arbitrary |
Training and Evaluation
Please refer to develop.md to learn how to benchmark AMT and how to train a new AMT model from scratch.
Citation
If you find our repo useful for your research, please consider citing our paper:
@inproceedings{licvpr23amt,
title={AMT: All-Pairs Multi-Field Transforms for Efficient Frame Interpolation},
author={Li, Zhen and Zhu, Zuo-Liang and Han, Ling-Hao and Hou, Qibin and Guo, Chun-Le and Cheng, Ming-Ming},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2023}
}
License
This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International license for non-commercial use only. Please note that any commercial use of this code requires formal permission prior to use.
Contact
For technical questions, please contact zhenli1031[AT]gmail.com and nkuzhuzl[AT]gmail.com.
For commercial licensing, please contact cmm[AT]nankai.edu.cn.
Acknowledgement
We thank Jia-Wen Xiao, Zheng-Peng Duan, Rui-Qi Wu, and Xin Jin for proofreading. We thank Zhewei Huang for his suggestions.
Here are some great resources we benefit from:
- IFRNet and RIFE for data processing, benchmarking, and loss designs.
- RAFT, M2M-VFI, and GMFlow for inspirations.
- FILM for Web demo reference.
If you develop or use AMT in your projects, please let us know, and we will list your project in this repository.
We also thank all of our contributors.