  • Stars: 325
  • Rank: 129,350 (Top 3%)
  • Language: Python
  • License: Other
  • Created: over 2 years ago
  • Updated: over 1 year ago

Repository Details

Codes for "AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos"

AnimeSR (NeurIPS 2022)

📖 AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos

arXiv
Yanze Wu, Xintao Wang, Gen Li, Ying Shan
Tencent ARC Lab; Platform Technologies, Tencent Online Video

🚩 Updates

  • 2022.11.28: Release codes & models.
  • 2022.08.29: Release AVC-Train and AVC-Test.

Web Demo and API

Replicate

Video Demos

AnimeSR_video_demo_tom.jerry.mp4
AnimeSR_video_demo_timon.pubaa.mp4

🔧 Dependencies and Installation

Installation

  1. Clone repo

    git clone https://github.com/TencentARC/AnimeSR.git
    cd AnimeSR
  2. Install

    # Install dependent packages
    pip install -r requirements.txt
    
    # Install AnimeSR
    python setup.py develop
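
After the editable install, you can sanity-check the setup by printing the help text of one of the inference scripts described below (this assumes the scripts expose the standard argparse --help flag):

    python scripts/inference_animesr_frames.py --help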

⚡ Quick Inference

Download the pre-trained AnimeSR models [Google Drive] and put them into the weights folder. Currently, the available pre-trained models are:

  • AnimeSR_v1-PaperModel.pth: v1 model, also the paper model. You can use this model to reproduce the paper results.
  • AnimeSR_v2.pth: v2 model. Compared with v1, this version produces more natural results with fewer artifacts and better texture/background restoration. If you want better results, use this model.

AnimeSR supports both frames and videos as input for inference. We provide several sample test cases on Google Drive; you can download them and put them into the inputs folder.
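
After downloading, the repository root should look roughly like the sketch below; the inputs sub-folder shown is just the sample test case used in the commands that follow:

    AnimeSR/
    ├── weights/
    │   ├── AnimeSR_v1-PaperModel.pth
    │   └── AnimeSR_v2.pth
    └── inputs/
        └── tom_and_jerry/   # extracted sample frames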

Inference on Frames

python scripts/inference_animesr_frames.py -i inputs/tom_and_jerry -n AnimeSR_v2 --expname animesr_v2 --save_video_too --fps 20
Usage:
  -i --input           Input frames folder/root. Supports a first-level dir (i.e., input/*.png) and a second-level dir (i.e., input/*/*.png)
  -n --model_name      AnimeSR model name. Default: AnimeSR_v2, can also be AnimeSR_v1-PaperModel
  -s --outscale        The netscale is x4, but you can achieve an arbitrary output scale (e.g., x2 or x1) with the argument outscale.
                       The program will further perform a cheap resize operation after the AnimeSR output. Default: 4
  -o --output          Output root. Default: results
  --expname            Identify the name of your current inference. The outputs will be saved in $output/$expname
  --save_video_too     Save the output frames to video. Default: off
  --fps                The fps of the (possibly) saved videos. Default: 24

After running the above command, you will get the SR frames in results/animesr_v2/frames and the SR video in results/animesr_v2/videos.
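
For example, to reproduce the paper results at x2 output scale on the same sample frames, a hedged variant of the command above (the experiment name animesr_v1_x2 is illustrative):

    python scripts/inference_animesr_frames.py -i inputs/tom_and_jerry -n AnimeSR_v1-PaperModel -s 2 --expname animesr_v1_x2 --save_video_too --fps 20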

Inference on Video

# single gpu and single process inference
CUDA_VISIBLE_DEVICES=0 python scripts/inference_animesr_video.py -i inputs/TheMonkeyKing1965.mp4 -n AnimeSR_v2 -s 4 --expname animesr_v2 --num_process_per_gpu 1 --suffix 1gpu1process
# single gpu and multi process inference (you can use multi-processing to improve GPU utilization)
CUDA_VISIBLE_DEVICES=0 python scripts/inference_animesr_video.py -i inputs/TheMonkeyKing1965.mp4 -n AnimeSR_v2 -s 4 --expname animesr_v2 --num_process_per_gpu 3 --suffix 1gpu3process
# multi gpu and multi process inference
CUDA_VISIBLE_DEVICES=0,1 python scripts/inference_animesr_video.py -i inputs/TheMonkeyKing1965.mp4 -n AnimeSR_v2 -s 4 --expname animesr_v2 --num_process_per_gpu 3 --suffix 2gpu6process
Usage:
  -i --input            Input video path or extracted frames folder
  -n --model_name       AnimeSR model name. Default: AnimeSR_v2, can also be AnimeSR_v1-PaperModel
  -s --outscale         The netscale is x4, but you can achieve an arbitrary output scale (e.g., x2 or x1) with the argument outscale.
                        The program will further perform a cheap resize operation after the AnimeSR output. Default: 4
  -o --output           Output root. Default: results
  --expname             Identify the name of your current inference. The outputs will be saved in $output/$expname
  --fps                 The fps of the (possibly) saved videos. Default: None
  --extract_frame_first If the input is a video, you can still extract the frames first; otherwise AnimeSR will read from the stream
  --num_process_per_gpu Since slow I/O keeps GPU utilization low, we recommend placing multiple processes on one GPU as long as the
                        video memory is sufficient. The total number of processes is num_process_per_gpu * num_gpu
  --suffix              You can add a suffix string to the SR video name, e.g., 1gpu3processx2, which means the SR video is generated with one GPU, three processes, and an outscale of x2
  --half                Use half precision for inference; it won't have a big impact on the visual results
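
A hedged variant that exercises the remaining flags (the suffix string is illustrative): extract the frames to disk first and run in half precision with one GPU, two processes, and x2 outscale:

    CUDA_VISIBLE_DEVICES=0 python scripts/inference_animesr_video.py -i inputs/TheMonkeyKing1965.mp4 -n AnimeSR_v2 -s 2 --expname animesr_v2 --num_process_per_gpu 2 --extract_frame_first --half --suffix 1gpu2processx2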

SR videos are saved in the results/animesr_v2/videos/$video_name folder.

If you are looking for portable executable files, you can try our realesr-animevideov3 model, which shares similar technology with AnimeSR.

💻 Training

See Training.md

Request for AVC-Dataset

  1. Download and carefully read the LICENSE AGREEMENT PDF file.
  2. If you understand, acknowledge, and agree to all the terms specified in the LICENSE AGREEMENT, please email [email protected] with the LICENSE AGREEMENT PDF file, your name, and institution. We will keep the license and send the download link of the AVC dataset to you.

Acknowledgement

This project is built based on BasicSR.

Citation

If you find this project useful for your research, please consider citing our paper:

@InProceedings{wu2022animesr,
  author={Wu, Yanze and Wang, Xintao and Li, Gen and Shan, Ying},
  title={AnimeSR: Learning Real-World Super-Resolution Models for Animation Videos},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

📧 Contact

If you have any questions, please email [email protected].

More Repositories

  1. GFPGAN (Python, 35,397 stars): GFPGAN aims at developing Practical Algorithms for Real-world Face Restoration.
  2. PhotoMaker (Jupyter Notebook, 9,198 stars): PhotoMaker [CVPR 2024]
  3. T2I-Adapter (Python, 3,377 stars): T2I-Adapter
  4. InstantMesh (Python, 2,928 stars): InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
  5. BrushNet (Python, 1,298 stars): [ECCV 2024] The official implementation of the paper "BrushNet: A Plug-and-Play Image Inpainting Model with Decomposed Dual-Branch Diffusion"
  6. MotionCtrl (Python, 1,247 stars): Official code for MotionCtrl [SIGGRAPH 2024]
  7. MasaCtrl (Python, 699 stars): [ICCV 2023] Consistent Image Synthesis and Editing
  8. SEED-Story (Python, 657 stars): SEED-Story: Multimodal Long Story Generation with Large Language Model
  9. LLaMA-Pro (Python, 459 stars): [ACL 2024] Progressive LLaMA with Block Expansion
  10. Mix-of-Show (Python, 383 stars): NeurIPS 2023, Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
  11. Open-MAGVIT2 (Python, 376 stars): Open-MAGVIT2: Democratizing Autoregressive Visual Generation
  12. VQFR (Python, 320 stars): ECCV 2022, Oral, VQFR: Blind Face Restoration with Vector-Quantized Dictionary and Parallel Decoder
  13. CustomNet (Python, 258 stars)
  14. SmartEdit (Python, 214 stars): Official code of SmartEdit [CVPR 2024 Highlight]
  15. UMT (Python, 186 stars): UMT is a unified and flexible framework which can handle different input modality combinations and output video moment retrieval and/or highlight detection results.
  16. MM-RealSR (Python, 152 stars): Codes for "Metric Learning based Interactive Modulation for Real-World Super-Resolution"
  17. ViT-Lens (Python, 148 stars): [CVPR 2024] ViT-Lens: Towards Omni-modal Representations
  18. MCQ (Python, 136 stars): Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral)
  19. DeSRA (Python, 123 stars): Official codes for DeSRA (ICML 2023)
  20. FAIG (Python, 118 stars): NeurIPS 2021, Spotlight, Finding Discriminative Filters for Specific Degradations in Blind Super-Resolution
  21. ArcNerf (Jupyter Notebook, 106 stars): NeRF and extensions in all
  22. ST-LLM (Python, 96 stars): [ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
  23. SurfelNeRF (76 stars): SurfelNeRF: Neural Surfel Radiance Fields for Online Photorealistic Reconstruction of Indoor Scenes
  24. RepSR (74 stars): Codes for "RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization"
  25. mllm-npu (Python, 67 stars): mllm-npu: training multimodal large language models on Ascend NPUs
  26. HOSNeRF (Python, 65 stars): HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video
  27. FastRealVSR (59 stars): Codes for "Mitigating Artifacts in Real-World Video Super-Resolution Models"
  28. ConMIM (Python, 57 stars): Official codes for ConMIM (ICLR 2023)
  29. GVT (Python, 54 stars): Official code for "What Makes for Good Visual Tokenizers for Large Language Models?"
  30. TVTS (Jupyter Notebook, 44 stars): Turning to Video for Transcript Sorting
  31. BEBR (Python, 42 stars): Official code for "Binary embedding based retrieval at Tencent"
  32. ViSFT (Python, 33 stars)
  33. pi-Tuning (Python, 32 stars): Official code for "pi-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation", ICML 2023
  34. FLM (Python, 31 stars): Accelerating Vision-Language Pretraining with Free Language Modeling (CVPR 2023)
  35. Efficient-VSR-Training (30 stars): Codes for "Accelerating the Training of Video Super-Resolution"
  36. DTN (Python, 27 stars): Official code for "Dynamic Token Normalization Improves Vision Transformer", ICLR 2022
  37. OpenCompatible (Python, 24 stars): OpenCompatible provides a standard compatible training benchmark, covering practical training scenarios.
  38. BTS (23 stars): BTS: A Bi-lingual Benchmark for Text Segmentation in the Wild
  39. SGAT4PASS (Python, 23 stars): Official implementation of the paper SGAT4PASS: Spherical Geometry-Aware Transformer for PAnoramic Semantic Segmentation (IJCAI 2023)
  40. SFDA (Python, 20 stars)
  41. TaCA (15 stars): Official code for the paper "TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter"
  42. Plot2Code (Python, 14 stars)
  43. common_trainer (Python, 12 stars): Common template for PyTorch projects. Easy to extend and modify for new projects.
  44. TransFusion (9 stars): The code repo for the ACM MM paper: TransFusion: Multi-Modal Fusion for Video Tag Inference via Translation-based Knowledge Embedding
  45. BasicVQ-GEN (7 stars)
  46. ArcVis (Jupyter Notebook, 6 stars): Visualization of 3D and 2D components interactively.
  47. VTLayout (3 stars)