Anycost GAN

video | paper | website

Anycost GANs for Interactive Image Synthesis and Editing

Ji Lin, Richard Zhang, Frieder Ganz, Song Han, Jun-Yan Zhu

MIT, Adobe Research, CMU

In CVPR 2021


Anycost GAN generates consistent outputs under various computational budgets.

Demo

Here, we use the Anycost generator for interactive image editing. The full generator takes ~3s to render an image, which is too slow for interactive editing, while the Anycost generator can provide a visually similar preview about 5x faster. After making the adjustments, we hit the "Finalize" button to synthesize the high-quality final output. Check here for the full demo.
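
The preview/finalize workflow maps directly onto the pre-trained model API described later in this README. A minimal sketch (the latent shape and the generator's forward signature are assumptions; see notebooks/intro.ipynb for the exact interface):

import torch
import models
from models.dynamic_channel import set_uniform_channel_ratio, reset_generator

g = models.get_pretrained('generator', config='anycost-ffhq-config-f')
z = torch.randn(1, 1, 512)  # latent code; shape is an assumption

# Preview: a half-channel, 512px sub-generator renders the edit much faster.
set_uniform_channel_ratio(g, 0.5)
g.target_res = 512
preview, _ = g(z)

# "Finalize": restore the full generator for the high-quality output.
reset_generator(g)
final, _ = g(z)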

Overview

Anycost generators can be run at diverse computation costs by using different channel and resolution configurations. The sub-generators achieve high output consistency with the full generator, providing a fast preview.


With (1) sampling-based multi-resolution training, (2) adaptive-channel training, and (3) a generator-conditioned discriminator, we achieve high image quality and consistency at different resolutions and channel widths.
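
As an illustration of how (1) and (2) interact during training, each iteration can draw a random sub-generator configuration so that all configurations learn jointly. This is a hypothetical sketch, not the repo's actual training loop, and the ratio/resolution lists are assumptions:

import random

CHANNEL_RATIOS = [0.25, 0.5, 0.75, 1.0]  # assumed adaptive-channel configs
RESOLUTIONS = [128, 256, 512, 1024]      # assumed multi-resolution configs

def sample_subgenerator_config():
    # Train a randomly sampled sub-generator each step; the discriminator
    # is additionally conditioned on this configuration (technique (3)).
    return random.choice(CHANNEL_RATIOS), random.choice(RESOLUTIONS)

ratio, res = sample_subgenerator_config()
print(f'this step trains the {ratio}-channel generator at {res}x{res}')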


Results

Anycost GAN (uniform channel version) supports 4 resolutions and 4 channel ratios, producing visually consistent images at different levels of image fidelity.


The consistency is preserved during image projection and editing.

Usage

Getting Started

  • Clone this repo:
git clone https://github.com/mit-han-lab/anycost-gan.git
cd anycost-gan
  • Install PyTorch 1.7 and other dependencies.

We recommend setting up the environment using Anaconda: conda env create -f environment.yml

Introduction Notebook

We provide a Jupyter notebook example to show how to use the Anycost generator for image synthesis at diverse costs: notebooks/intro.ipynb.

We also provide a Colab version of the notebook. Be sure to select GPU as the accelerator in the runtime options.

Interactive Demo

We provide an interactive demo showing how Anycost GAN enables interactive image editing. To run the demo:

python demo.py

If your computer contains a CUDA GPU, try running with:

FORCE_NATIVE=1 python demo.py

You can find a video recording of the demo here.

Using Pre-trained Models

To get the pre-trained generator, encoder, and editing directions, run:

import models

pretrained_type = 'generator'  # choose from ['generator', 'encoder', 'boundary']
config_name = 'anycost-ffhq-config-f'  # replace with the config name of other models
model = models.get_pretrained(pretrained_type, config=config_name)

We also provide the face attribute classifier (which is shared across different generators) for computing the editing directions. You can get it by running:

predictor = models.get_pretrained('attribute-predictor')

The attribute classifier takes face images in FFHQ format as input.
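
A sketch of how the classifier might be invoked (the input size, value range, and 40x2 output layout are assumptions; see the attribute consistency evaluation code for the exact interface):

import torch
import models

predictor = models.get_pretrained('attribute-predictor')
predictor.eval()

face = torch.randn(1, 3, 1024, 1024)  # placeholder FFHQ-format face tensor
with torch.no_grad():
    logits = predictor(face)
# Assuming 40 binary CelebA attributes with a pair of logits each:
probs = logits.view(-1, 40, 2).softmax(dim=-1)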

After loading the Anycost generator, we can run it at a wide range of computational costs. For example:

from models.dynamic_channel import set_uniform_channel_ratio, reset_generator

g = models.get_pretrained('generator', config='anycost-ffhq-config-f')  # anycost uniform
set_uniform_channel_ratio(g, 0.5)  # set channel
g.target_res = 512  # set resolution
out, _ = g(...)  # generate image
reset_generator(g)  # restore the generator

For detailed usage and the flexible-channel Anycost generator, please refer to notebooks/intro.ipynb.

Model Zoo

Currently, we provide the following pre-trained generators, encoders, and editing directions. We will add more in the future.

For Anycost generators, by default, we refer to the uniform setting.

config name                        generator   encoder   edit direction
anycost-ffhq-config-f                  ✓          ✓            ✓
anycost-ffhq-config-f-flexible         ✓          ✓            ✓
anycost-car-config-f                   ✓
stylegan2-ffhq-config-f                ✓          ✓            ✓

stylegan2-ffhq-config-f refers to the official StyleGAN2 generator converted from the original repo.

Datasets

We prepare the FFHQ, CelebA-HQ, and LSUN Car datasets as directories of images so that they can be easily loaded with torchvision's ImageFolder (see the loading example below). The dataset layout looks like:

├── PATH_TO_DATASET
│   ├── images
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   ├── ...

Due to copyright issues, you need to download the datasets from the official sites and process them accordingly.
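
For example, the directory can then be loaded with torchvision (the transform values are an illustrative choice; ImageFolder treats the images subdirectory as a single class):

from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize([0.5] * 3, [0.5] * 3),  # map pixels to [-1, 1]
])
dataset = datasets.ImageFolder('PATH_TO_DATASET', transform=transform)
loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=4)
images, _ = next(iter(loader))  # the class labels are meaningless here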

Evaluation

We provide the code to evaluate some metrics presented in the paper. Some of the code is written with horovod to support distributed evaluation and reduce the cost of inter-GPU communication, which greatly improves speed. Check its website for proper installation.

Fréchet Inception Distance (FID)

Before evaluating the FIDs, you need to compute the inception features of the real images using a script like:

python tools/calc_inception.py \
    --resolution 1024 --batch_size 64 -j 16 --n_sample 50000 \
    --save_name assets/inceptions/inception_ffhq_res1024_50k.pkl \
    PATH_TO_FFHQ

or you can download the pre-computed inception statistics from here and put them under assets/inceptions.

Then, you can evaluate the FIDs by running:

horovodrun -np N_GPU \
    python metrics/fid.py \
    --config anycost-ffhq-config-f \
    --batch_size 16 --n_sample 50000 \
    --inception assets/inceptions/inception_ffhq_res1024_50k.pkl
    # --channel_ratio 0.5 --target_res 512  # optionally using a smaller resolution/channel
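
For reference, FID is the Fréchet distance between two Gaussians fitted to the inception features; a minimal, self-contained sketch of that formula (independent of the repo's evaluation code):

import numpy as np
from scipy import linalg

def frechet_distance(mu1, cov1, mu2, cov2):
    # Fréchet distance between N(mu1, cov1) and N(mu2, cov2).
    diff = mu1 - mu2
    covmean, _ = linalg.sqrtm(cov1 @ cov2, disp=False)  # matrix square root
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # drop tiny numerical imaginary parts
    return float(diff @ diff + np.trace(cov1 + cov2 - 2.0 * covmean))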

Perceptual Path Length (PPL)

Similarly, evaluate the PPL by running:

horovodrun -np N_GPU \
    python metrics/ppl.py \
    --config anycost-ffhq-config-f
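
Conceptually, PPL measures how smoothly small latent perturbations map to image changes: generate two images from nearby latents and divide their perceptual (LPIPS) distance by the squared step size. A minimal sketch using the lpips package (the repo's implementation interpolates latents and crops faces, which is omitted here; the generator interface is assumed as above):

import torch
import lpips

percept = lpips.LPIPS(net='vgg')  # perceptual distance network

def ppl_sample(g, eps=1e-4):
    z0 = torch.randn(1, 1, 512)           # latent shape is an assumption
    z1 = z0 + eps * torch.randn_like(z0)  # small latent perturbation
    img0, _ = g(z0)
    img1, _ = g(z1)
    return (percept(img0, img1) / eps ** 2).item()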

Attribute Consistency

Evaluate the attribute consistency by running:

horovodrun -np N_GPU \
    python metrics/attribute_consistency.py \
    --config anycost-ffhq-config-f \
    --channel_ratio 0.5 --target_res 512  # config for the sub-generator; necessary
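
Conceptually, the metric feeds the same latent through the full generator and a sub-generator and measures how often the predicted face attributes agree. A hedged sketch (the generator signature and the 40x2 attribute layout are assumptions, and resizing the outputs to the predictor's input size is omitted):

from models.dynamic_channel import set_uniform_channel_ratio, reset_generator

def attribute_agreement(g, predictor, z):
    reset_generator(g)
    full, _ = g(z)                     # full-generator output
    set_uniform_channel_ratio(g, 0.5)  # same latent through the sub-generator
    g.target_res = 512
    sub, _ = g(z)
    reset_generator(g)
    a = predictor(full).view(-1, 40, 2).argmax(-1)
    b = predictor(sub).view(-1, 40, 2).argmax(-1)
    return (a == b).float().mean().item()  # fraction of matching attributes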

Encoder Evaluation

To evaluate the performance of the encoder, run:

python metrics/eval_encoder.py \
    --config anycost-ffhq-config-f \
    --data_path PATH_TO_CELEBA_HQ
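
Conceptually, the encoder is judged by reconstruction quality: encode an image to a latent code, regenerate the image, and measure the distance to the original. A rough sketch under assumed interfaces:

import torch
import lpips

percept = lpips.LPIPS(net='vgg')

def reconstruction_error(encoder, g, img):
    with torch.no_grad():
        w = encoder(img)  # encoder interface assumed
        recon, _ = g(w)   # may require a flag to treat w as a style code
    lpips_dist = percept(recon, img).item()
    mse = torch.mean((recon - img) ** 2).item()
    return lpips_dist, mse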

Training

We provide scripts to train Anycost GAN on the FFHQ dataset.

  • Training the original StyleGAN2 on FFHQ
horovodrun -np 8 bash scripts/train_stylegan2_ffhq.sh

Training the original StyleGAN2 is time-consuming. We recommend downloading the converted checkpoint from here and placing it under checkpoint/.

  • Training Anycost GAN: multi-resolution
horovodrun -np 8 bash scripts/train_stylegan2_multires_ffhq.sh

Note that after each epoch, we evaluate the FIDs at two resolutions (1024 & 512) to better monitor the training progress. We also apply distillation to accelerate convergence, which is not used for the ablations in the paper.

  • Training Anycost GAN: adaptive-channel
horovodrun -np 8 bash scripts/train_stylegan2_multires_adach_ffhq.sh

Here we use a longer training schedule for a more stable reproduction, which might not be necessary (depending on the randomness).

Note: We trained our models on Titan RTX GPUs with 24GB memory. For GPUs with smaller memory, you may need to reduce the resolution/model size/batch size/etc. and adjust other hyper-parameters accordingly.

Citation

If you use this code for your research, please cite our paper.

@inproceedings{lin2021anycost,
  author    = {Lin, Ji and Zhang, Richard and Ganz, Frieder and Han, Song and Zhu, Jun-Yan},
  title     = {Anycost GANs for Interactive Image Synthesis and Editing},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
}

Related Projects

GAN Compression | Once for All | iGAN | StyleGAN2

Acknowledgement

We thank Taesung Park, Zhixin Shu, Muyang Li, and Han Cai for the helpful discussion. Part of the work is supported by NSF CAREER Award #1943349, Adobe, SONY, Naver Corporation, and MIT-IBM Watson AI Lab.

The codebase is built upon a PyTorch implementation of StyleGAN2: rosinality/stylegan2-pytorch. For editing direction extraction, we refer to InterFaceGAN.
