
Adversarial Generation of Continuous Images [CVPR 2021]

This repo contains the INR-GAN implementation, built on top of the StyleGAN2-ADA repo. Compared to a traditional convolutional generator, ours is INR-based: it produces the parameters of a fully-connected neural network, which then generates each pixel value independently from its coordinate position (see the illustration below).
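To make the idea concrete, here is a minimal NumPy sketch of INR-based generation, not the repo's actual architecture: a fixed random "hypernetwork" maps the latent z to the weights of a tiny MLP, which then predicts every pixel independently from its (x, y) coordinate. All names and sizes here are illustrative assumptions.

```python
import numpy as np

def inr_generate(z, height=4, width=4, hidden=8, seed=0):
    """Sketch of INR-based generation: a hypernetwork maps the latent z
    to the weights of a small MLP, which predicts each pixel's RGB value
    independently from its (x, y) coordinate."""
    rng = np.random.default_rng(seed)

    # "Hypernetwork": a fixed random projection from z to the MLP parameters.
    n_params = 2 * hidden + hidden + hidden * 3 + 3  # W1, b1, W2, b2
    hyper = rng.standard_normal((z.size, n_params)) / np.sqrt(z.size)
    params = z @ hyper
    W1 = params[: 2 * hidden].reshape(2, hidden)
    b1 = params[2 * hidden : 3 * hidden]
    W2 = params[3 * hidden : 3 * hidden + hidden * 3].reshape(hidden, 3)
    b2 = params[3 * hidden + hidden * 3 :]

    # Pixel coordinates in [-1, 1]; each is evaluated independently of the others.
    ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                         np.linspace(-1, 1, width), indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel()], axis=1)  # (H*W, 2)
    h = np.tanh(coords @ W1 + b1)
    rgb = np.tanh(h @ W2 + b2)                           # (H*W, 3)
    return rgb.reshape(height, width, 3)

img = inr_generate(np.ones(16))
print(img.shape)  # (4, 4, 3)
```

Because each row of `coords` is processed independently, the image can be rendered at any resolution by just evaluating a denser coordinate grid.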

INR-GAN illustration

Performance

We provide the checkpoints of our model with the following FID scores. See Pretrained checkpoints to download them.

| Model | LSUN Churches 256x256 | LSUN Bedroom 256x256 | FFHQ 256x256 | #imgs/sec on V100 32gb | Memory usage |
|---|---|---|---|---|---|
| INR-GAN | 4.45 | 5.71 | 9.57 | 266.45 @ batch_size=50 | 23.54 Gb @ batch_size=50 |
| INR-GAN-bil* | 4.04 | 3.43 | 4.95 | 209.16 @ batch_size=50 | 23.56 Gb @ batch_size=50 |
| StyleGAN2 | 3.86 | 2.65 | 3.83 | 95.79 @ batch_size=32 | 3.65 Gb @ batch_size=32 |
| CIPS | 2.92 | - | 4.38 | 27.27 @ batch_size=16 | 8.11 Gb @ batch_size=16 |

*The INR-GAN-bil model uses bilinear interpolation (and instance norm), which deviates from the pure INR paradigm because pixels are no longer generated independently. However, it still uses only fully-connected layers (i.e. no convolutions) to generate an image.
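To illustrate why bilinear interpolation breaks per-pixel independence, here is a standalone NumPy sketch of a 2x bilinear upsample (the edge-replication convention is our assumption, not the repo's): every output value blends up to four neighbouring inputs, so outputs depend on each other's underlying features.

```python
import numpy as np

def bilinear_upsample_2x(feat):
    """2x bilinear upsampling of an (H, W) feature grid. Each output value
    blends up to four neighbouring inputs, so outputs are no longer
    generated independently per coordinate."""
    h, w = feat.shape
    ys = (np.arange(2 * h) + 0.5) / 2 - 0.5   # source-space coordinates
    xs = (np.arange(2 * w) + 0.5) / 2 - 0.5
    y0 = np.clip(np.floor(ys).astype(int), 0, h - 1)
    x0 = np.clip(np.floor(xs).astype(int), 0, w - 1)
    y1 = np.clip(y0 + 1, 0, h - 1)            # edge replication at borders
    x1 = np.clip(x0 + 1, 0, w - 1)
    wy = np.clip(ys - y0, 0, 1)[:, None]      # vertical blend weights
    wx = np.clip(xs - x0, 0, 1)[None, :]      # horizontal blend weights
    top = feat[y0][:, x0] * (1 - wx) + feat[y0][:, x1] * wx
    bot = feat[y1][:, x0] * (1 - wx) + feat[y1][:, x1] * wx
    return top * (1 - wy) + bot * wy

out = bilinear_upsample_2x(np.array([[0.0, 1.0], [2.0, 3.0]]))
print(out.shape)  # (4, 4)
```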

The inference speed in terms of #imgs/sec was measured on a single NVidia V100 GPU (32 Gb) without mixed precision (see the profiling section below).

Note: our CIPS implementation is not exact. See CIPS for the exact one.

For INR-GAN, memory usage is higher for two reasons:

  • we use coordinate embeddings for high-resolutions
  • we cache coordinate embeddings at test time (when they do not depend on z)
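The second point can be sketched as follows; the Fourier-feature embedding, class name, and cache layout are illustrative assumptions, not the repo's API. Since the embeddings depend only on pixel coordinates (not on z), they can be computed once per resolution and reused for every generated image:

```python
import numpy as np

class CoordEmbeddingCache:
    """Sketch of test-time caching: when coordinate embeddings do not
    depend on the latent z, compute them once per resolution and reuse
    them across all generated images (trading memory for speed)."""
    def __init__(self, n_freqs=4):
        self.n_freqs = n_freqs
        self._cache = {}

    def get(self, height, width):
        key = (height, width)
        if key not in self._cache:
            ys, xs = np.meshgrid(np.linspace(-1, 1, height),
                                 np.linspace(-1, 1, width), indexing="ij")
            coords = np.stack([xs, ys], axis=-1)            # (H, W, 2)
            freqs = 2.0 ** np.arange(self.n_freqs)          # Fourier frequencies
            angles = coords[..., None] * freqs              # (H, W, 2, F)
            emb = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
            self._cache[key] = emb.reshape(height, width, -1)
        return self._cache[key]

cache = CoordEmbeddingCache()
e1 = cache.get(8, 8)
e2 = cache.get(8, 8)       # cache hit: the same array object is returned
print(e1.shape, e1 is e2)  # (8, 8, 16) True
```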

Note that the profiling results can differ depending on the hardware and drivers installed (we used CUDA 10.1.243).

Installation

To install, run the following commands:

conda env create --file environment.yaml --prefix ./env
conda activate ./env

Training

To train the model, navigate to the project directory and run:

python src/infra/launch_local.py hydra.run.dir=. +experiment_name=my_experiment_name +dataset.name=dataset_name num_gpus=4

where dataset_name is the name of the dataset (without the .zip extension) inside the data/ directory (you can easily override the paths in configs/main.yml). Make sure that data/dataset_name.zip exists: it should be a zip archive of a plain directory of images. See the StyleGAN2-ADA repo for additional data format details. This training command creates an experiment inside the experiments/ directory and copies the project files into it, which isolates the code that produced each model.

Pretrained checkpoints

INR-GAN checkpoints:

For Churches, the model works well without additional convolutions on top of the 128x128 and 256x256 blocks, so we do not use them for this dataset (i.e. extra_convs: {} in the inr-gan.yml config), which makes it run at 301.69 imgs/second. We believe it works better on Churches than on other datasets because Churches contains more high-frequency details.

INR-GAN-bil checkpoints:

Data format

We use the same data format as the original StyleGAN2-ADA repo: it is a zip of images. It is assumed that all data is located in a single directory, specified in configs/main.yml.

For completeness, we also provide downloadable links to the datasets:

Download the datasets and put them into data/ directory.
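If your images sit in a plain folder, here is a minimal stdlib sketch for packaging them into the expected "zip of images" layout. The function name and the extension filter are our assumptions; the repo may expect specific image formats or nesting:

```python
import zipfile
from pathlib import Path

def pack_dataset(src_dir, dst_zip):
    """Zip a flat directory of images into dst_zip (files stored at the
    archive root), matching a StyleGAN2-ADA-style "zip of images" layout."""
    src = Path(src_dir)
    with zipfile.ZipFile(dst_zip, "w", zipfile.ZIP_STORED) as zf:
        for img in sorted(src.iterdir()):
            if img.suffix.lower() in {".png", ".jpg", ".jpeg"}:
                zf.write(img, arcname=img.name)

# Example (hypothetical paths):
# pack_dataset("raw_images/", "data/dataset_name.zip")
```

ZIP_STORED (no compression) is a reasonable choice for already-compressed image formats, since it avoids decompression overhead at training time.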

Profiling

To profile the model, run:

CUDA_VISIBLE_DEVICES=0 python src/scripts/profile.py hydra.run.dir=. model=inr-gan.yml

The inference speed in terms of #imgs/sec was measured on a single NVidia V100 GPU (32 Gb). Note that this model was developed before StyleGAN2-ADA, i.e. before mixed precision became standard. With mixed precision enabled, StyleGAN2 produces 256.88 #imgs/sec @ batch_size=128, while INR-GAN (default architecture) gives 465.60 #imgs/sec @ batch_size=100, only a modest speed increase over its full-precision version; we did not try training it with mixed precision (performance might drop). We also compared against CIPS (a parallel work that explores INR-based generation) in terms of speed, again without training it ourselves. For all the models, we used the batch size that was optimal for each.
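Independently of the repo's profiling script, the #imgs/sec measurement can be sketched as below: a few warm-up iterations (to absorb kernel compilation and cache effects), then timed batches. The matrix multiply is a stand-in for a real generator step; all names here are assumptions:

```python
import time
import numpy as np

def measure_throughput(step_fn, batch_size, n_warmup=3, n_iters=10):
    """Sketch of an #imgs/sec measurement: run warm-up steps, then time
    n_iters batches and report images per second."""
    for _ in range(n_warmup):      # warm-up: not included in the timing
        step_fn()
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_iters * batch_size / elapsed

# Stand-in "generator" step: a matrix multiply instead of a real model.
x = np.random.rand(64, 256)
w = np.random.rand(256, 256)
imgs_per_sec = measure_throughput(lambda: x @ w, batch_size=64)
print(imgs_per_sec > 0)  # True
```

On a GPU, the step function would also need to synchronize the device before reading the clock, otherwise asynchronous kernel launches make the timing meaningless.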

License

This repo is built on top of StyleGAN2-ADA, so I assume it is restricted by the NVidia license (though I am not a lawyer).

Bibtex

@article{inr_gan,
    title={Adversarial Generation of Continuous Images},
    author={Ivan Skorokhodov and Savva Ignatyev and Mohamed Elhoseiny},
    journal={arXiv preprint arXiv:2011.12026},
    year={2020}
}

@article{cips,
    title={Image Generators with Conditionally-Independent Pixel Synthesis},
    author={Anokhin, Ivan and Demochkin, Kirill and Khakhulin, Taras and Sterkin, Gleb and Lempitsky, Victor and Korzhenkov, Denis},
    journal={arXiv preprint arXiv:2011.13775},
    year={2020}
}
