• Stars
    star
    876
  • Rank 52,107 (Top 2 %)
  • Language
    Python
  • License
    MIT License
  • Created about 3 years ago
  • Updated over 2 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

[NeurIPS'21] Projected GANs Converge Faster

[Project] [PDF] [Supplementary] [Talk] [CGP Summary] [Replicate Demo] [Hugging Face Spaces Demo]

For a quick start, try the Colab:   Projected GAN Quickstart

This repository contains the code for our NeurIPS 2021 paper "Projected GANs Converge Faster"

by Axel Sauer, Kashyap Chitta, Jens Müller, and Andreas Geiger.

If you find our code or paper useful, please cite

@InProceedings{Sauer2021NEURIPS,
  author         = {Axel Sauer and Kashyap Chitta and Jens M{\"{u}}ller and Andreas Geiger},
  title          = {Projected GANs Converge Faster},
  booktitle      = {Advances in Neural Information Processing Systems (NeurIPS)},
  year           = {2021},
}

Related Projects

ToDos

Requirements

  • 64-bit Python 3.8 and PyTorch 1.9.0 (or later). See https://pytorch.org for PyTorch install instructions.
  • Use the following commands with Miniconda3 to create and activate your PG Python environment:
    • conda env create -f environment.yml
    • conda activate pg
  • The StyleGAN2 generator relies on custom CUDA kernels, which are compiled on the fly. Hence you need:
    • CUDA toolkit 11.1 or later.
    • GCC 7 or later compilers. Recommended GCC version depends on CUDA version, see for example CUDA 11.4 system requirements.
    • If you run into problems when setting up for the custom CUDA kernels, we refer to the Troubleshooting docs of the original StyleGAN repo. When using the FastGAN generator you will not need the custom kernels.

Data Preparation

For a quick start, you can download the few-shot datasets provided by the authors of FastGAN. You can download them here. To prepare the dataset at the respective resolution, run for example

python dataset_tool.py --source=./data/pokemon --dest=./data/pokemon256.zip \
  --resolution=256x256 --transform=center-crop

You can get the datasets we used in our paper at their respective websites:

CLEVR, FFHQ, Cityscapes, LSUN, AFHQ, Landscape.

Training

Training your own PG on LSUN church using 8 GPUs:

python train.py --outdir=./training-runs/ --cfg=fastgan --data=./data/pokemon256.zip \
  --gpus=8 --batch=64 --mirror=1 --snap=50 --batch-gpu=8 --kimg=10000

--batch specifies the overall batch size, --batch-gpu specifies the batch size per GPU. If you use fewer GPUs, the training loop will automatically accumulate gradients, until the overall batch size is reached.

If you want to use the StyleGAN2 generator, pass --cfg=stylegan2. We also added a lightweight version of FastGAN (--cfg=fastgan_lite). This backbone trains fast regarding wallclock time and yields better results on small datasets like Pokemon. Samples and metrics are saved in outdir. To monitor the training progress, you can inspect fid50k_full.json or run tensorboard in training-runs.

Generating Samples & Interpolations

To generate samples and interpolation videos, run

python gen_images.py --outdir=out --trunc=1.0 --seeds=10-15 \
  --network=PATH_TO_NETWORK_PKL

and

python gen_video.py --output=lerp.mp4 --trunc=1.0 --seeds=0-31 --grid=4x2 \
  --network=PATH_TO_NETWORK_PKL

We provide the following pretrained models (pass the url as PATH_TO_NETWORK_PKL):

https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/art_painting.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/church.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/bedroom.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/cityscapes.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/clevr.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/ffhq.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/flowers.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/landscape.pkl
https://s3.eu-central-1.amazonaws.com/avg-projects/projected_gan/models/pokemon.pkl

Quality Metrics

Per default, train.py tracks FID50k during training. To calculate metrics for a specific network snapshot, run

python calc_metrics.py --metrics=fid50k_full --network=PATH_TO_NETWORK_PKL

To see the available metrics, run

python calc_metrics.py --help

Using PG in your own project

Our implementation is modular, so it is straightforward to use PG in your own codebase. Simply copy the pg_modules folder to your project. Then, to get the projected multi-scale discriminator, run

from pg_modules.discriminator import ProjectedDiscriminator
D = ProjectedDiscriminator()

The only thing you still need to do is to make sure that the feature network is not trained, i.e., explicitly set

D.feature_network.requires_grad_(False)

in your training loop.

Acknowledgments

Our codebase build and extends the awesome StyleGAN2-ADA repo and StyleGAN3 repo, both by Karras et al.

Furthermore, we use parts of the code of FastGAN and MiDas.

More Repositories

1

sdfstudio

A Unified Framework for Surface Reconstruction
Python
1,965
star
2

occupancy_networks

This repository contains the code for the paper "Occupancy Networks - Learning 3D Reconstruction in Function Space"
Python
1,492
star
3

giraffe

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"
Python
1,227
star
4

stylegan-t

[ICML'23] StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis
Python
1,122
star
5

mip-splatting

[CVPR'24 Best Student Paper] Mip-Splatting: Alias-free 3D Gaussian Splatting
Python
1,046
star
6

transfuser

[PAMI'23] TransFuser: Imitation with Transformer-Based Sensor Fusion for Autonomous Driving; [CVPR'21] Multi-Modal Fusion Transformer for End-to-End Autonomous Driving
Python
1,023
star
7

stylegan-xl

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets
Python
939
star
8

unimatch

[TPAMI'23] Unifying Flow, Stereo and Depth Estimation
Python
855
star
9

convolutional_occupancy_networks

[ECCV'20] Convolutional Occupancy Networks
Python
792
star
10

differentiable_volumetric_rendering

This repository contains the code for the CVPR 2020 paper "Differentiable Volumetric Rendering: Learning Implicit 3D Representations without 3D Supervision"
Python
782
star
11

gaussian-opacity-fields

[SIGGRAPH Asia'24 & TOG] Gaussian Opacity Fields: Efficient Adaptive Surface Reconstruction in Unbounded Scenes
Python
705
star
12

monosdf

[NeurIPS'22] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction
Python
563
star
13

shape_as_points

[NeurIPS'21] Shape As Points: A Differentiable Poisson Solver
Python
518
star
14

tuplan_garage

[CoRL'23] Parting with Misconceptions about Learning-based Vehicle Motion Planning
Python
499
star
15

unisurf

[ICCV'21] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
Python
418
star
16

graf

Official code release for "GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis"
Jupyter Notebook
398
star
17

kitti360Scripts

This repository contains utility scripts for the KITTI-360 dataset.
Python
385
star
18

neat

[ICCV'21] NEAT: Neural Attention Fields for End-to-End Autonomous Driving
Python
301
star
19

navsim

[NeurIPS 2024] NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking
Python
244
star
20

occupancy_flow

This repository contains the code for the ICCV 2019 paper "Occupancy Flow - 4D Reconstruction by Learning Particle Dynamics"
Python
207
star
21

plant

[CoRL'22] PlanT: Explainable Planning Transformers via Object-Level Representations
Python
201
star
22

factor-fields

[SIGGRAPH 2023] We provide a unified formula for neural fields (Factor Fields) and a novel dictionary factorization (Dictionary Fields)
Jupyter Notebook
183
star
23

sledge

[ECCV'24] SLEDGE: Synthesizing Driving Environments with Generative Models and Rule-Based Traffic
Python
151
star
24

voxgraf

Official code release for VoxGRAF: Fast 3D-Aware Image Synthesis with Sparse Voxel Grids
Python
128
star
25

carla_garage

[ICCV'23] Hidden Biases of End-to-End Driving Models
Python
121
star
26

gta

[ICLR'24] GTA: A Geometry-Aware Attention Mechanism for Multi-view Transformers
Python
121
star
27

texture_fields

This repository contains code for the paper 'Texture Fields: Learning Texture Representations in Function Space'.
Python
115
star
28

kitti360LabelTool

JavaScript
103
star
29

counterfactual_generative_networks

[ICLR'21] Counterfactual Generative Networks
Python
102
star
30

murf

[CVPR'24] MuRF: Multi-Baseline Radiance Fields
Python
84
star
31

king

[ECCV'22] KING: Generating Safety-Critical Driving Scenarios for Robust Imitation via Kinematics Gradients
Python
73
star
32

controllable_image_synthesis

Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis, CVPR 2020
Python
70
star
33

handheld_svbrdf_geometry

On Joint Estimation of Pose, Geometry and svBRDF from a Handheld Scanner, CVPR2020
Python
59
star
34

connecting_the_dots

This repository contains the code for the paper "Connecting the Dots: Learning Representations for Active Monocular Depth Estimation" https://avg.is.tuebingen.mpg.de/publications/riegler2019cvpr
Python
56
star
35

frequency_bias

Official code for "On the Frequency Bias of Generative Models", NeurIPS 2021
Python
45
star
36

good

[ICLR'23] GOOD: Exploring Geometric Cues for Detecting Objects in an Open World
Python
39
star
37

data_aggregation

This repository contains the code for the CVPR 2020 paper "Exploring Data Aggregation in Policy Learning for Vision-based Urban Autonomous Driving"
Python
38
star
38

campari

[3DV'21] CAMPARI: Camera-Aware Decomposed Generative Neural Radiance Fields
Python
29
star
39

akorn

Reproducing code for the work: Artificial Kuramoto Oscillatory Neurons
22
star
40

autonomousvision.github.io

Blog of the Autonomous Vision Group at MPI-IS Tübingen and University of Tübingen.
HTML
19
star
41

hdt

[COLM'24] HDT: Hierarchical Document Transformer
Python
7
star
42

visual_abstractions

6
star
43

slides

Slide repository of the Autonomous Vision Group at MPI-IS Tübingen and University of Tübingen.
CSS
2
star
44

similarity_reconstruction

This code is based on the paper Exploiting Object Similarity in 3D Reconstruction.
C++
1
star
45

slow_flow

This code is based on the paper Slow Flow: Exploiting High-Speed Cameras for Accurate and Diverse Optical Flow Reference Data.
C++
1
star