  • Stars: 126
  • Rank: 283,524 (Top 6 %)
  • Language: Python
  • License: Other
  • Created almost 3 years ago
  • Updated almost 2 years ago

Repository Details

CVPR 2022 VolumeGAN - 3D-aware Image Synthesis via Learning Structural and Textural Representations

Figure: Framework of VolumeGAN.

3D-aware Image Synthesis via Learning Structural and Textural Representations
Yinghao Xu, Sida Peng, Ceyuan Yang, Yujun Shen, Bolei Zhou
Computer Vision and Pattern Recognition (CVPR), 2022

[Paper] [Project Page] [Demo]

This paper aims at high-fidelity 3D-aware image synthesis. We propose a novel framework, termed VolumeGAN, for synthesizing images under different camera views by explicitly learning a structural representation and a textural representation. We first learn a feature volume to represent the underlying structure, which is then converted to a feature field using a NeRF-like model. The feature field is further accumulated into a 2D feature map as the textural representation, followed by a neural renderer for appearance synthesis. Such a design enables independent control of the shape and the appearance. Extensive experiments on a wide range of datasets show that our approach achieves higher image quality and better 3D control than previous methods.
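The accumulation step described above can be illustrated with standard NeRF-style alpha compositing along a single camera ray. This is a minimal sketch under that assumption, not the repository's implementation: the function name `accumulate_along_ray`, the toy feature vectors, and the densities are all hypothetical.

```python
import math

def accumulate_along_ray(features, sigmas, deltas):
    """Alpha-composite per-sample feature vectors along one ray.

    features: list of per-sample feature vectors (lists of floats).
    sigmas:   per-sample non-negative densities.
    deltas:   distances between adjacent samples.
    Returns (accumulated feature vector, per-sample weights).
    """
    transmittance = 1.0  # fraction of the ray not yet absorbed
    out = [0.0] * len(features[0])
    weights = []
    for feat, sigma, delta in zip(features, sigmas, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this sample
        w = transmittance * alpha               # contribution of this sample
        weights.append(w)
        out = [o + w * f for o, f in zip(out, feat)]
        transmittance *= 1.0 - alpha            # light remaining after this sample
    return out, weights

# Toy ray: empty space first, then a dense sample that dominates the result.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
sigmas = [0.0, 50.0, 1.0]
deltas = [0.1, 0.1, 0.1]
pixel_feature, weights = accumulate_along_ray(features, sigmas, deltas)
```

Repeating this for one ray per pixel yields a 2D feature map, which a renderer network would then turn into an image. Accumulating features rather than RGB values is what lets the final appearance be decided by the renderer instead of the field itself.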

Usage

Setup

This repository is based on Hammer, where you can find detailed instructions on environment setup.

Test Demo

python render.py \
    --work_dir ${WORK_DIR} \
    --checkpoint ${MODEL_PATH} \
    --num ${NUM} \
    --seed ${SEED} \
    --render_mode ${RENDER_MODE} \
    --generate_html ${SAVE_HTML} \
    volumegan-ffhq

where

  • WORK_DIR refers to the path to save the results.
  • MODEL_PATH refers to the path of the pretrained model, regarding which we provide
  • NUM refers to the number of samples to synthesize.
  • SEED refers to the random seed used for sampling.
  • RENDER_MODE refers to the type of rendered result, either video or shape.
  • SAVE_HTML controls whether to save images as an HTML page for better visualization when rendering videos.
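When the demo is launched from another script, it can help to assemble the arguments as a list instead of a hand-written shell string. The following is a sketch only: `build_render_command` is a hypothetical helper, the paths are placeholders, and the value `"true"` for `--generate_html` is an assumption about the accepted format.

```python
import shlex

def build_render_command(work_dir, checkpoint, num, seed, render_mode, generate_html):
    """Assemble the argument list for the render.py demo shown above."""
    return [
        "python", "render.py",
        "--work_dir", str(work_dir),
        "--checkpoint", str(checkpoint),
        "--num", str(num),
        "--seed", str(seed),
        "--render_mode", str(render_mode),
        "--generate_html", str(generate_html),
        "volumegan-ffhq",
    ]

cmd = build_render_command(
    work_dir="work_dirs/demo",           # hypothetical output directory
    checkpoint="checkpoints/model.pth",  # hypothetical checkpoint path
    num=4,
    seed=0,
    render_mode="video",
    generate_html="true",                # assumed value format
)
print(shlex.join(cmd))  # shell-safe string, useful for logging
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting issues when paths contain spaces.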

Training

For example, users can use the following command to train VolumeGAN on FFHQ at a resolution of 256x256:

./scripts/training_demos/volumegan_ffhq256.sh \
    ${NUM_GPUS} \
    ${DATA_PATH} \
    [OPTIONS]

where

  • NUM_GPUS refers to the number of GPUs used for training.
  • DATA_PATH refers to the path to the dataset (zip format is strongly recommended).
  • [OPTIONS] refers to any additional option to pass. Detailed instructions on available options can be found via python train.py volumegan-ffhq --help.

NOTE: This demo script uses volumegan_ffhq256 as the default job_name, which identifies the experiment. Concretely, a directory named job_name is created under the root working directory, which is work_dirs/ by default. To prevent overwriting previous experiments, an exception is raised to interrupt training if the job_name directory already exists. Use the --job_name=${JOB_NAME} option to specify a new job name.
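The overwrite guard described in the note can be pictured roughly as follows. This is an illustrative sketch, not the code used by the training script: `create_job_dir` is a hypothetical helper, and the exception type actually raised may differ.

```python
import os
import tempfile

def create_job_dir(job_name, root="work_dirs"):
    """Create root/job_name, refusing to reuse an existing directory."""
    job_dir = os.path.join(root, job_name)
    if os.path.exists(job_dir):
        raise FileExistsError(f"{job_dir} already exists; choose a new job name")
    os.makedirs(job_dir)
    return job_dir

# Demonstrate inside a temporary root so no real experiment is touched.
with tempfile.TemporaryDirectory() as root:
    first = create_job_dir("volumegan_ffhq256", root=root)
    try:
        create_job_dir("volumegan_ffhq256", root=root)  # same name again
        collided = False
    except FileExistsError:
        collided = True
    # A fresh job name succeeds, mirroring --job_name=${JOB_NAME}.
    second = create_job_dir("volumegan_ffhq256_run2", root=root)
    second_created = os.path.isdir(second)
```

Failing early is the safer choice here, because silently reusing a directory could mix checkpoints and logs from two different runs.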

Evaluation

Users can use the following command to evaluate a trained model:

./scripts/test_metrics.sh \
    ${NUM_GPUS} \
    ${DATA_PATH} \
    ${MODEL_PATH} \
    fid \
    --G_kwargs '{"ps_kwargs":{"perturb_mode":"none"}}' \
    [OPTIONS]
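The --G_kwargs value looks like a JSON object with a nested object, which is easy to misquote in a shell. One way to avoid quoting mistakes is to generate the argument programmatically. The sketch below assumes the option is parsed as JSON, which is not confirmed here.

```python
import json
import shlex

# Nested generator options, as in the evaluation command above.
g_kwargs = {"ps_kwargs": {"perturb_mode": "none"}}

# Serialize to compact JSON, then quote it as a single shell word.
g_kwargs_json = json.dumps(g_kwargs, separators=(",", ":"))
flag = "--G_kwargs " + shlex.quote(g_kwargs_json)
print(flag)
```

Because the whole JSON string sits inside one pair of single quotes, the shell passes the inner double quotes through unchanged.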

BibTeX

@inproceedings{xu2021volumegan,
  title     = {3D-aware Image Synthesis via Learning Structural and Textural Representations},
  author    = {Xu, Yinghao and Peng, Sida and Yang, Ceyuan and Shen, Yujun and Zhou, Bolei},
  booktitle = {CVPR},
  year      = {2022}
}
