Navigating the GAN Parameter Space for Semantic Image Editing

The authors' official implementation of the CVPR 2021 paper Navigating the GAN Parameter Space for Semantic Image Editing by Anton Cherepkov, Andrey Voynov, and Artem Babenko.


Main steps of our approach:

  • First: we form a low-dimensional subspace in the parameter space of a pretrained GAN;
  • Second: we solve an optimization problem to discover interpretable controls in this subspace.
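
As a rough illustration of what a control in such a subspace means, a parameter-space edit can be pictured as moving a layer's flattened weights along a direction written in a small basis. The sketch below uses plain Python lists; `apply_shift`, `basis`, and `coeffs` are hypothetical names chosen for illustration and are not part of this repository's API.

```python
from typing import List


def apply_shift(weights: List[float],
                basis: List[List[float]],
                coeffs: List[float],
                shift: float) -> List[float]:
    """Return weights + shift * sum_k coeffs[k] * basis[k].

    weights: flattened layer parameters (length n)
    basis:   k basis vectors of length n spanning the low-dimensional subspace
    coeffs:  coordinates of one direction inside that subspace (length k)
    shift:   signed magnitude of the edit
    """
    if len(basis) != len(coeffs):
        raise ValueError("need one coefficient per basis vector")
    direction = [0.0] * len(weights)
    for c, vec in zip(coeffs, basis):
        if len(vec) != len(weights):
            raise ValueError("basis vector length must match weights")
        for i, v in enumerate(vec):
            direction[i] += c * v
    return [w + shift * d for w, d in zip(weights, direction)]


# Toy example (illustrative numbers): 4 parameters, a 2-dimensional subspace.
weights = [1.0, 2.0, 3.0, 4.0]
basis = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 1.0, 0.0, 0.0]]
shifted = apply_shift(weights, basis, coeffs=[0.5, -1.0], shift=2.0)
```

Only the parameters touched by the basis vectors change; the rest of the layer is left as it was.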

Typical Visual Effects

FFHQ

  • original
  • nose length
  • eyes size
  • eyes direction
  • brows up
  • vampire

LSUN-Car

  • Wheels size

LSUN-Church

  • Add conic structures

LSUN-Horse

  • Thickness

Real Images Domain

[Example images]

Pix2PixHD

The proposed method is also applicable to pixel-to-pixel models. Here we present some of the effects discovered for the label-to-streetview model.

  • Add curb
  • Road darkening

High-resolution videos: curb1, curb2, darkening1, darkening2


Training

There are two ways to form the low-dimensional parameter subspace: LPIPS-Hessian-based and SVD-based. The first one is recommended.

LPIPS-Hessian-based

To use the LPIPS-Hessian basis, first compute the Hessian's leading eigenvectors:

  • Calculating the Hessian's eigenvectors
# --out                   script output
# --gan_weights           model weights
# --resolution            model resolution
# --gan_conv_layer_index  target convolutional layer index, starting from 0
# --num_samples           number of z-samples used for the Hessian computation
# --num_eigenvectors      number of leading eigenvectors to compute
python hessian_power_iteration.py \
    --out result \
    --gan_weights stylegan2-car-config-f.pt \
    --resolution 512 \
    --gan_conv_layer_index 3 \
    --num_samples 512 \
    --num_eigenvectors 64
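
The script name points to power iteration. For readers unfamiliar with it, here is a minimal pure-Python sketch on a small explicit symmetric matrix, with previously found eigenvectors projected out so that several leading eigenvectors can be recovered. It is not the repository's code: the actual script works with the LPIPS Hessian of a GAN layer, which this toy does not model, and `top_eigenvectors`, `matvec`, and the matrix `A` are assumptions made for illustration.

```python
import math
import random
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Vector) -> Vector:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def top_eigenvectors(matvec: Callable[[Vector], Vector],
                     dim: int, k: int,
                     iters: int = 500, seed: int = 0) -> List[Vector]:
    """Find k leading eigenvectors of a symmetric operator given only
    matrix-vector products, largest |eigenvalue| first."""
    rng = random.Random(seed)
    found: List[Vector] = []
    for _ in range(k):
        v = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
        for _ in range(iters):
            w = matvec(v)
            # Project out eigenvectors that were already found, so the
            # iteration converges to the next one instead.
            for u in found:
                c = dot(w, u)
                w = [wi - c * ui for wi, ui in zip(w, u)]
            v = normalize(w)
        found.append(v)
    return found


# Toy symmetric matrix standing in for a Hessian (illustrative only).
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.0],
     [0.0, 0.0, 1.0]]


def matvec(v: Vector) -> Vector:
    return [dot(row, v) for row in A]


vecs = top_eigenvectors(matvec, dim=3, k=2)
eigenvalues = [dot(v, matvec(v)) for v in vecs]  # Rayleigh quotients
```

Because the method needs only matrix-vector products, it can be used when the matrix itself is too large to store.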

Second, run the interpretable directions search:

  • Interpretable directions in the subspace of the Hessian's eigenvectors
# --out                          script output
# --gan_type                     currently only StyleGAN2 is available
# --shift_predictor_size         resize to 256 before shift prediction (memory-saving option)
# --basis_vectors_path           result of the first step
# --deformator_conv_layer_index  must match the value used in the first step
python run_train.py \
    --out results \
    --gan_type StyleGAN2 \
    --gan_weights stylegan2-car-config-f.pt \
    --resolution 512 \
    --shift_predictor_size 256 \
    --deformator_target weight_fixedbasis \
    --basis_vectors_path eigenvectors_layer3_stylegan2-car-config-f.pt \
    --deformator_conv_layer_index 3 \
    --directions_count 64 \
    --shift_scale 60 \
    --min_shift 15

SVD-based

The second option is to run the search over the SVD-based basis:

# --deformator_conv_layer_index  target convolutional layer index, starting from 0
python run_train.py \
    --out results \
    --gan_type StyleGAN2 \
    --gan_weights stylegan2-car-config-f.pt \
    --resolution 512 \
    --shift_predictor_size 256 \
    --deformator_target weight_svd \
    --deformator_conv_layer_index 3 \
    --directions_count 64 \
    --shift_scale 3500 \
    --shift_weight 0.0025 \
    --min_shift 300

Although the same shift_scale works well across different layers, tuning it manually per layer can slightly improve performance.
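
If you want to try per-layer tuning, one option is to generate the run_train.py command for each layer from a small table of settings. The sketch below only builds argument lists from the flags shown in the SVD-based command; the helper name `build_train_command`, the per-layer output directory naming, and the values in `per_layer_scale` are illustrative assumptions, not recommended settings.

```python
import shlex
from typing import Dict, List


def build_train_command(layer: int, shift_scale: float, min_shift: float,
                        weights: str = "stylegan2-car-config-f.pt") -> List[str]:
    """Assemble the SVD-based run_train.py call for one layer."""
    return [
        "python", "run_train.py",
        "--out", f"results_layer{layer}",  # hypothetical naming scheme
        "--gan_type", "StyleGAN2",
        "--gan_weights", weights,
        "--resolution", "512",
        "--shift_predictor_size", "256",
        "--deformator_target", "weight_svd",
        "--deformator_conv_layer_index", str(layer),
        "--directions_count", "64",
        "--shift_scale", str(shift_scale),
        "--shift_weight", "0.0025",
        "--min_shift", str(min_shift),
    ]


# Placeholder values: layer index -> shift_scale to try.
per_layer_scale: Dict[int, float] = {3: 3500, 5: 3000}

commands = [build_train_command(layer, scale, min_shift=300)
            for layer, scale in sorted(per_layer_scale.items())]

for cmd in commands:
    # Print a copy-pasteable shell line; nothing is executed here.
    print(shlex.join(cmd))
```

To launch a run from Python instead of printing it, each list can be passed to `subprocess.run(cmd, check=True)`.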


Evaluation

Here we present the code to visualize controls discovered by the previous steps for:

  • SVD;
  • SVD with optimization (optimization-based);
  • Hessian (spectrum-based);
  • Hessian with optimization (hybrid)

First, import the required modules and load the generator:

from inference import GeneratorWithWeightDeformator
from loading import load_generator

G = load_generator(
    args={'resolution': 512, 'gan_type': 'StyleGAN2'},
    G_weights='stylegan2-car-config-f.pt'
)

Second, modify the GAN parameters using one of the methods below.

SVD-based

G = GeneratorWithWeightDeformator(
    generator=G,
    deformator_type='svd',
    layer_ix=3,
)

Optimization in the SVD basis

G = GeneratorWithWeightDeformator(
    generator=G,
    deformator_type='svd_rectification',
    layer_ix=3,
    checkpoint_path='_svd_based_train_/checkpoint.pt',
)

Hessian's eigenvectors

G = GeneratorWithWeightDeformator(
    generator=G,
    deformator_type='hessian',
    layer_ix=3,
    eigenvectors_path='eigenvectors_layer3_stylegan2-car-config-f.pt',
)

Optimization in the Hessian eigenvectors basis

G = GeneratorWithWeightDeformator(
    generator=G,
    deformator_type='hessian_rectification',
    layer_ix=3,
    checkpoint_path='_hessian_based_train_/checkpoint.pt',
    eigenvectors_path='eigenvectors_layer3_stylegan2-car-config-f.pt',
)
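
The four variants differ only in deformator_type and in which file paths they need. A small helper can make that explicit and catch a missing path before the generator is wrapped. This is a sketch: `deformator_kwargs` and the `REQUIRED_PATHS` table are hypothetical, inferred from the four calls above, and are not part of the repository.

```python
from typing import Any, Dict, Optional, Tuple

# Which path arguments each deformator_type takes in the calls above.
REQUIRED_PATHS: Dict[str, Tuple[str, ...]] = {
    "svd": (),
    "svd_rectification": ("checkpoint_path",),
    "hessian": ("eigenvectors_path",),
    "hessian_rectification": ("checkpoint_path", "eigenvectors_path"),
}


def deformator_kwargs(deformator_type: str, layer_ix: int,
                      checkpoint_path: Optional[str] = None,
                      eigenvectors_path: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for GeneratorWithWeightDeformator,
    failing early when a needed path is missing."""
    if deformator_type not in REQUIRED_PATHS:
        raise ValueError(f"unknown deformator_type: {deformator_type!r}")
    given = {"checkpoint_path": checkpoint_path,
             "eigenvectors_path": eigenvectors_path}
    kwargs: Dict[str, Any] = {"deformator_type": deformator_type,
                              "layer_ix": layer_ix}
    for name in REQUIRED_PATHS[deformator_type]:
        if given[name] is None:
            raise ValueError(f"{deformator_type!r} requires {name}")
        kwargs[name] = given[name]
    return kwargs


kwargs = deformator_kwargs(
    "hessian", layer_ix=3,
    eigenvectors_path="eigenvectors_layer3_stylegan2-car-config-f.pt",
)
# With the real class: G = GeneratorWithWeightDeformator(generator=G, **kwargs)
```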

You can now apply the modified parameters to every element in the batch as follows:

import torch

# Generate some samples
zs = torch.randn((4, 512)).cuda()

# Specify deformation index and shift
direction = 0
shift = 100.0
G.deformate(direction, shift)

# Simply call the generator
imgs_deformated = G(zs)
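
To inspect a direction, it helps to render the same latent codes at several shift values. The sketch below wraps the two calls above in a loop. `sweep_direction` is a hypothetical helper, and `_StubGenerator` exists only so the snippet runs without a GPU or model weights; with the real wrapper you would pass your own `G` and `zs`.

```python
from typing import Any, Iterable, List, Tuple


def sweep_direction(G: Any, zs: Any, direction: int,
                    shifts: Iterable[float]) -> List[Tuple[float, Any]]:
    """Generate outputs for the same zs at each shift along one direction."""
    results = []
    for shift in shifts:
        G.deformate(direction, shift)   # set the parameter shift
        results.append((shift, G(zs)))  # generate with shifted parameters
    return results


class _StubGenerator:
    """Stand-in for illustration; mimics only deformate() and __call__()."""

    def __init__(self) -> None:
        self.current = (None, 0.0)

    def deformate(self, direction: int, shift: float) -> None:
        self.current = (direction, shift)

    def __call__(self, zs: Any) -> List[str]:
        d, s = self.current
        return [f"z{z}|dir{d}|shift{s}" for z in zs]


frames = sweep_direction(_StubGenerator(), zs=[0, 1], direction=0,
                         shifts=[-100.0, 0.0, 100.0])
```

With a real generator you may want to run the loop inside `torch.no_grad()` to avoid storing gradients.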

Saving into a file

You can save the discovered parameter shifts (including layer_ix and data) to a file. To do this:

  1. Modify the GAN parameters in the manner described above;
  2. Call G.save_deformation(path, direction_ix).

Loading from file

First, import the required modules and load the generator:

from inference import GeneratorWithFixedWeightDeformation
from loading import load_generator

G = load_generator(
    args={'resolution': 512, 'gan_type': 'StyleGAN2'},
    G_weights='stylegan2-car-config-f.pt'
)

Second, modify the GAN:

G = GeneratorWithFixedWeightDeformation(generator=G, deformation_path='deformation.pt')

You can now apply the modified parameters to every element in the batch as follows:

import torch

# Generate some samples
zs = torch.randn((4, 512)).cuda()

# Deformate; G.scale is a recommended scale
G.deformate(0.5 * G.scale)

# Simply call the generator
imgs_deformated = G(zs)

Pretrained directions

Annotated generator directions and GIF example sources:
FFHQ: https://www.dropbox.com/s/7m838ewhzgcb3v5/ffhq_weights_deformations.tar
Car: https://www.dropbox.com/s/rojdcfvnsdue10o/car_weights_deformations.tar
Horse: https://www.dropbox.com/s/ir1lg5v2yd4cmkx/horse_weights_deformations.tar
Church: https://www.dropbox.com/s/do9yt3bggmggehm/church_weights_deformations.tar

StyleGAN2 weights: https://www.dropbox.com/s/d0aas2fyc9e62g5/stylegan2_weights.tar
The generator weights are the original model weights converted to PyTorch (see Credits).
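
Judging by their extension, the archives above are .tar files, so after downloading one you can unpack it with the standard library. The sketch below is an assumption about workflow, not part of the repository: `extract_archive` is a hypothetical helper, and since the internal layout of the archives is not described here, it simply returns whatever file names it finds.

```python
import tarfile
from pathlib import Path
from typing import List


def extract_archive(tar_path: str, dest: str) -> List[str]:
    """Extract a tar archive into dest and return the contained file names."""
    dest_dir = Path(dest).resolve()
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(tar_path) as tar:
        members = tar.getmembers()
        # Refuse entries that would be written outside dest_dir.
        for m in members:
            target = (dest_dir / m.name).resolve()
            if target != dest_dir and dest_dir not in target.parents:
                raise ValueError(f"unsafe path in archive: {m.name}")
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support extraction filters.
            tar.extractall(dest_dir, filter="data")
        else:
            tar.extractall(dest_dir)
    return sorted(m.name for m in members if m.isfile())
```

For example, `extract_archive("car_weights_deformations.tar", "deformations/car")` would unpack a downloaded car archive into a local folder of your choice.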

You can find a loading and deformation example in example.ipynb.


Citation

@InProceedings{Navigan_CVPR_2021,
    author    = {Cherepkov, Anton and Voynov, Andrey and Babenko, Artem},
    title     = {Navigating the GAN Parameter Space for Semantic Image Editing},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {3671-3680}
}

Credits

Our code is based on the official implementation of Unsupervised Discovery of Interpretable Directions in the GAN Latent Space:
https://github.com/anvoynov/GANLatentDiscovery
The generator model is built on StyleGAN2-pytorch:
https://github.com/rosinality/stylegan2-pytorch
The generator weights were converted from the original StyleGAN2:
https://github.com/NVlabs/stylegan2
