In-Domain GAN Inversion for Real Image Editing
Figure: Real image editing using the proposed In-Domain GAN inversion with a fixed GAN generator.
Jiapeng Zhu*, Yujun Shen*, Deli Zhao, Bolei Zhou
European Conference on Computer Vision (ECCV) 2020
In this repository, we propose an in-domain GAN inversion method, which not only faithfully reconstructs the input image but also ensures that the inverted code is semantically meaningful for editing. The in-domain GAN inversion consists of two steps:
- Training domain-guided encoder.
- Performing domain-regularized optimization.
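As a rough illustration of the second step, the toy sketch below runs a regularized optimization on scalars. Everything in it is an assumption made for illustration: `generator`, `encoder`, the weight `lambda_domain`, and the two-term loss (reconstruction error plus a term that keeps the code close to what the encoder predicts for the reconstruction) are hypothetical stand-ins, not the networks or loss code in this repository. The perceptual term is omitted for brevity.

```python
def generator(z):
    """Hypothetical toy generator: maps a scalar code to a scalar 'image'."""
    return 2.0 * z + 1.0


def encoder(x):
    """Hypothetical, deliberately imperfect encoder (stand-in for a trained one)."""
    return 0.9 * (x - 1.0) / 2.0


def loss(z, x, lambda_domain):
    """Reconstruction error plus an encoder-based regularizer on the code."""
    x_rec = generator(z)
    reconstruction = (x - x_rec) ** 2
    domain = (z - encoder(x_rec)) ** 2
    return reconstruction + lambda_domain * domain


def invert(x, num_iterations=100, learning_rate=0.05, lambda_domain=2.0):
    """Start from the encoder output, then refine the code by gradient descent."""
    z = encoder(x)
    eps = 1e-4
    for _ in range(num_iterations):
        # Central finite difference keeps the sketch free of autodiff libraries.
        grad = (loss(z + eps, x, lambda_domain)
                - loss(z - eps, x, lambda_domain)) / (2 * eps)
        z -= learning_rate * grad
    return z


if __name__ == "__main__":
    x = 5.0
    z_init = encoder(x)
    z_inv = invert(x)
    print("initial loss:", loss(z_init, x, 2.0))
    print("final loss:  ", loss(z_inv, x, 2.0))
```

The encoder output serves only as the starting point here; the optimization then trades reconstruction quality against staying near codes the encoder would produce.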
NEWS: A PyTorch-friendly version of this repo is also available!
[Paper] [Project Page] [Demo] [Colab]
Testing
Pre-trained Models
Please download the pre-trained models from the following links. Each model contains the GAN generator and discriminator, as well as the proposed domain-guided encoder.
Path | Description |
---|---|
face_256x256 | In-domain GAN trained with FFHQ dataset. |
tower_256x256 | In-domain GAN trained with LSUN Tower dataset. |
bedroom_256x256 | In-domain GAN trained with LSUN Bedroom dataset. |
Inversion
MODEL_PATH='styleganinv_face_256.pkl'
IMAGE_LIST='examples/test.list'
python invert.py $MODEL_PATH $IMAGE_LIST
NOTE: We find that 100 iterations are enough to invert an image, which takes about 8s on a P40 GPU. Users can run more iterations for a more precise reconstruction, at the cost of speed.
Semantic Diffusion
MODEL_PATH='styleganinv_face_256.pkl'
TARGET_LIST='examples/target.list'
CONTEXT_LIST='examples/context.list'
python diffuse.py $MODEL_PATH $TARGET_LIST $CONTEXT_LIST
NOTE: The diffusion process is highly similar to image inversion. The main difference is that only the target patch is used to compute loss for masked optimization.
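The idea of computing the loss only on the target patch can be sketched as a masked pixel loss. This is a minimal illustration, not the code in diffuse.py: `masked_mse` is a hypothetical helper, images are plain nested lists of grayscale values, and the mask is assumed to be 1 inside the target patch and 0 elsewhere.

```python
def masked_mse(image, target, mask):
    """Mean squared error over pixels where mask is non-zero; others are ignored."""
    total, count = 0.0, 0
    for img_row, tgt_row, mask_row in zip(image, target, mask):
        for pixel, goal, keep in zip(img_row, tgt_row, mask_row):
            if keep:
                total += (pixel - goal) ** 2
                count += 1
    if count == 0:
        raise ValueError("mask selects no pixels")
    return total / count


if __name__ == "__main__":
    image = [[0.0, 0.0], [0.0, 0.0]]
    target = [[1.0, 5.0], [1.0, 5.0]]
    mask = [[1, 0], [1, 0]]  # only the left column belongs to the target patch
    print(masked_mse(image, target, mask))
```

Pixels outside the mask contribute nothing to the loss, so the optimization is free to fit them to the surrounding context.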
Interpolation
SRC_DIR='results/inversion/test'
DST_DIR='results/inversion/test'
python interpolate.py $MODEL_PATH $SRC_DIR $DST_DIR
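The internals of the interpolation script are not described here. One common approach, assumed in the sketch below, is to blend two inverted codes linearly and feed each blended code to the generator. `interpolate_codes` is a hypothetical helper that works on plain lists of floats and is not taken from interpolate.py.

```python
def interpolate_codes(src_code, dst_code, num_steps):
    """Return num_steps codes blended linearly from src_code to dst_code."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    if len(src_code) != len(dst_code):
        raise ValueError("codes must have the same length")
    codes = []
    for i in range(num_steps):
        t = i / (num_steps - 1)  # 0.0 at the source, 1.0 at the destination
        codes.append([(1.0 - t) * s + t * d for s, d in zip(src_code, dst_code)])
    return codes


if __name__ == "__main__":
    for code in interpolate_codes([0.0, 2.0], [1.0, 4.0], num_steps=3):
        print(code)
```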
Manipulation
IMAGE_DIR='results/inversion/test'
BOUNDARY='boundaries/expression.npy'
python manipulate.py $MODEL_PATH $IMAGE_DIR $BOUNDARY
NOTE: Boundaries are obtained using InterFaceGAN.
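Editing with a boundary is usually done by moving the inverted code along the boundary's normal direction. The sketch below assumes that formulation. `manipulate_code` is a hypothetical helper, the boundary is assumed to be a unit normal vector stored as a flat list, and the step values are arbitrary.

```python
def manipulate_code(code, boundary, distance):
    """Shift a latent code by `distance` along the boundary normal."""
    if len(code) != len(boundary):
        raise ValueError("code and boundary must have the same length")
    return [c + distance * n for c, n in zip(code, boundary)]


if __name__ == "__main__":
    code = [1.0, 2.0]
    boundary = [0.0, 1.0]  # assumed unit normal of the attribute boundary
    # Sweep from one side of the boundary to the other.
    for distance in (-3.0, 0.0, 3.0):
        print(distance, manipulate_code(code, boundary, distance))
```

A distance of zero leaves the code unchanged, while positive and negative distances push the attribute in opposite directions.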
Style Mixing
STYLE_DIR='results/inversion/test'
CONTENT_DIR='results/inversion/test'
python mix_style.py $MODEL_PATH $STYLE_DIR $CONTENT_DIR
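Style mixing with layer-wise codes can be sketched as copying the codes of selected layers from the style image into the content image's code. This is an illustration under assumptions, not the logic of mix_style.py: `mix_styles` is a hypothetical helper, and which layers count as "style" layers is left to the caller.

```python
def mix_styles(content_codes, style_codes, style_layers):
    """Return content_codes with the layers in style_layers taken from style_codes.

    Both inputs are lists with one code (a list of floats) per generator layer.
    """
    if len(content_codes) != len(style_codes):
        raise ValueError("both inputs must have one code per layer")
    mixed = [list(code) for code in content_codes]  # copy; inputs stay untouched
    for layer in style_layers:
        mixed[layer] = list(style_codes[layer])
    return mixed


if __name__ == "__main__":
    content = [[0.0], [0.0], [0.0], [0.0]]
    style = [[1.0], [1.0], [1.0], [1.0]]
    print(mix_styles(content, style, style_layers=[2, 3]))
```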
Training
The GAN model used in this work is StyleGAN. Beyond the original repository, we make the following changes:
- Change the repeated $w$ for all layers to different $w$s (Lines 428-435 in file training/networks_stylegan.py).
- Add the domain-guided encoder in file training/networks_encoder.py.
- Add losses for training the domain-guided encoder in file training/loss_encoder.py.
- Add the schedule for training the domain-guided encoder in file training/training_loop_encoder.py.
- Add a perceptual model (VGG16) for computing the perceptual loss in file perceptual_model.py.
- Add the training script for the domain-guided encoder in file train_encoder.py.
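To make the first change concrete, the sketch below contrasts a single $w$ repeated for every layer with a separate $w$ per layer. It is only an illustration with hypothetical helper names and a tiny code dimension. The layer count follows the StyleGAN convention of two style inputs per resolution from 4x4 up to the output resolution.

```python
import math


def num_style_layers(resolution):
    """Number of per-layer style inputs for a power-of-two output resolution."""
    log2 = int(math.log2(resolution))
    if 2 ** log2 != resolution or resolution < 4:
        raise ValueError("resolution must be a power of two and at least 4")
    return 2 * log2 - 2


def repeat_code(w, num_layers):
    """Original behaviour: every layer receives a copy of the same w."""
    return [list(w) for _ in range(num_layers)]


if __name__ == "__main__":
    layers = num_style_layers(256)
    repeated = repeat_code([0.5, -0.5], layers)
    # Per-layer variant: each layer owns its code and can differ from the rest.
    per_layer = [[0.5 + 0.1 * i, -0.5] for i in range(layers)]
    print(layers, len(repeated), len(per_layer))
```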
Step-1: Train your own generator
python train.py
Step-2: Train your own encoder
TRAINING_DATA=PATH_TO_TRAINING_DATA
TESTING_DATA=PATH_TO_TESTING_DATA
DECODER_PKL=PATH_TO_GENERATOR
python train_encoder.py $TRAINING_DATA $TESTING_DATA $DECODER_PKL
Note that the file dataset_tool.py, which is borrowed from the StyleGAN repo, is used to prepare a directory of data at all resolutions. The training of the encoder does not rely on the progressive strategy; therefore, both the training data and the test data should be specified as the .tfrecords file with the highest resolution.
BibTeX
@inproceedings{zhu2020indomain,
title = {In-domain GAN Inversion for Real Image Editing},
author = {Zhu, Jiapeng and Shen, Yujun and Zhao, Deli and Zhou, Bolei},
booktitle = {Proceedings of European Conference on Computer Vision (ECCV)},
year = {2020}
}