GAN-Control: Explicitly Controllable GANs

This is a PyTorch implementation of the following paper:

GAN-Control: Explicitly Controllable GANs, ICCV 2021, [paper] [project page].

Alon Shoshan, Nadav Bhonker, Igor Kviatkovsky and Gerard Medioni.

Abstract:
We present a framework for training GANs with explicit control over generated facial images. We are able to control the generated image by setting exact attributes such as age, pose, expression, etc. Most approaches for manipulating GAN-generated images achieve partial control by leveraging the latent space disentanglement properties, obtained implicitly after standard GAN training. Such methods are able to change the relative intensity of certain attributes, but not explicitly set their values. Recently proposed methods, designed for explicit control over human faces, harness morphable 3D face models (3DMM) to allow fine-grained control capabilities in GANs. Unlike these methods, our control is not constrained to 3DMM parameters and is extendable beyond the domain of human faces. Using contrastive learning, we obtain GANs with an explicitly disentangled latent space. This disentanglement is utilized to train control-encoders mapping human-interpretable inputs to suitable latent vectors, thus allowing explicit control. In the domain of human faces we demonstrate control over identity, age, pose, expression, hair color and illumination. We also demonstrate control capabilities of our framework in the domains of painted portraits and dog image generation. We demonstrate that our approach achieves state-of-the-art performance both qualitatively and quantitatively.

Explicitly controlling face attributes such as illumination, pose, expression, hair color and age:

Explicitly controlling painting attributes such as pose, expression, and age:

Changing the artistic style of paintings while maintaining all other attributes:

Explicitly controlling the pose of generated images of dogs:

Inference

Download the trained GAN and save it in resources/gan_models.

Examples of how to explicitly and implicitly control the GAN's generation can be found in notebooks/gan_control_inference_example.ipynb.

Examples include:

Explicitly controlling pose.
Explicitly controlling age.
Explicitly controlling hair color.
Explicitly controlling illumination.
Explicitly controlling expression.
Accessing and implicitly modifying the GAN's latent space.

Training

The training process consists of two phases:

Training a disentangled GAN.
Training control/attribute encoders:
1. Constructing a {control/attribute : w latent} dataset.
2. Training control encoders.

Phase 1: Training a disentangled GAN

Use one of the configs in src/gan_control/configs: ffhq.json for faces, metfaces.json for paintings and afhq.json for dogs.
In the config, edit data_config.path to point to your dataset directory.
Prepare the pretrained predictors (see: Prepare pretrained predictors) and save them in src/gan_control/pretrained_models.
Download the inception statistics (for FID calculations) and save them in src/gan_control/inception_stat.
Run python src/gan_control/train_generator.py --config_path <your config>.

Training results will be saved in <results_dir (given in the config file)>/<save_name (given in the config file)>_<some hyper parms>_<time>. This phase was trained on 4 Nvidia V100 GPUs with a batch size of 16 (batch of 4 per GPU).
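The config edit and launch command from the steps above can be scripted. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes the config is a JSON file whose top-level data_config object holds the path entry, as the key name data_config.path suggests.

```python
import json
from pathlib import Path


def write_config_with_dataset_path(base_config, dataset_dir, out_config):
    """Copy a JSON training config, overriding data_config.path."""
    config = json.loads(Path(base_config).read_text())
    # Assumption: "data_config" is a top-level object holding a "path" entry.
    config.setdefault("data_config", {})["path"] = str(dataset_dir)
    Path(out_config).write_text(json.dumps(config, indent=4))
    return config


def train_generator_command(config_path):
    """Build the phase 1 command as an argument list for subprocess.run."""
    return [
        "python",
        "src/gan_control/train_generator.py",
        "--config_path",
        str(config_path),
    ]
```

Writing a patched copy leaves the shipped config untouched, so the original ffhq.json, metfaces.json or afhq.json can still serve as a reference.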

Phase 2: Training control/attribute encoders

Constructing a {control/attribute : w latent} dataset

Run python src/gan_control/make_attributes_df.py --model_dir <path to the GAN directory from phase 1> --save_path <dir where the dataset will be saved>/<dataset name>.pkl.

The dataset will be saved as a DataFrame at save_path.

Training control encoders

For each attribute you want to control:

Edit the corresponding config from src/gan_control/configs/controller_configs.
1. In: generator_dir write the path to your GAN directory from phase 1.
2. In: sampled_df_path write the path to the {control/attribute : w latent} dataset (path to Dataframe).
Run: python src/gan_control/train_controller.py --config_path <your config from 1>.

The trained controller will be saved in <"results_dir" in config>/<"save_name" in config>. This phase was trained on 1 Nvidia V100 GPU with a batch size of 128.
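Because the same two edits are repeated for every attribute, they can be applied in a loop. The sketch below is illustrative only: the function name is hypothetical, and it assumes that generator_dir and sampled_df_path are top-level keys in each JSON controller config, which should be checked against the actual files in src/gan_control/configs/controller_configs.

```python
import json
from pathlib import Path


def prepare_controller_configs(config_dir, generator_dir, sampled_df_path, out_dir):
    """Write patched copies of every controller config and return the train commands."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    commands = []
    for config_file in sorted(Path(config_dir).glob("*.json")):
        config = json.loads(config_file.read_text())
        # Assumption: both entries are top-level keys of the controller config.
        config["generator_dir"] = str(generator_dir)
        config["sampled_df_path"] = str(sampled_df_path)
        patched = out_dir / config_file.name
        patched.write_text(json.dumps(config, indent=4))
        # One train_controller.py invocation per attribute config.
        commands.append(
            [
                "python",
                "src/gan_control/train_controller.py",
                "--config_path",
                str(patched),
            ]
        )
    return commands
```

Each returned argument list can be passed to subprocess.run, one attribute at a time.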

Faster training using custom CUDA kernels

For faster training, you can follow Rosinality: StyleGAN 2 in PyTorch and add custom CUDA kernels, similar to here, to line 18 in gan_model.py and set FUSED = True in line 15.

Datasets

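This work supports the following datasets: FFHQ (faces), MetFaces (paintings) and AFHQ (dogs), matching the configs ffhq.json, metfaces.json and afhq.json.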

Prepare pretrained predictors

Following are instructions to download and prepare the predictors used for running our code:

ArcFace (ID): Download model_ir_se50.pth from InsightFace_Pytorch.
Hopenet (Pose): Download hopenet_robust_alpha1.pkl from deep-head-pose.
ESR (Expression): Download the directory named esr_9 from Efficient Facial Feature Learning and save it as is in src/gan_control/pretrained_models.
R-Net (Illumination): Download the PyTorch R-Net model from here. This model was converted to PyTorch from the TensorFlow model published by Deep3DFaceReconstruction.
PSPNet (Hair segmentation for hair color): Download pspnet_resnet101_sgd_lr_0.002_epoch_100_test_iou_0.918.pth from pytorch-hair-segmentation.
DogFaceNet (Dog ID): Download the PyTorch DogFaceNet model from here. This model was converted to PyTorch from the TensorFlow model published by DogFaceNet.
DEX (Age):
1. Download the Caffe dex_imdb_wiki.caffemodel model from IMDB-WIKI.
2. Convert the model to PyTorch. You can use this converter.

All predictors should be saved in src/gan_control/pretrained_models.
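Before starting a training run, it can help to confirm that the predictors are in place. The following is a small sketch with a hypothetical helper name. It only checks the file and directory names quoted in the list above; the R-Net, DogFaceNet and converted DEX models are left out because their saved file names are not stated here.

```python
from pathlib import Path

# Names quoted in the instructions above. Models whose saved file names
# are not given there (R-Net, DogFaceNet, converted DEX) are not checked.
EXPECTED_PREDICTORS = [
    "model_ir_se50.pth",  # ArcFace (ID)
    "hopenet_robust_alpha1.pkl",  # Hopenet (Pose)
    "esr_9",  # ESR (Expression), a directory
    "pspnet_resnet101_sgd_lr_0.002_epoch_100_test_iou_0.918.pth",  # PSPNet
]


def missing_predictors(models_dir="src/gan_control/pretrained_models"):
    """Return the expected predictor names that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in EXPECTED_PREDICTORS if not (root / name).exists()]
```

An empty list means all four named predictors were found; anything else lists what still needs to be downloaded.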

Citation

Please consider citing our work if you find it useful for your research:

@InProceedings{Shoshan_2021_ICCV,
    author    = {Shoshan, Alon and Bhonker, Nadav and Kviatkovsky, Igor and Medioni, G\'erard},
    title     = {GAN-Control: Explicitly Controllable GANs},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
}

Acknowledgments

This code borrows heavily from Rosinality: StyleGAN 2 in PyTorch.

This code uses the following models:

ArcFace (ID): InsightFace_Pytorch
Hopenet (Pose): deep-head-pose
ESR (Expression): Efficient Facial Feature Learning
R-Net (Illumination): Deep3DFaceReconstruction
DEX (Age): IMDB-WIKI
PSPNet (Hair segmentation for hair color): pytorch-hair-segmentation
DogFaceNet (Dog ID): DogFaceNet

This code uses face-alignment for face alignment.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.
