StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows (ACM TOG 2021)
See you @ Siggraph 2021
Figure: Sequential edits using StyleFlow
High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes, while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this paper, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both the sub-problems by formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned by attribute features. We evaluate our method using the face and the car latent space of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN generated images. For example, for faces, we vary camera pose, illumination variation, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow to other concurrent works.
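The idea of editing through a conditional continuous normalizing flow can be illustrated with a toy sketch. This is not the StyleFlow model: the 2D state, the hand-written `velocity` function, and the single attribute value are all assumptions made for illustration, and a fixed-step RK4 integrator stands in for the ODE solver. It only shows the mechanism: a latent is carried back to the base space under its original attributes, then carried forward again under the edited attributes.

```python
import math


def velocity(state, t, attrs):
    """Toy conditional dynamics d(state)/dt = f(state, t, attrs).

    Hypothetical stand-in for a learned network; coefficients are arbitrary.
    """
    x, y = state
    a = attrs[0]
    return (math.tanh(0.5 * y + a), math.tanh(-0.5 * x + 0.3 * a * t))


def _step(state, scale, direction):
    """Return state + scale * direction, element-wise."""
    return tuple(s + scale * d for s, d in zip(state, direction))


def integrate(state, attrs, t0, t1, steps=200):
    """Integrate the ODE from t0 to t1 with classical RK4.

    Passing t0 > t1 runs the same flow backwards, which inverts it
    up to numerical error.
    """
    h = (t1 - t0) / steps
    t = t0
    state = tuple(state)
    for _ in range(steps):
        k1 = velocity(state, t, attrs)
        k2 = velocity(_step(state, 0.5 * h, k1), t + 0.5 * h, attrs)
        k3 = velocity(_step(state, 0.5 * h, k2), t + 0.5 * h, attrs)
        k4 = velocity(_step(state, h, k3), t + h, attrs)
        slope = tuple(
            (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
            for s1, s2, s3, s4 in zip(k1, k2, k3, k4)
        )
        state = _step(state, h, slope)
        t += h
    return state


def edit(w, old_attrs, new_attrs):
    """Map w back to the base space under old_attrs, then forward under new_attrs."""
    z = integrate(w, old_attrs, 1.0, 0.0)
    return integrate(z, new_attrs, 0.0, 1.0)
```

Calling `edit(w, attrs, attrs)` with unchanged attributes returns `w` up to integration error, while changing the attribute value moves the latent to a different point.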
Rameen Abdal, Peihao Zhu, Niloy Mitra, Peter Wonka
KAUST, Adobe Research
[Paper] [Project Page] [Demo] [Promotional Video]
Installation
Clone this repo.
git clone https://github.com/RameenAbdal/StyleFlow.git
cd StyleFlow/
This code requires Python 3+, PyTorch, TensorFlow, torchdiffeq, and PyQt5. Install the dependencies with:
conda env create -f environment.yml
StyleGAN2 relies on custom TensorFlow ops that are compiled on the fly using NVCC. To set up the StyleGAN2 generator correctly, follow the Requirements section of the StyleGAN2 repo.
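After creating the environment, a short check like the following can confirm that the listed packages are visible to the interpreter. It is a sketch rather than part of the repo: the module names are the usual import names for these packages, and the helper `missing_modules` is hypothetical. It does not verify versions or the NVCC toolchain.

```python
import importlib.util

# Usual import names for the dependencies listed above (assumed, not taken from the repo).
REQUIRED_MODULES = ["torch", "tensorflow", "torchdiffeq", "PyQt5"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```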
Installation (Docker)
Clone this repo.
git clone https://github.com/RameenAbdal/StyleFlow.git
cd StyleFlow/
You must have CUDA (>=10.0 and <11.0) and nvidia-docker2 installed first.
Then run:
xhost +local:docker # Letting Docker access X server
wget -P stylegan/ http://d36zk2xti64re0.cloudfront.net/stylegan2/networks/stylegan2-ffhq-config-f.pkl
docker-compose up --build # Expect some time before UI appears
When finished, run:
xhost -local:docker
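The CUDA constraint stated for the Docker setup (>=10.0 and <11.0) can be expressed as a small version check. This is only a sketch: the function names are hypothetical, and the version string is assumed to be supplied by the user (for example, read off the output of `nvcc --version`) rather than detected automatically.

```python
def parse_version(text):
    """Turn a version string such as '10.2' or '10.2.89' into (major, minor)."""
    parts = text.strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return (major, minor)


def cuda_version_supported(version, minimum="10.0", below="11.0"):
    """Check minimum <= version < below, matching the range given above."""
    return parse_version(minimum) <= parse_version(version) < parse_version(below)
```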
UI Illustration
Loading images may take 2-3 seconds on the first click. Move the slider smoothly to render results.
Editing Images Using Pretrained Models
- Run the main UI:
python main.py
- Run the Attribute Transfer UI:
python main_attribute.py
Web UI (Beta)
A web-based UI is now also available. See the webui dev branch for setup instructions.
Training New Model
Download the dataset containing sampled StyleGAN2 latents, lighting SH parameters, and other attributes. (Download Here)
Create ./data_numpy/ in the main folder and extract the downloaded data into it, or create your own dataset.
Train your model:
python train_flow.py
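Before starting training, it can help to confirm that ./data_numpy/ exists and actually contains the extracted files. The sketch below assumes nothing about the expected file names, which this README does not list; it only reports what is present. The helper `list_dataset_files` is hypothetical and not part of the repo.

```python
from pathlib import Path


def list_dataset_files(root="./data_numpy"):
    """Return the sorted names of regular files directly inside the dataset folder."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise FileNotFoundError(f"Dataset folder not found: {root_path}")
    files = sorted(p.name for p in root_path.iterdir() if p.is_file())
    if not files:
        raise ValueError(f"Dataset folder is empty: {root_path}")
    return files
```

For example, `print(list_dataset_files())` run from the main folder lists the extracted files, or raises an error if the folder is missing or empty.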
Projection
Our new projection method is currently under review and this section will be updated. Follow the repo for updates: https://github.com/ZPdesu/II2S
License
All rights reserved. Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International). The code is released for academic research use only.
Citation
If you use this research/codebase/dataset, please cite our papers.
@article{10.1145/3447648,
author = {Abdal, Rameen and Zhu, Peihao and Mitra, Niloy J. and Wonka, Peter},
title = {StyleFlow: Attribute-Conditioned Exploration of StyleGAN-Generated Images Using Conditional Continuous Normalizing Flows},
year = {2021},
issue_date = {May 2021},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {40},
number = {3},
issn = {0730-0301},
url = {https://doi.org/10.1145/3447648},
doi = {10.1145/3447648},
abstract = {High-quality, diverse, and photorealistic images can now be generated by unconditional GANs (e.g., StyleGAN). However, limited options exist to control the generation process using (semantic) attributes while still preserving the quality of the output. Further, due to the entangled nature of the GAN latent space, performing edits along one attribute can easily result in unwanted changes along other attributes. In this article, in the context of conditional exploration of entangled latent spaces, we investigate the two sub-problems of attribute-conditioned sampling and attribute-controlled editing. We present StyleFlow as a simple, effective, and robust solution to both the sub-problems by formulating conditional exploration as an instance of conditional continuous normalizing flows in the GAN latent space conditioned by attribute features. We evaluate our method using the face and the car latent space of StyleGAN, and demonstrate fine-grained disentangled edits along various attributes on both real photographs and StyleGAN generated images. For example, for faces, we vary camera pose, illumination variation, expression, facial hair, gender, and age. Finally, via extensive qualitative and quantitative comparisons, we demonstrate the superiority of StyleFlow over prior and several concurrent works. Project Page and Video: https://rameenabdal.github.io/StyleFlow.},
journal = {ACM Trans. Graph.},
month = may,
articleno = {21},
numpages = {21},
keywords = {image editing, Generative adversarial networks}
}
@INPROCEEDINGS{9008515,
author={Abdal, Rameen and Qin, Yipeng and Wonka, Peter},
booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
title={Image2StyleGAN: How to Embed Images Into the StyleGAN Latent Space?},
year={2019},
volume={},
number={},
pages={4431-4440},
doi={10.1109/ICCV.2019.00453}}
Broader Impact
Important: Deep-learning-based facial imagery such as DeepFakes and GAN-generated images can be gravely misused. This can spread misinformation and lead to other offences. The intent of our work is not to promote such practices but to be used in areas such as identification (novel views of a subject, occlusion inpainting, etc.), security (facial composites, etc.), image compression (high-quality video conferencing at lower bitrates, etc.), and the development of algorithms for detecting DeepFakes.
Acknowledgments
This implementation builds upon the awesome work done by Karras et al. (StyleGAN2), Chen et al. (torchdiffeq), and Yang et al. (PointFlow). This work was supported by Adobe Research and the KAUST Office of Sponsored Research (OSR).