• Stars: 119
• Rank: 288,667 (Top 6%)
• Language: Python
• License: MIT License
• Created: about 4 years ago
• Updated: about 3 years ago

Repository Details

A Disentangling Invertible Interpretation Network for Explaining Latent Representations

PyTorch code accompanying the CVPR 2020 paper

A Disentangling Invertible Interpretation Network for Explaining Latent Representations
Patrick Esser*, Robin Rombach*, Björn Ommer
* equal contribution

arXiv | BibTeX | Project Page

Requirements

A suitable conda environment named iin can be created and activated with:

conda env create -f environment.yaml
conda activate iin

Optionally, you can then also install TensorFlow to speed up FID evaluations:

conda install tensorflow-gpu=1.14

Data

MNIST, FashionMNIST, and CIFAR10 will be downloaded automatically the first time they are used; CelebA will prompt you to download it. The content of each dataset can be visualized with

edexplore --dataset iin.data.<dataset>

where <dataset> is one of MNISTTrain, MNISTTest, FashionMNISTTrain, FashionMNISTTest, CIFAR10Train, CIFAR10Test, CelebATrain, CelebATest, FactorCelebATrain, FactorCelebATest, ColorfulMNISTTrain, ColorfulMNISTTest, SingleColorfulMNISTTrain, SingleColorfulMNISTTest.
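
For example, to browse the CelebA training split:

edexplore --dataset iin.data.CelebATrain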

Training

Autoencoders

To train autoencoders, run

edflow -b configs/<dataset>_ae.yaml -t

where <dataset> is one of mnist, fashionmnist, cifar, celeba, cmnist. To enable logging to wandb, adjust configs/project.yaml and add it to the above command:

edflow -b configs/<dataset>_ae.yaml configs/project.yaml -t
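
For example, to train the CelebA autoencoder with wandb logging enabled:

edflow -b configs/celeba_ae.yaml configs/project.yaml -t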

Classifiers

To train a classifier on ColorfulMNIST, run

edflow -b configs/cmnist_clf.yaml -t

Once you have a checkpoint, you can estimate factor dimensionalities using

edflow -b configs/cmnist_clf.yaml configs/cmnist_dimeval.yaml -c <path to .ckpt>

For the pretrained classifier, this gives

[INFO] [dim_callback]: estimated factor dimensionalities: [22, 11, 31]

and to compare this to an autoencoder, run

edflow -b configs/cmnist_ae.yaml configs/cmnist_dimeval.yaml -c <path to cmnist ae .ckpt>

which gives

[INFO] [dim_callback]: estimated factor dimensionalities: [13, 17, 34]

Invertible Interpretation Networks

Unsupervised on AE

To train unsupervised invertible interpretation networks, run

edflow -b configs/<dataset>_iin.yaml [configs/project.yaml] -t

where <dataset> is one of mnist, fashionmnist, cifar, celeba. If, instead of using one of the pretrained models, you trained an autoencoder yourself, adjust the first_stage config section accordingly.
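
As a hypothetical sketch of that override (the key names below are assumptions, not verified against the shipped configs; check configs/<dataset>_iin.yaml for the actual schema):

first_stage:
  # hypothetical key -- replace with the actual entry from the shipped config
  checkpoint: <path to your ae .ckpt>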

Supervised

For supervised, disentangling IINs, run

edflow -b configs/<dataset>_diin.yaml [configs/project.yaml] -t

where <dataset> is one of cmnist or celeba, or run

edflow -b configs/cmnist_clf_diin.yaml [configs/project.yaml] -t

to train a dIIN on top of a classifier, with factor dimensionalities as estimated above (factor dimensionalities can be adjusted via the Transformer/factor_config configuration entry, sketched below).
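
As a sketch of such an adjustment, reusing the dimensionalities estimated above (everything here except the Transformer/factor_config path is an assumed layout; consult configs/cmnist_clf_diin.yaml for the actual keys):

Transformer:
  factor_config:
    # hypothetical key -- the estimated factor dimensionalities from above
    dimensions: [22, 11, 31]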

Evaluation

Evaluations run automatically after each epoch of training. To start an evaluation manually, run

edflow -p logs/<log_folder>/configs/<config>.yaml

and, optionally, add -c <path to checkpoint> to evaluate a specific checkpoint instead of the last one.
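
Putting the two together:

edflow -p logs/<log_folder>/configs/<config>.yaml -c <path to checkpoint>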

Pretrained Models

Download logs.tar.gz (~2.2 GB) and extract the pretrained models:

tar xzf logs.tar.gz

Results

Using spectral normalization for the discriminator, this code slightly improves upon the values reported in Tab. 2 of the paper.

Dataset       Checkpoint  FID
MNIST         105600      5.252
FashionMNIST  110400      9.663
CelebA        84643       19.839
CIFAR10       32000       38.697

Full training logs can be found on Weights & Biases.

BibTeX

@inproceedings{esser2020invertible,
  title={A Disentangling Invertible Interpretation Network for Explaining Latent Representations},
  author={Esser, Patrick and Rombach, Robin and Ommer, Bj{\"o}rn},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

More Repositories

1. stable-diffusion: A latent text-to-image diffusion model (Jupyter Notebook, 64,474 stars)
2. latent-diffusion: High-Resolution Image Synthesis with Latent Diffusion Models (Jupyter Notebook, 10,221 stars)
3. taming-transformers: Taming Transformers for High-Resolution Image Synthesis (Jupyter Notebook, 5,244 stars)
4. adaptive-style-transfer: Source code for the ECCV18 paper "A Style-Aware Content Loss for Real-time HD Style Transfer" (Python, 710 stars)
5. vunet: A generative model conditioned on shape and appearance (Python, 492 stars)
6. geometry-free-view-synthesis: Is a geometric model required to synthesize novel views from a single image? (Python, 356 stars)
7. metric-learning-divide-and-conquer: Source code for the paper "Divide and Conquer the Embedding Space for Metric Learning", CVPR 2019 (Python, 262 stars)
8. net2net: Network-to-Network Translation with Conditional Invertible Neural Networks (Python, 217 stars)
9. image2video-synthesis-using-cINNs: Implementation of Stochastic Image-to-Video Synthesis using cINNs (Python, 179 stars)
10. brushstroke-parameterized-style-transfer: TensorFlow implementation of our CVPR 2021 paper "Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes" (Python, 158 stars)
11. imagebart: ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis (Python, 119 stars)
12. content-style-disentangled-ST: Content and Style Disentanglement for Artistic Style Transfer [ICCV19] (89 stars)
13. retrieval-augmented-diffusion-models: Official codebase for the paper "Retrieval-Augmented Diffusion Models" (Jupyter Notebook, 83 stars)
14. fm-boosting: Boosting Latent Diffusion with Flow Matching (73 stars)
15. unsupervised-disentangling (Python, 54 stars)
16. invariances: Making Sense of CNNs: Interpreting Deep Representations & Their Invariances with Invertible Neural Networks (Python, 52 stars)
17. interactive-image2video-synthesis (Python, 51 stars)
18. ipoke: iPOKE: Poking a Still Image for Controlled Stochastic Video Synthesis (Python, 46 stars)
19. unsupervised-part-segmentation: Code for the GCPR 2020 oral "Unsupervised Part Discovery by Unsupervised Disentanglement" (Jupyter Notebook, 30 stars)
20. instant-lora-composition (29 stars)
21. behavior-driven-video-synthesis (Python, 26 stars)
22. content-targeted-style-transfer: Content Transformation Block for Image Style Transfer [CVPR19] (24 stars)
23. robust-disentangling: Unsupervised Robust Disentangling of Latent Characteristics for Image Synthesis (Python, 23 stars)
24. metric-learning-divide-and-conquer-improved: Source code for the paper "Improving Deep Metric Learning by Divide and Conquer" (Python, 19 stars)
25. cuneiform-sign-detection-dataset: Dataset provided with the article "Deep learning for cuneiform sign detection with weak supervision using transliteration alignment"; it comprises image references, transliterations, and sign annotations of clay tablets from the Neo-Assyrian epoch (Jupyter Notebook, 11 stars)
26. visual-search: Visual search interface (10 stars)
27. magnify-posture-deviations: Unsupervised Magnification of Posture Deviations Across Subjects (8 stars)
28. cuneiform-sign-detection-code: Code for the article "Deep learning of cuneiform sign detection with weak supervision using transliteration alignment" (Jupyter Notebook, 7 stars)
29. hbugen2018: Towards Learning a Realistic Rendering of Human Behavior (7 stars)
30. zigma (7 stars)
31. cuneiform-sign-detection-webapp: Code for the demo web application of the article "Deep learning for cuneiform sign detection with weak supervision using transliteration alignment" (JavaScript, 4 stars)
32. Characterizing_Generalization_in_DML (Python, 3 stars)
33. AutomaticBehaviorAnalysis_NatureComm: Source code and documentation of our Automatic Behavior Analysis software (MATLAB, 3 stars)
34. depth-fm: DepthFM: Fast Monocular Depth Estimation with Flow Matching (Jupyter Notebook, 3 stars)
35. network-fusion (1 star)