[ICML 2021] Official implementation: Intermediate Layer Optimization for Inverse Problems using Deep Generative Models

This repository hosts the official PyTorch implementation of the paper: "Intermediate Layer Optimization for Inverse Problems using Deep Generative Models".

Important Note: The results of this paper have been improved by our ICML 2022 work, Score-Guided Intermediate Layer Optimization: Fast Langevin Mixing for Inverse Problems (code). The newer work achieves better results, especially in the regime of extremely sparse measurements.

Paper: https://arxiv.org/abs/2102.07364

Authored by: Giannis Daras, Joseph Dean (equal contribution), Ajil Jalal, Alexandros G. Dimakis

Colab demo:

Colab

Abstract

We propose Intermediate Layer Optimization (ILO), a novel optimization algorithm for solving inverse problems with deep generative models. Instead of optimizing only over the initial latent code, we progressively change the input layer obtaining successively more expressive generators. To explore the higher dimensional spaces, our method searches for latent codes that lie within a small l1 ball around the manifold induced by the previous layer. Our theoretical analysis shows that by keeping the radius of the ball relatively small, we can improve the established error bound for compressed sensing with deep generative models. We empirically show that our approach outperforms state-of-the-art methods introduced in StyleGAN2 and PULSE for a wide range of inverse problems including inpainting, denoising, super-resolution and compressed sensing.
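The two-stage idea in the abstract can be illustrated with a toy one-dimensional example. This is only a sketch under strong simplifying assumptions, not the repository's code: the "generator" is the hypothetical composition `g2(g1(z))`, the functions `optimize_latent` and `optimize_intermediate` are made up for illustration, and in one dimension the l1 ball is just an interval.

```python
import math


def g1(z):
    """First 'layer' of the toy generator: squashes the latent into (-1, 1)."""
    return math.tanh(z)


def g2(h):
    """Remaining 'layers' of the toy generator: a fixed linear map."""
    return 3.0 * h


def optimize_latent(y, steps=500, lr=0.05):
    """Stage 1: gradient descent on (g2(g1(z)) - y)^2 over the latent z only."""
    z = 0.0
    for _ in range(steps):
        h = g1(z)
        # Chain rule: d/dz (3*tanh(z) - y)^2 = 2 * residual * 3 * (1 - tanh(z)^2)
        grad = 2.0 * (g2(h) - y) * 3.0 * (1.0 - h * h)
        z -= lr * grad
    return z


def optimize_intermediate(y, h_init, radius, steps=500, lr=0.05):
    """Stage 2: optimize the intermediate value h directly, staying within
    `radius` of the value reached by stage 1 (a 1-D ball is an interval)."""
    h = h_init
    for _ in range(steps):
        grad = 2.0 * (g2(h) - y) * 3.0
        h -= lr * grad
        # Project back onto the ball around the stage-1 solution.
        h = min(max(h, h_init - radius), h_init + radius)
    return h


if __name__ == "__main__":
    target = 3.6  # outside the range (-3, 3) of the full toy generator
    z_star = optimize_latent(target)
    h_star = optimize_intermediate(target, g1(z_star), radius=0.3)
    print("error after latent-only stage:", abs(g2(g1(z_star)) - target))
    print("error after intermediate stage:", abs(g2(h_star) - target))
```

The target lies outside the range of the full toy generator, so optimizing `z` alone cannot fit it. Optimizing the intermediate value within a small ball around the first-stage solution can.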

Results

Inpainting

Super-resolution

Denoising

Morphing using a robust classifier

Installation

First, install the Python dependencies by running pip install -r requirements.txt.

Next, download dependency files:

gdown --id 1c1qtz3MVTAvJpYvsMIR5MoSvdiwN2DGb (shape predictor)

gdown --id 1JCBiKY_yUixTa6F1eflABL88T4cii2GR (stylegan pre-trained checkpoint)

If you don't have gdown installed, run: pip install gdown first.

Examples

Image Preprocessing

Our prepare_image.py script offers some basic image preprocessing utilities. A basic config for the script is given in configs/preprocess.yaml. To change the image preprocessing task or to stack various preprocessing operations together, you need to adjust the arguments of the script. You can do that by either adjusting the file directly or by passing the corresponding CLI arguments as shown in the following examples.

Interactive masking

You can use this tool to interactively create masks for your images.

Example command:

python prepare_image.py preprocessing=\[interactive_mask\] input_files=\[files/original/turing.png,files/original/mona_lisa.png\]

Hard coded mask

You can hard code the location of the mask.

Example command:

python prepare_image.py preprocessing=\[mask\] input_files=\[files/original/mona_lisa.png\] bounding_box.horizontal=\[100,1000\] bounding_box.vertical=\[100,1000\]

Random mask

Remove pixels at random, keeping only the given percentage(s) of pixels observed, to set up a random inpainting task.

Example command:

python prepare_image.py preprocessing=\[remove_pixels\] input_files=\[files/original/mona_lisa.png\] observed_percentage=\[10,20,30\] per_input=3
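To make the meaning of observed_percentage concrete, here is a minimal sketch of random pixel removal on a small grayscale image stored as nested lists. The helper names `random_observation_mask` and `apply_mask` are hypothetical, and the script's actual implementation may differ (for example in how it samples pixels or handles color channels).

```python
import random


def random_observation_mask(height, width, observed_percentage, seed=0):
    """Return a height x width mask with 1 for observed pixels and 0 for
    removed ones, keeping roughly `observed_percentage` percent of pixels."""
    rng = random.Random(seed)
    total = height * width
    num_observed = round(total * observed_percentage / 100)
    observed = set(rng.sample(range(total), num_observed))
    return [
        [1 if row * width + col in observed else 0 for col in range(width)]
        for row in range(height)
    ]


def apply_mask(image, mask):
    """Zero out (blacken) every pixel whose mask entry is 0."""
    return [
        [pixel * keep for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


if __name__ == "__main__":
    image = [[0.5] * 10 for _ in range(10)]
    mask = random_observation_mask(10, 10, observed_percentage=30)
    corrupted = apply_mask(image, mask)
    print(sum(map(sum, mask)), "of 100 pixels observed")
```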

Automatically align images

Example command:

python prepare_image.py preprocessing=\[align\] input_files=\[files/original/mona_lisa.png\]

Chain operations

Example command:

python prepare_image.py preprocessing=\[interactive_mask,remove_pixels\] input_files=\[files/original/mona_lisa.png\]
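The commands above pass list values as key=[a,b] overrides, with the brackets escaped for the shell. When launching the script from Python, one option is to build an argument list and skip the shell entirely, so no escaping is needed. The sketch below assumes only the override syntax shown in the examples; `format_override` and `build_command` are hypothetical helpers, not part of this repository.

```python
import sys


def format_override(key, value):
    """Format one key=value override in the style used by the example commands."""
    if isinstance(value, bool):
        return f"{key}={'true' if value else 'false'}"
    if isinstance(value, (list, tuple)):
        return f"{key}=[{','.join(str(item) for item in value)}]"
    return f"{key}={value}"


def build_command(script, overrides):
    """Return an argument list suitable for subprocess.run (no shell involved)."""
    return [sys.executable, script] + [
        format_override(key, value) for key, value in overrides.items()
    ]


if __name__ == "__main__":
    command = build_command(
        "prepare_image.py",
        {
            "preprocessing": ["mask"],
            "input_files": ["files/original/mona_lisa.png"],
            "bounding_box.horizontal": [100, 1000],
            "bounding_box.vertical": [100, 1000],
        },
    )
    print(command)
    # To actually run it from the repository root:
    # import subprocess; subprocess.run(command, check=True)
```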

Solving inverse problems

The script main.py can be used for inpainting, super-resolution, denoising, out-of-distribution generation, and more. By default, the script is configured by the file configs/config.yaml.

A config file should have the following form:

# configures the working directory of hydra.
# If using relative paths, the working directory should be the root folder of ILO.
hydra:
  run:
    dir: '.'

# only StyleGAN is supported for now.
model_type: 'stylegan'

stylegan:
  seed: 42
  device: cuda

  # The pre-trained StyleGAN checkpoint
  ckpt: stylegan2-ffhq-config-f.pt

  ## weights of different losses
  geocross: 0.01
  mse: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
  pe: [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
  dead_zone_linear: [1.0, 1.0, 1.0, 1.0]
  # tolerance of dead zone linear function
  dead_zone_linear_alpha: 0.1
  # Additional loss term to a reference image.
  # Leave it at 0, unless you know what you are doing.
  reference_loss: 0.0

  # hack for making lpips work
  lpips_method: 'fill'
  # classifier used in LPIPS loss
  cls_name: vgg16


  ## Task specific

  # circulant matrices
  fast_compress: false
  observed_percentage: 80

  # can be decreased for super-resolution
  image_size:
    - 1024
    - 1024

  # controls whether we want to match black pixels. Leave to true for inpainting.
  mask_black_pixels: true

  ## Optimization

  # how many steps to run per layer
  steps: '1000, 1000, 1000, 1000, 1000, 1000'
  # whether to project latents to unit ball
  project: true
  # learning rate per layer
  lr: [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]
  # whether to schedule per layer or in total
  lr_same_pace: false
  # Which layer to start optimizing from. Use 0, unless there are saved noises.
  # If you want to skip optimization in some layers, just use 0 to the corresponding indices of steps.
  start_layer: 0
  end_layer: 8
  # Whether to restore opt. variables from previous run.
  restore: false
  # paths to previous opt. variables
  saved_noises:
    - files/noises.pt
    - files/latent.pt
    - files/gen_outs.pt
  # projections
  do_project_gen_out: false
  do_project_noises: false
  do_project_latent: false
  max_radius_gen_out: [1000, 1000, 6000]
  max_radius_noises: [1000, 1000, 6000]
  max_radius_latent: [100, 1000, 6000]

  # specific to video. Leave it as is, unless you know what you are doing.
  is_video: false
  max_frame_radius_gen_out: [200]
  max_frame_radius_noises: [5]
  max_frame_radius_latent: [200]
  video_freq: 30
  per_frame_steps: '100'

  ## files
  # If is_sequence=True, then the input_files and the output_files should be directories.
  is_sequence: false
  input_files:
    - files/original/turing.png
  output_files:
    - files/inpainting/turing.png

  ## specific to datasets
  dataset_type: CelebaHQ
  # if is_dataset=true, then we are sampling from a dataset instead of having fixed files.
  is_dataset: false
  # how many samples to get from dataset. Is activated only when is_dataset=true
  num_dataset: 1
  # extension of dataset files (for glob)
  files_ext: '.png'

  # logging
  # if save_latent=True, then the optimization variables will be saved and can be used in later run.
  save_latent: false
  # if true, intermediate frames from optimization are saved.
  save_gif: false
  # determines how often we save intermediate frames. Activated only if save_gif=True.
  save_every: 50
  # determines whether the final generated image is the one with the lowest MSE to a reference image.
  save_on_ref: false

The role of each parameter is explained in the comments. Below, we give some example commands for different inverse problems. For all the examples we are using the default value for input_files, but feel free to change it by passing the appropriate CLI argument (or by changing the config file) to run it with your own images.
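As a small illustration of the steps setting, the sketch below parses a comma-separated steps string like the one in the config and reports which indices would be optimized, following the config comment that a value of 0 skips a layer. The helpers `parse_steps` and `active_layers` are hypothetical. How main.py itself parses this field is not shown here, so treat this as an assumption about its semantics.

```python
def parse_steps(steps):
    """Turn a steps setting such as '1000, 0, 500' (or a list) into a list of ints."""
    if isinstance(steps, str):
        return [int(token) for token in steps.split(",") if token.strip()]
    return [int(value) for value in steps]


def active_layers(steps):
    """Indices whose step count is non-zero, i.e. layers that are not skipped."""
    return [index for index, count in enumerate(parse_steps(steps)) if count > 0]


if __name__ == "__main__":
    steps = "1000, 0, 500, 0"
    print(parse_steps(steps))    # per-layer step counts
    print(active_layers(steps))  # layers that would actually be optimized
```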

For any of these tasks, you can either enable or disable l1 projections. Enabling l1 projections helps control how close the generated images are to the range of StyleGAN. The caveat is that tuning the radii of these projections can be laborious. For most settings, you will get good results with projections disabled. Projections are most useful for out-of-distribution generation; see the out-of-distribution generation example for how to use them.
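For intuition about what an l1 projection does, here is a minimal pure-Python sketch of the standard sort-based Euclidean projection onto an l1 ball, plus a variant that projects around a reference point. The function names are hypothetical, and the repository's own projection code (operating on tensors and controlled by the max_radius_* settings) may be implemented differently.

```python
import math


def project_onto_l1_ball(v, radius):
    """Euclidean projection of the vector `v` onto {x : ||x||_1 <= radius}."""
    if radius <= 0:
        raise ValueError("radius must be positive")
    abs_v = [abs(x) for x in v]
    if sum(abs_v) <= radius:
        return list(v)  # already inside the ball, nothing to do
    # Find the soft-threshold theta using the sorted magnitudes.
    sorted_abs = sorted(abs_v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, value in enumerate(sorted_abs, start=1):
        cumulative += value
        candidate = (cumulative - radius) / k
        if value - candidate > 0:
            theta = candidate
    # Shrink every magnitude by theta and restore the original signs.
    return [math.copysign(max(a - theta, 0.0), x) for a, x in zip(abs_v, v)]


def project_around_center(z, center, radius):
    """Project `z` onto the l1 ball of the given radius centered at `center`."""
    delta = project_onto_l1_ball([zi - ci for zi, ci in zip(z, center)], radius)
    return [ci + di for ci, di in zip(center, delta)]


if __name__ == "__main__":
    print(project_onto_l1_ball([3.0, -1.0, 0.5], radius=2.0))
    print(project_around_center([4.0, 0.0], center=[1.0, 0.0], radius=1.0))
```

A smaller radius keeps the optimized variables closer to the reference point, which matches the trade-off described above: tighter projections stay nearer the generator's range but need more careful tuning.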

Inpainting

For inpainting, we typically disable the perceptual loss (stylegan.pe) and use only the MSE loss (stylegan.mse). If enough of the image is observed, the perceptual (LPIPS) loss can be enabled as well.

Example command:

python main.py stylegan.mse=\[1.0,1.0,1.0,1.0\] stylegan.pe=\[0.0,0.0,0.0,0.0\] stylegan.steps=\[25,25,25,25\]
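Conceptually, an inpainting loss compares the generated image with the input only where pixels were observed. The following is a minimal sketch of such a masked MSE on nested lists. The function `masked_mse` is hypothetical, and how main.py actually builds its mask (for example from black pixels when mask_black_pixels is true) is an assumption not verified here.

```python
def masked_mse(generated, observed, mask):
    """Mean squared error over the pixels where mask == 1; masked-out
    (missing) pixels do not contribute to the loss."""
    total, count = 0.0, 0
    for gen_row, obs_row, mask_row in zip(generated, observed, mask):
        for gen_pixel, obs_pixel, keep in zip(gen_row, obs_row, mask_row):
            if keep:
                total += (gen_pixel - obs_pixel) ** 2
                count += 1
    return total / count if count else 0.0


if __name__ == "__main__":
    generated = [[1.0, 5.0], [2.0, 2.0]]
    observed = [[1.0, 0.0], [0.0, 2.0]]   # zeros mark removed pixels
    mask = [[1, 0], [0, 1]]
    print(masked_mse(generated, observed, mask))
```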

Super-resolution

For super-resolution, you only need to specify stylegan.image_size. If the original image has a higher resolution, BicubicDownSample will be used to downscale it prior to inversion.

Example command:

python main.py stylegan.image_size=\[256,256\] stylegan.mse=\[1.0,1.0,1.0,1.0\] stylegan.pe=\[1.0,1.0,1.0,1.0\] stylegan.steps=\[25,25,25,25\]

If using LPIPS, make sure that your image is at least 256x256.
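In super-resolution, the measurement is a downsampled version of the generated image, and the loss is computed at the low resolution. The sketch below uses simple box (average-pooling) downsampling as a stand-in. It is not the bicubic filter the repository uses, and `box_downsample` is a hypothetical helper meant only to show the shape of the operation.

```python
def box_downsample(image, factor):
    """Downsample a 2-D image (nested lists) by averaging factor x factor blocks.
    This is a simple stand-in for bicubic downsampling."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by the factor")
    output = []
    for row in range(0, height, factor):
        output_row = []
        for col in range(0, width, factor):
            block = [
                image[row + i][col + j]
                for i in range(factor)
                for j in range(factor)
            ]
            output_row.append(sum(block) / len(block))
        output.append(output_row)
    return output


if __name__ == "__main__":
    high_res = [[1, 3, 0, 0], [5, 7, 0, 0], [2, 2, 4, 4], [2, 2, 4, 4]]
    print(box_downsample(high_res, 2))
```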

Compressed sensing with circulant matrices

For compressed sensing with partial circulant matrices, we need to enable the stylegan.fast_compress option and specify stylegan.observed_percentage.

Example command:

python main.py stylegan.mse=\[1.0,1.0,1.0,1.0\] stylegan.pe=\[0.0,0.0,0.0,0.0\] stylegan.steps=\[300,200,100\] stylegan.fast_compress=1 stylegan.observed_percentage=50
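A partial circulant measurement can be sketched as follows: flip the signs of the signal at random, circularly convolve it with a random vector, and keep only a subset of the outputs. This is a small pure-Python illustration with hypothetical function names. It uses a direct O(n^2) convolution for clarity, and the details (sign flips, Gaussian entries, how rows are chosen) are assumptions rather than a description of what stylegan.fast_compress does internally.

```python
import random


def circular_convolve(x, h):
    """Multiply x by the circulant matrix whose first column is h."""
    n = len(x)
    return [sum(h[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]


def partial_circulant_measure(x, observed_percentage, seed=0):
    """Return (measurements, kept_rows) for a random partial circulant operator
    that keeps roughly `observed_percentage` percent of the outputs."""
    rng = random.Random(seed)
    n = len(x)
    first_column = [rng.gauss(0.0, 1.0) for _ in range(n)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]
    num_rows = max(1, round(n * observed_percentage / 100))
    kept_rows = sorted(rng.sample(range(n), num_rows))
    flipped = [sign * value for sign, value in zip(signs, x)]
    full = circular_convolve(flipped, first_column)
    return [full[i] for i in kept_rows], kept_rows


if __name__ == "__main__":
    signal = [1.0, 0.0, 2.0, -1.0, 0.5, 0.0, 3.0, 1.0]
    measurements, rows = partial_circulant_measure(signal, observed_percentage=50)
    print(len(measurements), "measurements from rows", rows)
```

In practice a circulant product is usually computed with an FFT, which is what makes this kind of operator fast for full-resolution images.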

Acknowledgments and License

We use the StyleGAN-2 PyTorch implementation of the following repository: https://github.com/rosinality/stylegan2-pytorch. We wholeheartedly thank the author for open-sourcing this implementation.

The PyTorch implementation is based on the official Tensorflow implementation: https://github.com/NVlabs/stylegan2. We are grateful to the authors of StyleGAN-2 for their work and their open-sourced code and models.

Please refer to the license files in the two repositories linked above.
