  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created over 8 years ago
  • Updated almost 8 years ago

Repository Details

A parallel implementation of the Deep Dream image processing algorithm which is able to process arbitrarily large images.

deep_dream

An implementation of the Deep Dream image processing algorithm which is able to process large (wallpaper-sized) images despite GPU or main memory limits. It is also able to use multiple processes to take advantage of several CPUs and/or GPUs.

This implementation divides the gradient ascent step into tiles when an image is too large to process in one piece. By default, any image larger than 512x512 is divided into tiles no larger than 512x512. The tile seams are obscured by applying a random shift on each gradient ascent step (this also greatly improves image quality by summing over the translation dependence inherent to the neural network architecture). Several tiles can also be processed simultaneously on machines with more than one compute device (CPU or GPU).
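The tiling and random-shift idea described above can be sketched with NumPy. This is a minimal illustration, not the repository's actual code: `tile_slices`, `tiled_ascent_step`, and `grad_fn` are hypothetical names, the wrap-around shift via `np.roll` and the mean-magnitude gradient normalization are assumptions, and `grad_fn` stands in for whatever computes the network gradient for one tile.

```python
import numpy as np

def tile_slices(height, width, max_tile=512):
    """Yield (row_slice, col_slice) pairs that cover the image with tiles
    no larger than max_tile on either side."""
    ny = -(-height // max_tile)  # ceiling division: number of tile rows
    nx = -(-width // max_tile)   # number of tile columns
    ys = np.linspace(0, height, ny + 1).astype(int)
    xs = np.linspace(0, width, nx + 1).astype(int)
    for y0, y1 in zip(ys[:-1], ys[1:]):
        for x0, x1 in zip(xs[:-1], xs[1:]):
            yield slice(y0, y1), slice(x0, x1)

def tiled_ascent_step(img, grad_fn, step_size=1.0, max_tile=512, rng=None):
    """One gradient ascent step computed tile by tile.

    grad_fn (hypothetical) maps a tile to a gradient of the same shape.
    The image is rolled by a random offset first, so tile seams land in a
    different place on every step, then rolled back afterwards.
    """
    rng = np.random.default_rng() if rng is None else rng
    h, w = img.shape[:2]
    dy, dx = rng.integers(0, max_tile, size=2)
    shifted = np.roll(img, (dy, dx), axis=(0, 1))
    grad = np.zeros_like(shifted)
    for ys, xs in tile_slices(h, w, max_tile):
        grad[ys, xs] = grad_fn(shifted[ys, xs])
    # Normalize by the mean gradient magnitude (an assumed convention).
    shifted = shifted + step_size * grad / (np.abs(grad).mean() + 1e-8)
    return np.roll(shifted, (-dy, -dx), axis=(0, 1))
```

Because each tile's gradient is independent of the others within a step, the loop body is also the natural unit of work to hand to separate CPU or GPU workers.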

  1. Usage
  2. Example
  3. CNN.dream_guided() example
  4. Models
  5. Pre-built Amazon image
  6. Requirements
  7. Python 3.5 build tips

Usage

Usage: deep_dream_cli.py [OPTIONS] IN_FILE [OUT_FILE]

  CLI interface to deep_dream.

Options:
  --cpu-workers INTEGER        The number of CPU workers to start.
  --gpus INTEGER_LIST          The CUDA device IDs to use.
  --guide-image TEXT           The guide image to use.
  --layers RE_LIST             The network layers to target.
  --max-input-size INTEGER...  Rescale the input image to fit into this size.
  --max-tile-size INTEGER      The maximum dimension of a tile.
  --min-size INTEGER           Don't use scales where the small edge of the
                               image is below this.
  --model TEXT                 The model to use. Valid values: GOOGLENET_BVLC,
                               GOOGLENET_PLACES205, GOOGLENET_PLACES365,
                               RESNET_50.
  --n INTEGER                  The number of iterations per scale.
  --per-octave INTEGER         The number of scales per octave.
  --smoothing FLOAT            The per-iteration smoothing factor. Try
                               0.02-0.1.
  --step-size FLOAT            The strength of each iteration.
  --tv-weight FLOAT            The per-scale denoising weight. Higher values
                               smooth the image less. Try 25-250.
  --help                       Show this message and exit.
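As a rough illustration of how --min-size and --per-octave might interact, the sketch below builds a list of image sizes that shrink by a factor of 2**(1/per_octave) per step and stops once the small edge would fall below the minimum. This is one reading of the option descriptions above, not code taken from deep_dream; `scale_sizes` is a hypothetical helper and the rounding rule is an assumption.

```python
def scale_sizes(width, height, min_size=128, per_octave=2):
    """Return (w, h) sizes from smallest to largest.

    Each step down shrinks the image by 2**(1/per_octave), so per_octave
    steps halve it. Sizes whose small edge is below min_size are skipped.
    """
    if min_size < 1 or per_octave < 1:
        raise ValueError('min_size and per_octave must be at least 1')
    sizes = []
    k = 0
    while True:
        factor = 2 ** (-k / per_octave)
        w, h = round(width * factor), round(height * factor)
        if min(w, h) < min_size:
            break
        sizes.append((w, h))
        k += 1
    return sizes[::-1]  # process the smallest scale first

sizes = scale_sizes(768, 512, min_size=64, per_octave=4)
print(len(sizes), sizes[0], sizes[-1])
```

Under these assumptions, a 768x512 input with a minimum size of 64 and 4 scales per octave spans three octaves, from 96x64 up to the full 768x512.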

Example

import deep_dream as dd
from PIL import Image

cnn = dd.CNN(dd.GOOGLENET_PLACES365, gpus=[0])
img = Image.open('kodim/img0022.jpg').resize((768, 512), Image.LANCZOS)

out = cnn.dream(img, 'inception_4a/output', min_size=64, per_octave=4, n=8, step_size=0.5, smoothing=0.02)
dd.to_image(out).save('example_med.jpg', quality=85)

out = cnn.dream(img, 'inception_4a/output', min_size=64, per_octave=4, n=12, step_size=1.2, smoothing=0.01)
dd.to_image(out).save('example_out.jpg', quality=85)

CNN.dream_guided() example

[Images: input, guide, and combined output]

Gradient ascent was performed using layers inception_(3a-b, 4a-e, 5a-b)/output. This is a reasonable set of layers for dream_guided() to work well. Note that the input and the guide do not have to be the same size; the output will be the same size as the input.
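The layer set named above can be written as a single regular expression, which is the kind of value the --layers option (RE_LIST) appears to expect. The sketch below only demonstrates the pattern with Python's re module; the sample layer names are a small hand-picked list of GoogLeNet layers for illustration, and whether deep_dream matches with a full or partial match is an assumption.

```python
import re

# Matches inception_3a-3b, 4a-4e, and 5a-5b output layers.
LAYER_RE = re.compile(r'inception_(3[ab]|4[a-e]|5[ab])/output')

# Sample layer names for illustration only; a real model has many more.
layer_names = [
    'conv1/7x7_s2',
    'inception_3a/1x1',
    'inception_3a/output',
    'inception_4e/output',
    'inception_5b/output',
    'pool5/7x7_s1',
    'loss3/classifier',
]

# fullmatch is assumed here; it keeps sub-layers such as
# 'inception_3a/1x1' out of the selection.
selected = [name for name in layer_names if LAYER_RE.fullmatch(name)]
print(selected)
```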

Models

Locations of pre-trained .caffemodel files (run get_models.sh to automatically download them):

  • bvlc_googlenet: tends toward visualizing abstract patterns, dogs, insects, and amorphous creatures.
  • googlenet_places205: tends toward visualizing buildings and landscapes.
  • googlenet_places365: newer than the places205-trained model, often more aesthetically pleasing output, tends toward visualizing buildings and landscapes.

Pre-built Amazon image

This AMI in us-west-2 contains deep_dream with all dependencies preinstalled and built for Python 3.5, and all models downloaded. It should be launched on a g2.2xlarge or g2.8xlarge instance. These instance types have 1 and 4 GPUs respectively. On a g2.8xlarge, you can use all four GPUs from deep_dream_cli.py or deep_dream_test.py by specifying the parameter --gpus 0,1,2,3.

Requirements

  • Python 3.5.
  • Caffe, built against Python 3.5. (See the Python 3.5 build tips.) I would encourage you to use Caffe's NVIDIA GPU support if possible: it runs several times faster on even a laptop GPU (GeForce GT 750M) than on the CPU.
  • The contents of requirements.txt. (pip install -U -r requirements.txt)
    • openexrpython needs to be installed from git master instead of 1.2.0 from PyPI for optional OpenEXR export. (pip install -U git+https://github.com/jamesbowman/openexrpython)
  • Pre-trained Caffe models (run get_models.sh; see Models section).

Python 3.5 build tips

You will need protobuf 3 (currently in beta) for its Python 3 compatibility: 2.x will not work! Check out protobuf and build/install both the main protobuf package (C++/protoc) and the Python module in python/. Do this before attempting to build Caffe.

Linux (Tested on Ubuntu 16.04 LTS)

  • First see the Ubuntu 15.10/16.04 installation guide on the Caffe GitHub wiki.

  • Python 3.5 Makefile.config settings, with python3.5 installed via apt-get:

    PYTHON_INCLUDE := /usr/include/python3.5m \
            /usr/local/lib/python3.5/dist-packages/numpy/core/include
    PYTHON_LIB := /usr/lib
    PYTHON_LIBRARIES := boost_python-py35 python3.5m
  • I used openblas in this configuration. MKL is probably faster in CPU mode.

OS X (Tested on El Capitan 10.11)

  • Python 3.5 Makefile.config settings, with python3 installed through homebrew:

    PYTHON_DIR := /usr/local/opt/python3/Frameworks/Python.framework/Versions/3.5
    PYTHON_INCLUDE := $(PYTHON_DIR)/include/python3.5m \
            /usr/local/lib/python3.5/site-packages/numpy/core/include
    PYTHON_LIB := $(PYTHON_DIR)/lib
    PYTHON_LIBRARIES := boost_python3 python3.5m
  • This assumes you installed numpy with pip into the python3.5 system site-packages directory. If you're in a virtualenv this may change.

  • Leave the BLAS setting at atlas, unless you want to try MKL (faster in CPU mode). Recent OS X ships with an optimized multithreaded BLAS so there is little reason IMO to use openblas anymore.
