  • Stars: 442
  • Rank: 98,677 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created almost 4 years ago
  • Updated over 1 year ago

Repository Details

Neural style transfer in PyTorch.

style-transfer-pytorch

An implementation of neural style transfer (A Neural Algorithm of Artistic Style) in PyTorch, supporting CPUs and Nvidia GPUs. It performs automatic multi-scale (coarse-to-fine) stylization to produce high-quality, high-resolution stylizations, even up to print resolution if the GPUs have sufficient memory. If two GPUs are available, both can be used to increase the maximum resolution. (Using two GPUs is not faster than using one.)

The algorithm has been modified from that in the literature by:

  • Using the PyTorch pre-trained VGG-19 weights instead of the original VGG-19 weights

  • Changing the padding mode of the first layer of VGG-19 to 'replicate', to reduce edge artifacts

  • When using average or L2 pooling, scaling the result by an empirically derived factor to ensure that the magnitude of the result stays the same on average (Gatys et al. (2015) did not do this)

  • Using Wasserstein-2 style loss

  • Taking an exponential moving average over the iterates to reduce iterate noise (each new scale is initialized with the previous scale's averaged iterate)

  • Warm-starting the Adam optimizer with scaled-up versions of its first and second moment buffers at the beginning of each new scale, to prevent noise from being added to the iterates at the beginning of each scale

  • Using non-equal weights for the style layers to improve visual quality

  • Stylizing the image at progressively larger scales, each greater by a factor of sqrt(2) (this is improved from the multi-scale scheme given in Gatys et al. (2016))
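Two of the modifications listed above are easy to illustrate in isolation: the sqrt(2) scale schedule and the exponential moving average over iterates. The following is a minimal sketch, not the repository's actual code. The names gen_scales and EMA, the starting size of 128, the decay of 0.9 and the use of bias correction are assumptions made for illustration, and plain floats stand in for image tensors.

```python
def gen_scales(start, end):
    """Return image sizes from `start` up to `end`, each about sqrt(2)
    times the previous one. Sizes are derived from `end` so that the
    final scale is exactly the requested output size."""
    scales = set()
    i = 0
    scale = end
    while scale >= start:
        scales.add(scale)
        i += 1
        scale = round(end / 2 ** (i / 2))  # divide by sqrt(2) once more
    return sorted(scales)


class EMA:
    """Exponential moving average with initialization bias correction
    (an assumed detail, so early averages are not biased toward zero)."""

    def __init__(self, decay):
        self.decay = decay
        self.value = 0.0
        self.accum = 1.0  # decay ** (number of updates so far)

    def update(self, x):
        self.accum *= self.decay
        self.value = self.decay * self.value + (1 - self.decay) * x

    def get(self):
        return self.value / (1 - self.accum)


print(gen_scales(128, 512))  # [128, 181, 256, 362, 512]

ema = EMA(decay=0.9)
for iterate in [1.0, 1.2, 0.8, 1.1, 0.9]:  # stand-ins for noisy iterates
    ema.update(iterate)
print(ema.get())  # a smoothed value close to 1.0
```

In a real run, the averaged iterate from one scale would be upsampled and used to initialize the next, larger scale.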

Example outputs

Installation

Python 3.6+ is required.

PyTorch is required: follow its installation instructions before proceeding. If you do not have an Nvidia GPU, select None for CUDA. On Linux, you can find your CUDA version with the nvidia-smi command. PyTorch packages for CUDA versions lower than yours will work, but select the highest version you can.

To install style-transfer-pytorch, first clone the repository, then run the command:

pip install -e PATH_TO_REPO

This will install the style_transfer CLI tool. style_transfer uses a pre-trained VGG-19 model (Simonyan et al.), which is 548MB in size, and will download it when first run.

If you have a supported GPU and style_transfer is using the CPU, try the argument --devices cuda:0 to force it to try the first CUDA GPU. If the GPU cannot be used, this should print an informative error message.
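Before forcing a device, it can help to check what PyTorch itself can see. The sketch below is independent of style_transfer and only uses standard torch.cuda queries; the function name describe_cuda_support is hypothetical.

```python
import importlib.util


def describe_cuda_support():
    """Return a short summary of the devices PyTorch reports."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"
    import torch

    if not torch.cuda.is_available():
        return "PyTorch is installed but cannot use CUDA; the CPU would be used"
    devices = [
        f"cuda:{i} ({torch.cuda.get_device_name(i)})"
        for i in range(torch.cuda.device_count())
    ]
    return "CUDA devices: " + ", ".join(devices)


print(describe_cuda_support())
```

If this reports that CUDA is unavailable on a machine with an Nvidia GPU, the likely cause is a CPU-only PyTorch build or a driver mismatch rather than style_transfer itself.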

Colab

You can try style_transfer without installing it locally by using the official Colab.

Basic usage

style_transfer CONTENT_IMAGE STYLE_IMAGE [STYLE_IMAGE ...] [-o OUTPUT_IMAGE]

Input images will be converted to sRGB when loaded, and output images have the sRGB colorspace. If the output image is a TIFF file, it will be written with 16 bits per channel. Alpha channels in the inputs will be ignored.
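For scripted use, the basic invocation can be assembled and run from Python. This is a minimal sketch that assumes the style_transfer CLI is installed and on the PATH; the helper names and the image file names are hypothetical.

```python
import shlex
import subprocess


def build_command(content, styles, output=None):
    """Build the argument list for the basic style_transfer invocation."""
    if not styles:
        raise ValueError("at least one style image is required")
    cmd = ["style_transfer", str(content)] + [str(s) for s in styles]
    if output is not None:
        cmd += ["-o", str(output)]
    return cmd


def run_style_transfer(content, styles, output=None):
    """Run style_transfer and raise CalledProcessError if it fails."""
    return subprocess.run(build_command(content, styles, output), check=True)


# Hypothetical file names; a .tiff output is written with 16 bits per channel.
cmd = build_command("content.jpg", ["style_a.jpg", "style_b.jpg"], output="out.tiff")
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.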

style_transfer has many optional arguments: run it with the --help argument to see a full list. Particularly notable ones include:

  • --web enables a simple web interface while the program is running that allows you to watch its progress. It runs on port 8080 by default, but you can change it with --port. If you just want to view the current image and refresh it manually, you can go to /image.

  • --devices manually sets the PyTorch device names. It can be set to cpu to force it to run on the CPU on a machine with a supported GPU, or to e.g. cuda:1 (zero indexed) to select the second CUDA GPU. Two GPUs can be specified, for instance --devices cuda:0 cuda:1. If --devices is omitted, style_transfer automatically uses the first visible CUDA GPU, falling back to the CPU.

  • -s (--end-scale) sets the maximum image dimension (height and width) of the output. A large image (e.g. 2896x2172) can take around fifteen minutes to generate on an RTX 3090 and will require nearly all of its 24GB of memory. Since both memory usage and runtime increase linearly in the number of pixels (quadratically in the value of the --end-scale parameter), users with less GPU memory or who do not want to wait very long are encouraged to use smaller resolutions. The default is 512.

  • -sw (--style-weights) specifies factors for the weighted average of multiple styles if there is more than one style image specified. These factors are automatically normalized to sum to 1. If omitted, the styles will be blended equally.

  • -cw (--content-weight) sets the degree to which features from the content image are included in the output image. The default is 0.015.

  • -tw (--tv-weight) sets the strength of the smoothness prior. The default is 2.
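Two of the options above involve simple arithmetic: style weights are normalized to sum to 1, and cost grows quadratically with --end-scale. The sketch below illustrates both under stated assumptions. It assumes normalization means dividing each weight by the total, and it extrapolates from the RTX 3090 figure quoted above (about fifteen minutes at 2896x2172) while ignoring fixed overheads and differences in aspect ratio. The function names are hypothetical.

```python
REFERENCE_SCALE = 2896    # longest side of the example image quoted above
REFERENCE_MINUTES = 15.0  # approximate RTX 3090 runtime quoted above


def normalize_style_weights(weights):
    """Scale style weights so they sum to 1 (assumed: divide by the total)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("style weights must have a positive sum")
    return [w / total for w in weights]


def relative_cost(end_scale, reference_scale=REFERENCE_SCALE):
    """Memory and runtime relative to the reference size. Both are linear
    in the pixel count, hence quadratic in the end scale."""
    return (end_scale / reference_scale) ** 2


print(normalize_style_weights([1, 3]))          # [0.25, 0.75]
print(relative_cost(1448))                      # 0.25
print(REFERENCE_MINUTES * relative_cost(1448))  # 3.75
```

Halving --end-scale therefore cuts the estimated memory use and runtime to roughly a quarter, which is why smaller resolutions are recommended on GPUs with less memory.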

References

  1. L. Gatys, A. Ecker, M. Bethge (2015), "A Neural Algorithm of Artistic Style"

  2. L. Gatys, A. Ecker, M. Bethge, A. Hertzmann, E. Shechtman (2016), "Controlling Perceptual Factors in Neural Style Transfer"

  3. J. Johnson, A. Alahi, L. Fei-Fei (2016), "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"

  4. A. Mahendran, A. Vedaldi (2014), "Understanding Deep Image Representations by Inverting Them"

  5. D. Kingma, J. Ba (2014), "Adam: A Method for Stochastic Optimization"

  6. K. Simonyan, A. Zisserman (2014), "Very Deep Convolutional Networks for Large-Scale Image Recognition"
