  • Language
    Rust
  • License
    Apache License 2.0
  • Created about 2 years ago
  • Updated 9 months ago

Repository Details

An implementation of the diffusers api in Rust

diffusers-rs: A Diffusers API in Rust/Torch

A rusty robot holding a fire torch, generated by stable diffusion using Rust and libtorch.

The diffusers crate is a Rust equivalent to Hugging Face's diffusers Python library. It is based on the tch crate. The implementation supports running Stable Diffusion v1.5 and v2.1.

Getting the weights

The weight files can be retrieved from the HuggingFace model repos and should be placed in the data/ directory.

  • For Stable Diffusion v2.1, get the bpe_simple_vocab_16e6.txt, clip_v2.1.safetensors, unet_v2.1.safetensors, and vae_v2.1.safetensors files from the v2.1 repo.
  • For Stable Diffusion v1.5, get the bpe_simple_vocab_16e6.txt, pytorch_model.safetensors, unet.safetensors, and vae.safetensors files from this v1.5 repo.
  • Alternatively, you can run the following Python script:
# Add --sd_version 1.5 to get the v1.5 weights rather than the v2.1.
python3 ./scripts/get_weights.py
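
Before launching an example, it can help to confirm that the files listed above are actually in data/. The following is a minimal sketch using only the Rust standard library. The function names are hypothetical and are not part of the diffusers crate; the file names and version strings are the ones given in this README.

```rust
use std::path::{Path, PathBuf};

// File names as listed in this README for each supported version.
const V2_1_FILES: &[&str] = &[
    "bpe_simple_vocab_16e6.txt",
    "clip_v2.1.safetensors",
    "unet_v2.1.safetensors",
    "vae_v2.1.safetensors",
];
const V1_5_FILES: &[&str] = &[
    "bpe_simple_vocab_16e6.txt",
    "pytorch_model.safetensors",
    "unet.safetensors",
    "vae.safetensors",
];

/// Hypothetical helper: maps a version string ("1.5" or "2.1") to the
/// expected file names, or `None` for an unknown version.
fn expected_weight_files(sd_version: &str) -> Option<&'static [&'static str]> {
    match sd_version {
        "2.1" => Some(V2_1_FILES),
        "1.5" => Some(V1_5_FILES),
        _ => None,
    }
}

/// Hypothetical helper: returns the expected files that are not present
/// in `data_dir`, or `None` if the version is not recognised.
fn missing_weight_files(data_dir: &Path, sd_version: &str) -> Option<Vec<PathBuf>> {
    let names = expected_weight_files(sd_version)?;
    Some(
        names
            .iter()
            .map(|name| data_dir.join(name))
            .filter(|path| !path.is_file())
            .collect(),
    )
}
```

Calling `missing_weight_files(Path::new("data"), "2.1")` returns an empty list once all four v2.1 files are in place.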

Running an example

cargo run --example stable-diffusion --features clap -- --prompt "A rusty robot holding a fire torch."

The final image is named sd_final.png by default. The default scheduler is the Denoising Diffusion Implicit Model scheduler (DDIM). The original paper and some code can be found in the associated repo.
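
To give a feel for what a DDIM scheduler does at each denoising step, here is a minimal scalar sketch of the deterministic update (eta = 0) from the DDIM paper. It is an illustration only, not the crate's implementation, which operates on tensors. The function name and parameter names are hypothetical: `alpha_bar_t` and `alpha_bar_prev` stand for the cumulative noise-schedule products at the current and previous timestep, and `eps_pred` for the noise predicted by the UNet.

```rust
/// One deterministic DDIM update (eta = 0) applied to a single scalar.
/// All names here are illustrative and not taken from the diffusers crate.
fn ddim_step(x_t: f64, eps_pred: f64, alpha_bar_t: f64, alpha_bar_prev: f64) -> f64 {
    // Estimate the clean sample x_0 implied by the current noisy sample
    // and the predicted noise.
    let x0_pred = (x_t - (1.0 - alpha_bar_t).sqrt() * eps_pred) / alpha_bar_t.sqrt();
    // Re-noise that estimate to the (lower) noise level of the previous timestep,
    // reusing the same predicted noise direction.
    alpha_bar_prev.sqrt() * x0_pred + (1.0 - alpha_bar_prev).sqrt() * eps_pred
}
```

If the predicted noise equals the noise that was actually mixed in, the step lands exactly on the sample that the same clean value and noise would produce at the previous noise level, which is why no fresh randomness is needed in this variant.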

This generates some images of rusty robots holding some torches!

Image to Image Pipeline

The stable diffusion model can also be used to generate an image based on another image. The following command runs this image to image pipeline:

cargo run --example stable-diffusion-img2img --features clap -- --input-image media/in_img2img.jpg

The default prompt is "A fantasy landscape, trending on artstation.", but it can be changed via the --prompt flag.

img2img input img2img output

Inpainting Pipeline

Inpainting modifies the part of an existing image selected by a mask, guided by a prompt. This requires different UNet weights, unet-inpaint.safetensors, which can also be retrieved from this repo and should also be placed in the data/ directory.

The following commands download a sample image and mask, then run the inpainting pipeline:

wget https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo.png -O sd_input.png
wget https://raw.githubusercontent.com/CompVis/latent-diffusion/main/data/inpainting_examples/overture-creations-5sI6fQgYIuo_mask.png -O sd_mask.png
cargo run --example stable-diffusion-inpaint --features clap -- --input-image sd_input.png --mask-image sd_mask.png

The default prompt is "Face of a yellow cat, high resolution, sitting on a park bench.", but it can be changed via the --prompt flag.

inpaint output

ControlNet Pipeline

The ControlNet architecture can be used to control how Stable Diffusion generates images. It is to be used with the Stable Diffusion v1.5 weights (see above for how to get them). Additional weights have to be retrieved from this HuggingFace repo and copied to data/controlnet.safetensors.

The ControlNet pipeline takes a sample image as input. In the default mode, it performs edge detection on this image using the Canny edge detector and uses the resulting edge image as a guide.

cargo run --example controlnet --features clap,image,imageproc -- \
  --prompt "a rusty robot, lit by a fire torch, hd, very detailed" \
  --input-image media/vermeer.jpg

The media/vermeer.jpg image is the well-known painting on the left-hand side; edge detection turns it into the image on the right-hand side.

Using only the edge detection image, the ControlNet model generates the following samples.

FAQ

Memory Issues

This requires a GPU with more than 8GB of memory. As a fallback, the CPU version can be used, but it is slower.

cargo run --example stable-diffusion --features clap -- --prompt "A very rusty robot holding a fire torch." --cpu all

For a GPU with 8GB, one can use the fp16 weights for the UNet and put only the UNet on the GPU.

PYTORCH_CUDA_ALLOC_CONF=garbage_collection_threshold:0.6,max_split_size_mb:128 RUST_BACKTRACE=1 CARGO_TARGET_DIR=target2 cargo run \
    --example stable-diffusion --features clap -- --cpu vae --cpu clip \
    --unet-weights data/unet-fp16.safetensors
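
The reason fp16 weights help is simple arithmetic: each parameter takes 2 bytes instead of 4, so the weights alone need half the memory. The sketch below makes that explicit. The helper names are hypothetical, and any parameter count passed in is an illustrative input rather than a measured figure for this crate; activations and allocator overhead are not counted.

```rust
/// Memory needed to hold `num_params` weights at `bytes_per_param` bytes each
/// (4 for fp32, 2 for fp16). Hypothetical helper for back-of-envelope estimates.
fn weight_bytes(num_params: u64, bytes_per_param: u64) -> u64 {
    num_params * bytes_per_param
}

/// Converts a byte count to GiB (1 GiB = 2^30 bytes).
fn to_gib(bytes: u64) -> f64 {
    bytes as f64 / (1u64 << 30) as f64
}
```

For example, `to_gib(weight_bytes(n, 4))` and `to_gib(weight_bytes(n, 2))` give the fp32 and fp16 weight footprints for a model with `n` parameters; the second is always half the first.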
