  • Stars: 496
  • Rank: 88,807 (top 2%)
  • Language: Swift
  • License: MIT License
  • Created: about 2 years ago
  • Updated: almost 2 years ago

Repository Details

Native Stable Diffusion inference on iOS / macOS using MPSGraph

Native Diffusion Swift Package

Join us on Discord

Native Diffusion runs Stable Diffusion models locally on macOS / iOS devices, in Swift, using the MPSGraph framework (not Python).

This is the Swift Package Manager wrapper of Maple Diffusion. It adds image-to-image support, Swift Package Manager packaging, and convenient ways to use the code, such as Combine publishers and async/await APIs. It also supports downloading weights from any local or remote URL, including the app bundle itself.

Would not be possible without

  • @madebyollin, who wrote the Metal Performance Shaders Graph pipeline
  • @GuiyeC, who wrote the image-to-image implementation

Features

Get started in 10 minutes

  • Extremely simple API. Generate an image in one line of code.

Make it do what you want

  • Flexible API. Pass in prompt, guidance scale, steps, seed, and an image.
  • One-off conversion script from .ckpt to Native Diffusion's own memory-optimized format
  • Supports Dreambooth models.

Built to be fun to code with

  • Supports async/await, Combine publishers, and classic callbacks (see the sketch after this list).
  • Optimized for SwiftUI, but can be used in any kind of project, including command line, UIKit, or AppKit.
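
For illustration, here is a minimal sketch of the async/await and Combine styles side by side (the callback form is not shown). The module import name is an assumption; the calls themselves appear in the Usage section below.

import Foundation
import Combine
// import NativeDiffusion // module name assumed; use the package's actual product name

func sketch(modelZip: URL) async throws {
    // async/await: one call that loads the weights (downloading them if needed) and returns the image
    let image = try? await Diffusion.generate(localOrRemote: modelZip, prompt: "cat astronaut")
    print(image as Any)

    // Combine: prepare the models once, then subscribe to intermediate results while generating.
    // The publisher's output carries the latest image and progress, as in the SwiftUI example below.
    let sd = Diffusion()
    try await sd.prepModels(remoteURL: modelZip)
    let cancellable = sd.generate(prompt: "cat astronaut")
        .sink { r in
            print(r.progress)
        }
    _ = cancellable
}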

Built for end-user speed and great user experience

  • 100% native. No Python, no environments; your users don't need to install anything first.
  • Model download built in. Point it to a web address with the model files in a zip archive. The package will download and install the model for later use.
  • As fast or faster than a server in the cloud on newer Macs

Commercial use allowed

  • MIT Licensed (code). We'd love attribution, but it's not needed legally.
  • Generated images are licensed under the CreativeML Open RAIL-M license, meaning you can use the images for virtually anything, including commercial use.

Usage

One-line diffusion

In its simplest form, generating an image takes a single line:

let image = try? await Diffusion.generate(localOrRemote: modelUrl, prompt: "cat astronaut")

You can give it a local or remote URL or both. If remote, the downloaded weights are saved for later.

The single-line version currently supports only a limited set of parameters.

See examples/SingleLineDiffusion for a working example.

As an observable object

Let's add some UI. Here's an entire working image generator app in a single SwiftUI view:

GIF demo

import SwiftUI
// plus an import of the Native Diffusion package module

struct ContentView: View {
    
    // 1
    @StateObject var sd = Diffusion()
    @State var prompt = ""
    @State var image : CGImage?
    @State var imagePublisher = Diffusion.placeholderPublisher
    @State var progress : Double = 0
    
    var anyProgress : Double { sd.loadingProgress < 1 ? sd.loadingProgress : progress }

    var body: some View {
        VStack {
            
            DiffusionImage(image: $image, progress: $progress)
            Spacer()
            TextField("Prompt", text: $prompt)
            // 3
                .onSubmit { self.imagePublisher = sd.generate(prompt: prompt) }
                .disabled(!sd.isModelReady)
            ProgressView(value: anyProgress)
                .opacity(anyProgress == 1 || anyProgress == 0 ? 0 : 1)
        }
        .task {
            // 2
            let path = URL(string: "http://localhost:8080/Diffusion.zip")!
            try! await sd.prepModels(remoteURL: path)
        }
        
        // 4
        .onReceive(imagePublisher) { r in
            self.image = r.image
            self.progress = r.progress
        }
        .frame(minWidth: 200, minHeight: 200)
    }
}

Here's what it does

  1. Instantiate a Diffusion object
  2. Prepare the models, download if needed
  3. Submit a prompt for generation
  4. Receive updates during generation

See examples/SimpleDiffusion for a working example.

DiffusionImage

An optional SwiftUI view that is specialized for diffusion:

  • Receives drag and drop of an image from e.g. Finder and sends it back to you via a binding (macOS)
  • Automatically resizes the image to 512x512 (macOS)
  • Lets users drag the image to Finder or other apps (macOS)
  • Blurs the intermediate image while generating (macOS and iOS)
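
A minimal standalone sketch, using the same initializer as in the SwiftUI example above (the bindings come from wherever you drive generation):

import SwiftUI
import CoreGraphics
// import NativeDiffusion // module name assumed

struct GeneratedImageView: View {
    // Bind the latest image and progress from your generation pipeline into the view
    @State private var image: CGImage?
    @State private var progress: Double = 0

    var body: some View {
        DiffusionImage(image: $image, progress: $progress)
    }
}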

Install

Add https://github.com/mortenjust/native-diffusion in the "Swift Package Manager" tab in Xcode

Preparing the weights

Native Diffusion splits the weights into a binary format that is different from the typical CKPT format. It uses many small files which it then (optionally) swaps in and out of memory, enabling it to run on both macOS and iOS. You can use the converter script in the package to convert your own CKPT file.

Option 1: Pre-converted Standard Stable Diffusion v1.5

By downloading this zip file, you accept the CreativeML Open RAIL-M license from Stability AI.

Download ZIP. Please don't use this URL in your software.

We'll get back to what to do with it in a second.

Option 2: Preparing your own ckpt file

If you want to use your own CKPT file (such as a Dreambooth fine-tune), you can convert it into the Maple Diffusion format that Native Diffusion uses:
  1. Download a Stable Diffusion model checkpoint (sd-v1-5.ckpt, or a derivative of it) to a folder, e.g. ~/Downloads/sd.

  2. Set up and install Python with PyTorch, if you haven't already.

# Grab the converter script
cd ~/Downloads/sd
curl https://raw.githubusercontent.com/mortenjust/maple-diffusion/main/Converter%20Script/maple-convert.py > maple-convert.py

# may need to install conda first https://github.com/conda-forge/miniforge#homebrew
conda deactivate
conda remove -n native-diffusion --all
conda create -n native-diffusion python=3.10
conda activate native-diffusion
pip install torch typing_extensions numpy Pillow requests pytorch_lightning
python maple-convert.py ~/Downloads/sd/sd-v1-5.ckpt

The script will create a new folder called bins containing the converted weights. The sketch below shows how to hand them to the package.
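
Whichever option you chose, zip the weights and point the package at the archive. A minimal sketch, assuming you host or bundle the zip yourself; the URL is a placeholder, and the call itself appears in the Usage section above:

import Foundation
// import NativeDiffusion // module name assumed

func installWeights() async throws {
    // Zip the converted weights first (e.g. `zip -r Diffusion.zip bins`) and
    // serve or bundle the archive; the localhost URL below is a placeholder.
    let modelZip = URL(string: "http://localhost:8080/Diffusion.zip")!

    // Downloads and installs the weights for later use. Alternatively, pass the
    // same URL straight to Diffusion.generate(localOrRemote:prompt:) as shown in Usage.
    let sd = Diffusion()
    try await sd.prepModels(remoteURL: modelZip)
}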

FAQ

Can I use a Dreambooth model?

Yes. Just copy the alpha* files from the standard conversion. This repo will include these files in the future. See this issue.

Does it support image to image prompting?

Yes. Simply pass in an initImage to your SampleInput when generating.
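
For illustration, here is a hedged sketch of what that might look like. Only SampleInput and initImage are named in this README; the other parameter names and the generate call taking a SampleInput are assumptions based on the "Flexible API" feature list above.

import CoreGraphics
// import NativeDiffusion // module name assumed

// Image-to-image: pass a starting image along with the prompt.
// Parameter names other than `initImage` are illustrative assumptions.
func generateVariation(of startImage: CGImage, with sd: Diffusion) async -> CGImage? {
    let input = SampleInput(prompt: "cat astronaut, oil painting",
                            initImage: startImage,
                            steps: 30,
                            guidanceScale: 7.5,
                            seed: 42)
    return try? await sd.generate(input: input)
}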

It crashes

You may need to regenerate the model files with the Python script in the repo. This can happen if you converted your ckpt file before image-to-image support was added.

Can I contribute? What's next?

Yes! A rough roadmap:

  • Stable Diffusion 2.0: larger output images, upscaling, depth-to-image
  • Add in-painting and out-painting
  • Generate other sizes and aspects than 512x512
  • Upscaling
  • Dreambooth training on-device
  • Tighten up code quality overall. Most is proof of concept.
  • Add image-to-image (done; see the FAQ above)

See Issues for smaller contributions.

If you're making changes to the MPSGraph part of the codebase, consider making your contributions to the single-file repo and then integrating the changes into the wrapped file in this repo.

How fast is it?

On my MacBook Pro M1 Max, I get ~0.3 s/step, which is significantly faster than any Python/PyTorch/TensorFlow installation I've tried.

On an iPhone it should take a minute or two.

To attain usable performance without tripping over iOS's 4GB memory limit, Native Diffusion relies internally on FP16 (NHWC) tensors, operator fusion from MPSGraph, and a truly pitiable degree of swapping models to device storage.

Does it support Stable Diffusion 2.0?

Not yet. Would love some help on this. See above.

I have a question, comment or suggestion

Feel free to post an issue!
