  • Stars: 973
  • Rank 47,051 (Top 1.0 %)
  • Language: C++
  • License: BSD 3-Clause "New...
  • Created about 2 years ago
  • Updated over 1 year ago

Repository Details

Stable Diffusion in NCNN with C++, supporting txt2img and img2img

Stable Diffusion-NCNN

Stable Diffusion implemented with the ncnn framework in C++, supporting txt2img and img2img!

Zhihu: https://zhuanlan.zhihu.com/p/582552276

Video: https://www.bilibili.com/video/BV15g411x7Hc

txt2img Performance (time per iteration and RAM)

per-it   i7-12700 (512x512)     i7-12700 (256x256)    Snapdragon865 (256x256)
slow     4.85s/5.24G(7.07G)     1.05s/3.58G(4.02G)    1.6s/2.2G(2.6G)
fast     2.85s/9.47G(11.29G)    0.65s/5.76G(6.20G)    -

News

2023-03-11: added img2img for Android and released a new APK

2023-03-10: added img2img for x86

2023-01-19: faster and lower RAM usage on x86; dynamic shape on x86

2023-01-12: updated to the latest ncnn code and switched to optimized models; updated Android; added a memory monitor

2023-01-05: added a 256x256 model to the x86 project

2023-01-04: merged and finished the MHA op on x86; enabled fast GELU

Out of the box

All models and the exe file can be downloaded from Baidu Netdisk (百度网盘), Google Drive, or the Releases page.

If you only need the ncnn models, you can search for them under Hardware Model Library, Device-Specific Models (硬件模型库-设备专用模型); that is faster and free.

x86 Windows

  1. enter the exe folder
  2. download the 4 bin files AutoencoderKL-fp16.bin, FrozenCLIPEmbedder-fp16.bin, UNetModel-MHA-fp16.bin and AutoencoderKL-encoder-512-512-fp16.bin, and put them in the assets folder
  3. set up your config in magic.txt; the lines are, in order:
    1. height (must be a multiple of 128, minimum is 256)
    2. width (must be a multiple of 128, minimum is 256)
    3. speed mode (0 is slow but uses less RAM, 1 is fast but uses more RAM)
    4. step number (15 is a reasonable choice)
    5. seed number (set 0 for a random seed)
    6. init image (if the file exists, img2img runs; otherwise txt2img runs)
    7. positive prompt (describe what you want)
    8. negative prompt (describe what you don't want)
  4. run stable-diffusion.exe
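
As an illustration of the magic.txt layout described above, here is a small Python sketch that validates the settings and produces the eight lines in order. The helper name build_magic_txt, the example prompts and the file name init.png are assumptions made for this example; only the line order and the size rules come from the list above, and the project itself may parse the file more or less strictly.

```python
def build_magic_txt(height, width, speed_mode, steps, seed,
                    init_image, positive, negative):
    """Return the text of a magic.txt file, one setting per line.

    Hypothetical helper: the line order and size rules follow the list
    above; nothing here is taken from the project's source code.
    """
    for name, value in (("height", height), ("width", width)):
        # Sizes must be a multiple of 128 and at least 256.
        if value < 256 or value % 128 != 0:
            raise ValueError(
                f"{name} must be a multiple of 128 and >= 256, got {value}")
    if speed_mode not in (0, 1):
        raise ValueError("speed mode must be 0 (slow) or 1 (fast)")
    if steps < 1:
        raise ValueError("step number must be positive")
    lines = [height, width, speed_mode, steps, seed,
             init_image, positive, negative]
    return "\n".join(str(item) for item in lines) + "\n"


if __name__ == "__main__":
    # Example values only; init.png is a placeholder file name.
    print(build_magic_txt(256, 256, 0, 15, 0, "init.png",
                          "a cat sitting on a sofa",
                          "blurry, low quality"), end="")
```

Saving the printed text as magic.txt next to stable-diffusion.exe should then match the layout in step 3.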

Android APK

  1. download and install the APK from the link
  2. at the top, the first field is the step number and the second is the seed
  3. at the bottom, the upper field is the positive prompt and the lower field is the negative prompt (leave them empty to use the default prompts)
  4. note: the APK needs 7G of RAM, runs very slowly, and consumes a lot of power

Implementation Details

Note: Please comply with the requirements of the SD model and do not use it for illegal purposes

  1. Main steps of Stable Diffusion (three for txt2img; img2img adds an encoding step):
    1. CLIP: text embedding
    2. (img2img only) encode the init image into the initial latent
    3. iterative sampling with the sampler
    4. decode the sampler result to obtain the output image
  2. Model details:
    1. Weights: Naifu (you know where to find them)
    2. Sampler: Euler ancestral (k-diffusion version)
    3. Resolution: dynamic shape, but must be a multiple of 128, minimum is 256
    4. Denoiser: CFGDenoiser, CompVisDenoiser
    5. Prompt: positive & negative, both supported :)
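
To make the sampler and denoiser entries above more concrete, here is a minimal pure-Python sketch of one Euler ancestral step combined with classifier-free guidance. It follows the formulas of k-diffusion's Euler ancestral sampler as commonly described; it is not the project's C++ code. The function names, the flat-list latent and the injectable noise source are assumptions made for illustration.

```python
import math
import random


def cfg_combine(uncond, cond, cond_scale):
    """Classifier-free guidance: move from the negative-prompt prediction
    toward the positive-prompt prediction by a factor of cond_scale."""
    return [u + (c - u) * cond_scale for u, c in zip(uncond, cond)]


def ancestral_sigmas(sigma_from, sigma_to, eta=1.0):
    """Split a step into a deterministic part (sigma_down) and the amount
    of fresh noise to add back (sigma_up)."""
    if eta == 0.0:
        return sigma_to, 0.0
    sigma_up = min(sigma_to, eta * math.sqrt(
        sigma_to ** 2 * (sigma_from ** 2 - sigma_to ** 2) / sigma_from ** 2))
    sigma_down = math.sqrt(sigma_to ** 2 - sigma_up ** 2)
    return sigma_down, sigma_up


def euler_ancestral_step(denoise, x, sigma, sigma_next, noise=random.gauss):
    """One Euler ancestral update of the latent x (a flat list of floats).

    denoise(x, sigma) must return the predicted clean latent.
    noise(mu, std) returns one random sample; injectable for testing.
    """
    denoised = denoise(x, sigma)
    sigma_down, sigma_up = ancestral_sigmas(sigma, sigma_next)
    dt = sigma_down - sigma
    out = []
    for xi, di in zip(x, denoised):
        d = (xi - di) / sigma      # derivative estimate at this noise level
        xi = xi + d * dt           # Euler step down to sigma_down
        if sigma_next > 0:
            xi = xi + noise(0.0, 1.0) * sigma_up   # re-inject fresh noise
        out.append(xi)
    return out
```

In a full sampler, denoise would run the UNet twice (negative and positive prompt), merge the two predictions with cfg_combine, and the step would be repeated over a decreasing list of sigma values.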

Code Details

Compile for x86 Windows

  1. download the 4 bin files AutoencoderKL-fp16.bin, FrozenCLIPEmbedder-fp16.bin, UNetModel-MHA-fp16.bin and AutoencoderKL-encoder-512-512-fp16.bin, and put them in the assets folder
  2. open the VS2019 project and build the Release x64 configuration

Compile for x86 Linux / macOS

  1. build and install ncnn
  2. build the demo with CMake
     cd x86/linux
     mkdir -p build && cd build
     cmake ..
     make -j$(nproc)
  3. download the 3 bin files AutoencoderKL-fp16.bin, FrozenCLIPEmbedder-fp16.bin and UNetModel-MHA-fp16.bin, and put them in the build/assets folder
  4. run the demo
     ./stable-diffusion-ncnn

Compile for Android

  1. download the three bin files AutoencoderKL-fp16.bin, FrozenCLIPEmbedder-fp16.bin and UNetModel-MHA-fp16.bin, and put them in the assets folder
  2. open the project in Android Studio and run it
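
Every build above depends on the model files being in the right assets folder, so a quick check before running can save a failed start. The sketch below is a hypothetical helper, not part of the project; the file names are copied from the steps above, and the assets path is whatever folder your build uses.

```python
from pathlib import Path

# File names as listed in the build steps above.
BASE_MODELS = (
    "AutoencoderKL-fp16.bin",
    "FrozenCLIPEmbedder-fp16.bin",
    "UNetModel-MHA-fp16.bin",
)
# Listed only in the x86 Windows steps.
WINDOWS_EXTRA = ("AutoencoderKL-encoder-512-512-fp16.bin",)


def missing_models(assets_dir, windows=False):
    """Return the expected model files that are absent from assets_dir."""
    names = BASE_MODELS + (WINDOWS_EXTRA if windows else ())
    root = Path(assets_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_models("assets")
    print("missing:", ", ".join(missing) if missing else "none")
```

For the Linux / macOS build, pass "build/assets" as the folder; for Windows, pass windows=True to include the fourth file.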

ONNX Model

I've uploaded the three ONNX models used by Stable Diffusion, so that you can do some interesting work with them.

You can find them at the link above.

Statements

  1. Please abide by the license agreement of the Stable Diffusion model, and DO NOT use it for illegal purposes!
  2. If you use these ONNX models in open source projects, please let me know; I'll follow along and look forward to your next great work :)

Instructions

  1. FrozenCLIPEmbedder
     ncnn (input & output): token, multiplier, cond, conds
     onnx (input & output): onnx::Reshape_0, 2271

     z = onnx(onnx::Reshape_0=token)
     origin_mean = z.mean()
     z *= multiplier
     new_mean = z.mean()
     z *= origin_mean / new_mean
     conds = torch.concat([cond,z], dim=-2)
  2. UNetModel
     ncnn (input & output): in0, in1, in2, c_in, c_out, outout
     onnx (input & output): x, t, cc, out

     outout = in0 + onnx(x=in0 * c_in, t=in1, cc=in2) * c_out
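
The two relations above are written as pseudocode around an onnx(...) call. The following pure-Python sketch restates them with plain lists so that the arithmetic can be checked without a model. It is an illustration under assumptions: multiplier is taken to hold one weight per token, the model is passed in as an ordinary callable, and compvis_scalings assumes the scalings follow k-diffusion's CompVisDenoiser with sigma_data = 1. None of these names come from the project.

```python
import math


def reweight_embeddings(z, multiplier):
    """Scale each token embedding by its weight, then rescale everything
    so the overall mean matches the mean before weighting.

    z: list of token rows, each a list of floats (the ONNX model output).
    multiplier: one weight per token (assumed shape).
    """
    count = sum(len(row) for row in z)
    origin_mean = sum(sum(row) for row in z) / count
    scaled = [[v * m for v in row] for row, m in zip(z, multiplier)]
    new_mean = sum(sum(row) for row in scaled) / count
    factor = origin_mean / new_mean
    return [[v * factor for v in row] for row in scaled]


def concat_tokens(cond, z):
    """Equivalent of torch.concat([cond, z], dim=-2) for 2-D row lists."""
    return cond + z


def compvis_scalings(sigma):
    """Return (c_in, c_out). Assumed to follow k-diffusion's
    CompVisDenoiser with sigma_data = 1; not read from this project."""
    return 1.0 / math.sqrt(sigma ** 2 + 1.0), -sigma


def unet_wrapper(unet, in0, in1, in2, c_in, c_out):
    """outout = in0 + unet(x=in0 * c_in, t=in1, cc=in2) * c_out"""
    eps = unet([v * c_in for v in in0], in1, in2)
    return [v + e * c_out for v, e in zip(in0, eps)]
```

With these scalings the wrapper turns the UNet's noise prediction into a denoised latent, which is the quantity a k-diffusion style sampler consumes.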

References

  1. ncnn
  2. opencv-mobile
  3. stable-diffusion
  4. k-diffusion
  5. stable-diffusion-webui
  6. diffusers
  7. diffusers-ncnn
