  • Stars: 330
  • Rank: 127,046 (Top 3%)
  • Language: Python
  • License: Apache License 2.0
  • Created over 1 year ago
  • Updated about 1 month ago

Repository Details

BentoDiffusion: A collection of diffusion models served with BentoML

🖼️ OneDiffusion

OneDiffusion is an open-source one-stop shop for deploying diffusion models in production. It is built specifically for diffusion models and supports both pretrained models and models fine-tuned with LoRA adapters.

Key features include:

  • 🌐 Broad compatibility: Supports both pretrained and LoRA-adapted diffusion models, providing flexibility in choosing and deploying the appropriate model for various image generation tasks.
  • 💪 Optimized performance and scalability: Automatically selects the best optimizations, such as half-precision weights or xFormers, to achieve the best inference speed out of the box.
  • ⌛️ Dynamic LoRA adapter loading: Dynamically load and unload LoRA adapters on every request, providing greater adaptability and ensuring the models remain responsive to changing inputs and conditions.
  • 🍱 First-class support for BentoML: Seamless integration with the BentoML ecosystem, allowing you to build Bentos and push them to BentoCloud.

OneDiffusion is designed for AI application developers who require a robust and flexible platform for deploying diffusion models in production. The platform offers tools and features to fine-tune, serve, deploy, and monitor these models effectively, streamlining the end-to-end workflow for diffusion model deployment.

Supported models

Currently, OneDiffusion supports the following models:

  • Stable Diffusion v1.4, v1.5 and v2.0
  • Stable Diffusion XL v1.0
  • Stable Diffusion XL Turbo

More models (for example, ControlNet and DeepFloyd IF) will be added soon.

Note

If you want to deploy Stable Video Diffusion, see the project BentoSVD.

Get started

To quickly get started with OneDiffusion, follow the instructions below or try this tutorial in Google Colab: Serving Stable Diffusion with OneDiffusion.

Prerequisites

You have installed Python 3.8 (or later) and pip.

Install OneDiffusion

Install OneDiffusion by using pip as follows:

pip install onediffusion

To verify the installation, run:

$ onediffusion -h

Usage: onediffusion [OPTIONS] COMMAND [ARGS]...

       ██████╗ ███╗   ██╗███████╗██████╗ ██╗███████╗███████╗██╗   ██╗███████╗██╗ ██████╗ ███╗   ██╗
      ██╔═══██╗████╗  ██║██╔════╝██╔══██╗██║██╔════╝██╔════╝██║   ██║██╔════╝██║██╔═══██╗████╗  ██║
      ██║   ██║██╔██╗ ██║█████╗  ██║  ██║██║█████╗  █████╗  ██║   ██║███████╗██║██║   ██║██╔██╗ ██║
      ██║   ██║██║╚██╗██║██╔══╝  ██║  ██║██║██╔══╝  ██╔══╝  ██║   ██║╚════██║██║██║   ██║██║╚██╗██║
      ╚██████╔╝██║ ╚████║███████╗██████╔╝██║██║     ██║     ╚██████╔╝███████║██║╚██████╔╝██║ ╚████║
       ╚═════╝ ╚═╝  ╚═══╝╚══════╝╚═════╝ ╚═╝╚═╝     ╚═╝      ╚═════╝ ╚══════╝╚═╝ ╚═════╝ ╚═╝  ╚═══╝
          
          An open platform for operating diffusion models in production.
          Fine-tune, serve, deploy, and monitor any diffusion models with ease.
          

Options:
  -v, --version  Show the version and exit.
  -h, --help     Show this message and exit.

Commands:
  build     Package a given model into a Bento.
  download  Setup diffusion models interactively.
  start     Start any diffusion models as a REST server.

Start a diffusion server

OneDiffusion allows you to quickly spin up a server for any supported diffusion model. To start a server, run:

onediffusion start stable-diffusion

This starts a server at http://0.0.0.0:3000/. You can interact with it by visiting the web UI or by sending a request via curl:

curl -X 'POST' \
  'http://0.0.0.0:3000/text2img' \
  -H 'accept: image/jpeg' \
  -H 'Content-Type: application/json' \
  --output output.jpg \
  -d '{
  "prompt": "a bento box",
  "negative_prompt": null,
  "height": 768,
  "width": 768,
  "num_inference_steps": 50,
  "guidance_scale": 7.5,
  "eta": 0
}'
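The same request can also be assembled in Python. The following is a minimal sketch using only the standard library, assuming the server started above is reachable at the URL from the curl example. The helper names build_text2img_request and save_image are illustrative and not part of OneDiffusion.

```python
import json
import urllib.request

# Assumption: a server started with `onediffusion start stable-diffusion`
# is running locally and exposes the /text2img endpoint from the curl example.
SERVER_URL = "http://0.0.0.0:3000/text2img"


def build_text2img_request(prompt, url=SERVER_URL, **overrides):
    """Build (but do not send) a POST request that mirrors the curl example."""
    payload = {
        "prompt": prompt,
        "negative_prompt": None,
        "height": 768,
        "width": 768,
        "num_inference_steps": 50,
        "guidance_scale": 7.5,
        "eta": 0,
    }
    # Keyword arguments replace the default fields, e.g. height=512.
    payload.update(overrides)
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"accept": "image/jpeg", "Content-Type": "application/json"},
        method="POST",
    )


def save_image(request, path="output.jpg", timeout=600):
    """Send the request and write the returned image bytes to `path`."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        with open(path, "wb") as f:
            f.write(response.read())
    return path


request = build_text2img_request("a bento box")
# save_image(request)  # uncomment once the server is running
```

Building the request separately from sending it makes it easy to inspect the JSON body before any image is generated.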

By default, OneDiffusion uses stabilityai/stable-diffusion-2 to start the server. To use a specific model version, add the --model-id option as below:

onediffusion start stable-diffusion --model-id runwayml/stable-diffusion-v1-5

To specify another pipeline, use the --pipeline option as below. The img2img pipeline allows you to modify images based on a given prompt and image.

onediffusion start stable-diffusion --pipeline "img2img"

OneDiffusion downloads the models to the BentoML local Model Store if they have not been registered before. To view your models, install BentoML first with pip install bentoml and then run:

$ bentoml models list

Tag                                                                                         Module                              Size        Creation Time
pt-sd-stabilityai--stable-diffusion-2:1e128c8891e52218b74cde8f26dbfc701cb99d79              bentoml.diffusers                   4.81 GiB    2023-08-16 17:52:33
pt-sdxl-stabilityai--stable-diffusion-xl-base-1.0:bf714989e22c57ddc1c453bf74dab4521acb81d8  bentoml.diffusers                   13.24 GiB   2023-08-16 16:09:01

Start a Stable Diffusion XL server

OneDiffusion also supports running Stable Diffusion XL 1.0, the most advanced development in the Stable Diffusion text-to-image suite of models launched by Stability AI. To start an XL server, simply run:

onediffusion start stable-diffusion-xl

It downloads the model automatically if it does not exist locally. Options such as --model-id are also supported. For more information, run onediffusion start stable-diffusion-xl --help.

Similarly, visit http://0.0.0.0:3000/ or send a request via curl to interact with the XL server. Example prompt:

{
  "prompt": "the scene is a picturesque environment with beautiful flowers and trees. In the center, there is a small cat. The cat is shown with its chin being scratched. It is crouched down peacefully. The cat's eyes are filled with excitement and satisfaction as it uses its small paws to hold onto the food, emitting a content purring sound.",
  "negative_prompt": null,
  "height": 1024,
  "width": 1024,
  "num_inference_steps": 50,
  "guidance_scale": 7.5,
  "eta": 0
}

Example output:

[Image: sdxl-cat]

Start a Stable Diffusion XL Turbo server

SDXL Turbo is a distilled version of SDXL 1.0 and is capable of creating images in a single step, with improved real-time text-to-image output quality and sampling fidelity.

To serve SDXL Turbo locally, run:

onediffusion start stable-diffusion-xl --model-id stabilityai/sdxl-turbo

Visit http://0.0.0.0:3000/ or send a request via curl to interact with the server. Example prompt:

{
  "prompt": "Create a serene landscape at sunset, with a tranquil lake reflecting the vibrant colors of the sky. Surrounding the lake are lush, green forests and distant mountains.",
  "height": 512,
  "width": 512,
  "num_inference_steps": 1,
  "guidance_scale": 0.0
}

Note

SDXL Turbo can run inference in a single step, so setting num_inference_steps to 1 is enough to generate high-quality images. However, increasing the number of steps to 2, 3, or 4 should improve image quality. In addition, set guidance_scale to 0.0 to disable guidance, because the model was trained without it. See the official release notes to learn more.

Example output:

[Image: sdxl-turbo-output]
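The Turbo settings from the note above can be captured in a small helper so that guidance stays disabled in every request. This is only a sketch: the function name sdxl_turbo_payload is hypothetical, and the upper bound of 4 steps is a choice made here based on the note, not a limit enforced by the server.

```python
def sdxl_turbo_payload(prompt, num_inference_steps=1, height=512, width=512):
    """Return a request body that follows the SDXL Turbo recommendations."""
    # The 1-4 range is a convention of this sketch, taken from the note above.
    if not 1 <= num_inference_steps <= 4:
        raise ValueError("this sketch only allows 1 to 4 inference steps")
    return {
        "prompt": prompt,
        "height": height,
        "width": width,
        "num_inference_steps": num_inference_steps,
        # SDXL Turbo was trained without guidance, so it is always disabled.
        "guidance_scale": 0.0,
    }


payload = sdxl_turbo_payload("a serene landscape at sunset", num_inference_steps=2)
```

The resulting dictionary can be serialized to JSON and sent to the running server in the same way as the other example prompts.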

Add LoRA weights

Low-Rank Adaptation (LoRA) is a training method to fine-tune models without the need to retrain all parameters. You can add LoRA weights to your diffusion models for specific data needs.

Add the --lora-weights option as below:

onediffusion start stable-diffusion-xl --lora-weights "/path/to/lora-weights.safetensors"

Alternatively, dynamically load LoRA weights for a single request by adding the lora_weights field to the request body:

{
  "prompt": "the scene is a picturesque environment with beautiful flowers and trees. In the center, there is a small cat. The cat is shown with its chin being scratched. It is crouched down peacefully. The cat's eyes are filled with excitement and satisfaction as it uses its small paws to hold onto the food, emitting a content purring sound.",
  "negative_prompt": null,
  "height": 1024,
  "width": 1024,
  "num_inference_steps": 50,
  "guidance_scale": 7.5,
  "eta": 0,
  "lora_weights": "/path/to/lora-weights.safetensors"
}

By specifying the path of LoRA weights at runtime, you can influence model outputs dynamically. Even with identical prompts, the application of different LoRA weights can yield vastly different results. Example output (oil painting vs. pixel):

[Image: dynamic loading]
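To compare styles as described above, you can reuse one request body and change only the lora_weights field. The sketch below assumes two LoRA files exist on the server. Both file names are hypothetical placeholders, the prompt is a placeholder, and with_lora is an illustrative helper rather than part of OneDiffusion.

```python
# Base request body using the same fields as the SDXL example.
BASE_PAYLOAD = {
    "prompt": "a small cat in a garden with flowers and trees",
    "negative_prompt": None,
    "height": 1024,
    "width": 1024,
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "eta": 0,
}


def with_lora(payload, lora_path):
    """Return a copy of `payload` with the lora_weights field set."""
    body = dict(payload)  # shallow copy so the base payload stays unchanged
    body["lora_weights"] = lora_path
    return body


# Hypothetical file names; replace them with your own LoRA files.
LORA_FILES = [
    "/path/to/oil-painting.safetensors",
    "/path/to/pixel-art.safetensors",
]

bodies = [with_lora(BASE_PAYLOAD, path) for path in LORA_FILES]
```

Each entry in bodies has an identical prompt and differs only in the adapter that the server loads for that request.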

Download a model

If you want to download a diffusion model without starting a server, use the onediffusion download command. For example:

onediffusion download stable-diffusion --model-id "CompVis/stable-diffusion-v1-4"

Create a BentoML Runner

You can create a BentoML Runner with diffusers_simple.stable_diffusion.create_runner(), which automatically downloads the specified model if it does not exist locally.

import bentoml

# Create a Runner for a Stable Diffusion model
runner = bentoml.diffusers_simple.stable_diffusion.create_runner("CompVis/stable-diffusion-v1-4")

# Create a Runner for a Stable Diffusion XL model
runner_xl = bentoml.diffusers_simple.stable_diffusion_xl.create_runner("stabilityai/stable-diffusion-xl-base-1.0")

You can then wrap the Runner into a BentoML Service. See the BentoML documentation for more details.

Build a Bento

A Bento in BentoML is a deployable artifact with all the source code, models, data files, and dependency configurations. You can build a Bento for a supported diffusion model directly by running onediffusion build.

# Build a Bento with a Stable Diffusion model 
onediffusion build stable-diffusion

# Build a Bento with a Stable Diffusion XL model 
onediffusion build stable-diffusion-xl

To specify the model to be packaged into the Bento, use --model-id. Otherwise, OneDiffusion packages the default model into the Bento. If the model does not exist locally, OneDiffusion downloads the model automatically. In addition, the pipeline to use can also be specified through --pipeline. By default, OneDiffusion uses the text2image pipeline.

To package LoRA weights into the Bento, use the --lora-dir option to specify the directory where LoRA files are stored. These files can be dynamically loaded to the model when deployed with Docker or BentoCloud to create task-specific images.

onediffusion build stable-diffusion-xl --lora-dir "/path/to/lorafiles/dir/"

If you only have a single LoRA file to use, run the following instead:

onediffusion build stable-diffusion-xl --lora-weights "/path/to/lorafile"

Each Bento has a BENTO_TAG containing both the Bento name and the version. To customize it, specify --name and --version options.

onediffusion build stable-diffusion-xl --name sdxl --version v1
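If you script your builds, the options described above can be assembled programmatically. The following is a minimal sketch that only constructs the command line from the flags shown in this section. The function build_command is a hypothetical helper, and running the command still requires OneDiffusion to be installed.

```python
import shlex


def build_command(model, model_id=None, pipeline=None, lora_dir=None,
                  lora_weights=None, name=None, version=None):
    """Assemble an `onediffusion build` invocation as an argument list."""
    args = ["onediffusion", "build", model]
    options = (
        ("--model-id", model_id),
        ("--pipeline", pipeline),
        ("--lora-dir", lora_dir),
        ("--lora-weights", lora_weights),
        ("--name", name),
        ("--version", version),
    )
    # Only include the flags that were actually provided.
    for flag, value in options:
        if value is not None:
            args += [flag, value]
    return args


args = build_command("stable-diffusion-xl", name="sdxl", version="v1")
command_line = shlex.join(args)
print(command_line)  # onediffusion build stable-diffusion-xl --name sdxl --version v1
```

The argument list can be passed to subprocess.run(args, check=True) to run the build from a script.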

Once your Bento is ready, log in to BentoCloud and run the following command to push the Bento.

bentoml push BENTO_TAG

Alternatively, create a Docker image by containerizing the Bento with the following command. You can retrieve the BENTO_TAG by running bentoml list.

bentoml containerize BENTO_TAG

You can then deploy the image to any Docker-compatible environment.

Roadmap

We are working to improve OneDiffusion in the following ways and invite anyone who is interested in the project to participate 🤝.

  • Support more models, such as ControlNet and DeepFloyd IF
  • Support more pipelines, such as inpainting
  • Add a Python API client to interact with diffusion models
  • Implement advanced optimization like AITemplate
  • Offer a unified fine-tuning training API

Contribution

We welcome contributions of all kinds to the OneDiffusion project! Check out the following resources to start your OneDiffusion journey and stay tuned for more announcements about OneDiffusion and BentoML.
