  • Stars
    107
  • Rank 312,566 (Top 7 %)
  • Language
    Jupyter Notebook
  • License
    Apache License 2.0
  • Created over 1 year ago
  • Updated about 1 year ago

Repository Details

Showing various ways to serve Keras-based Stable Diffusion

Various ways of serving Stable Diffusion

This repository shows various ways to deploy Stable Diffusion. Currently, we focus on the Stable Diffusion implementation from keras-cv, and the target platforms/frameworks include TF Serving, Hugging Face Endpoint, and FastAPI.

As of the 0.4.0 release of keras-cv, StableDiffusionV2 is included, and this repository supports both version 1 and version 2 of Stable Diffusion.

1. All in One Endpoint

This method shows how to deploy Stable Diffusion as a whole in a single endpoint. Stable Diffusion consists of three models (encoder, diffusion model, decoder) and some glue code to handle the inputs and outputs of each model. In this scenario, everything is packaged into a single endpoint.

  • Hugging Face 🤗 Endpoint: To deploy something on Hugging Face Endpoint, we need to create a custom handler. Hugging Face Endpoint lets us easily deploy any machine learning model with pre/post-processing logic in a custom handler. [Colab | Standalone Codebase]

  • FastAPI Endpoint: [Colab | Standalone]

    • Docker Image: gcr.io/gcp-ml-172005/sd-fastapi-allinone:latest
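
As a rough illustration of the custom handler idea, the sketch below follows the `EndpointHandler` convention (`__init__(path)` and `__call__(data)`) used by Hugging Face custom handlers and wraps the keras-cv `StableDiffusion` model. It is not the handler from this repository: the `model` argument, the `batch_size` request field, and the base64 response layout are assumptions made for illustration.

```python
import base64
from typing import Any, Dict, Optional


class EndpointHandler:
    """Sketch of an all-in-one handler: one call runs the whole pipeline."""

    def __init__(self, path: str = "", model: Optional[Any] = None):
        # `model` is a hypothetical hook so the handler can be exercised
        # with a stand-in object; the endpoint itself only passes `path`.
        if model is None:
            # Imported lazily so this module loads without keras-cv installed.
            import keras_cv

            model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)
        self.model = model

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        prompt = data["inputs"]
        batch_size = int(data.get("batch_size", 1))  # assumed request field
        # keras-cv returns a uint8 array shaped (batch, height, width, 3).
        images = self.model.text_to_image(prompt, batch_size=batch_size)
        # Assumed response layout: raw pixel bytes as base64 plus the shape
        # the client needs to rebuild the array.
        return {
            "images": base64.b64encode(images.tobytes()).decode("utf-8"),
            "shape": list(images.shape),
        }
```

A client would decode `images` from base64 and reshape the bytes with `shape` to recover the generated images.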

2. Three Endpoints

This method shows how to deploy Stable Diffusion in three separate endpoints. As preliminary work, this notebook was written to demonstrate how to split Stable Diffusion into three separate modules. In this example, you will see how to interact with three different endpoints to generate images from a given text prompt.
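
To make the interaction concrete, here is a minimal client sketch that chains three endpoints: text encoder, then diffusion model, then decoder. The URLs, the `prompt`, `batch_size`, and `images` field names, and the idea of forwarding each response as the next request are all hypothetical; the real request and response schemas are defined by the linked code.

```python
import json
import urllib.request
from typing import Any, Callable, Dict

# Placeholder URLs; replace them with the addresses of your own deployments.
ENDPOINTS = {
    "text_encoder": "http://localhost:8001/predict",
    "diffusion_model": "http://localhost:8002/predict",
    "decoder": "http://localhost:8003/predict",
}

PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def post_json(url: str, payload: Dict[str, Any]) -> Dict[str, Any]:
    """Send a JSON POST request and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def generate(prompt: str, batch_size: int = 1, post: PostFn = post_json) -> Any:
    """Run text encoding, diffusion, and decoding as three network calls."""
    # 1) Text encoder: prompt in, conditioning tensors out.
    encoded = post(ENDPOINTS["text_encoder"], {"prompt": prompt})
    # 2) Diffusion model: forward whatever the encoder returned.
    latents = post(ENDPOINTS["diffusion_model"], {**encoded, "batch_size": batch_size})
    # 3) Decoder: latents in, images out.
    decoded = post(ENDPOINTS["decoder"], latents)
    return decoded["images"]
```

Passing `post` as a parameter keeps the orchestration independent of the transport, so the same function can be tried against stand-in endpoints before anything is deployed.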

  • Hugging Face Endpoint: [Colab | Text Encoder | Diffusion Model | Decoder]

  • FastAPI Endpoint: [Central | Text Encoder | Diffusion Model | Decoder]

    • Docker Image(text-encoder): gcr.io/gcp-ml-172005/sd-fastapi-text-encoder:latest
    • Docker Image(diffusion-model): gcr.io/gcp-ml-172005/sd-fastapi-diffusion-model:latest
    • Docker Image(decoder): gcr.io/gcp-ml-172005/sd-fastapi-decoder:latest
  • TF Serving Endpoint: [Colab | Dockerfiles + k8s Resources]

    • SavedModel: [Colab | Text Encoder | Diffusion Model | Decoder]
      • Wraps the encoder, diffusion model, and decoder, along with some glue code, in separate SavedModels. With them, we can not only deploy each model in the cloud with TF Serving but also embed them in web and mobile applications with TFJS and TFLite. We will explore the embedded use cases in a later phase of this project.
    • Docker Images
      • text-encoder: gcr.io/gcp-ml-172005/tfs-sd-text-encoder:latest
      • text-encoder w/ base64: gcr.io/gcp-ml-172005/tfs-sd-text-encoder-base64:latest
      • text-encoder-v2: gcr.io/gcp-ml-172005/tfs-sd-text-encoder-v2:latest
      • text-encoder-v2 w/ base64: gcr.io/gcp-ml-172005/tfs-sd-text-encoder-v2-base64:latest
      • diffusion-model: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model:latest
      • diffusion-model w/ base64: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model-base64:latest
      • diffusion-model-v2: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model-v2:latest
      • diffusion-model-v2 w/ base64: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model-v2-base64:latest
      • decoder: gcr.io/gcp-ml-172005/tfs-sd-decoder:latest
      • decoder w/ base64: gcr.io/gcp-ml-172005/tfs-sd-decoder-base64:latest
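
Once one of these images is running, the model can be queried through the TF Serving REST API, which accepts `POST /v1/models/<name>:predict` with a JSON body. The sketch below only builds and sends such a request; the model name `text-encoder` and the `tokens` input key in the usage note are hypothetical and have to match the actual SavedModel signature.

```python
import json
import urllib.request
from typing import Any, List


def build_predict_request(
    host: str,
    model_name: str,
    instances: List[Any],
    signature_name: str = "serving_default",
) -> urllib.request.Request:
    """Build a TF Serving REST predict request in row ("instances") format."""
    url = f"http://{host}/v1/models/{model_name}:predict"
    body = {"signature_name": signature_name, "instances": instances}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(request: urllib.request.Request) -> List[Any]:
    """Send the request and return the "predictions" field of the response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"]
```

For example, `build_predict_request("localhost:8501", "text-encoder", [{"tokens": [1, 2, 3]}])` targets a container that exposes the default REST port 8501.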

NOTE: Passing intermediate values between models over the network can be costly, and some platforms limit the payload size. For instance, Vertex AI limits the request size to 1.5MB. To this end, we provide different TF Serving Docker images that handle inputs and produce outputs in base64 format.
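
To see why base64 helps, the sketch below compares two ways of putting a latent tensor into a JSON request: as a list of decimal numbers and as base64-encoded float32 bytes. It uses only the standard library, fills the tensor with dummy values, and assumes a 4 x 64 x 64 x 4 latent (four 512x512 images). The 1.5MB limit is treated as 1,500,000 bytes, and the `latent` key is hypothetical; the exact wire format of the base64 images is defined by their SavedModel signatures, not by this snippet.

```python
import array
import base64
import json

REQUEST_LIMIT_BYTES = 1_500_000  # 1.5MB, read here as 1.5 * 10^6 bytes (assumption)


def encode_tensor(values: array.array) -> str:
    """Serialize float32 values as a base64 string of their raw bytes."""
    return base64.b64encode(values.tobytes()).decode("ascii")


def decode_tensor(payload: str) -> array.array:
    """Inverse of encode_tensor: base64 string back to float32 values."""
    values = array.array("f")
    values.frombytes(base64.b64decode(payload))
    return values


# Dummy latent batch: 4 x 64 x 64 x 4 float32 values, flattened.
latents = array.array("f", (((i * 37) % 1000) / 997.0 for i in range(4 * 64 * 64 * 4)))

as_json_list = json.dumps({"latent": latents.tolist()})
as_base64 = json.dumps({"latent": encode_tensor(latents)})

print("JSON list payload:", len(as_json_list), "bytes")
print("base64 payload:   ", len(as_base64), "bytes")
print("base64 fits limit:", len(as_base64) < REQUEST_LIMIT_BYTES)
```

The base64 form costs about 4/3 of the raw tensor size, whereas a decimal list spends many characters per value, so the gap grows with batch size.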

3. One Endpoint with Two local APIs (w/ 🤗 Endpoint)

With Stable Diffusion split into separate parts, we can place each part in any environment. This is especially powerful when we want to deploy specialized diffusion models, such as inpainting or fine-tuned diffusion models. In that case, we only need to replace the currently deployed diffusion model, or deploy a new diffusion model alongside it, while keeping the other two (text encoder and decoder) as is.

It is also worth noting that we can run the text encoder and decoder locally (Python clients or web/mobile with TF Serving) while hosting the diffusion model in the cloud. In this repository, we currently show an example using Hugging Face 🤗 Endpoint. However, you could easily expand the possibilities.
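
The swap described above can be pictured as a pipeline of three interchangeable callables. The sketch below is only an illustration of that design: every function in it is a stand-in that returns strings, where a real setup would run the text encoder and decoder locally and call a remote diffusion endpoint.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable


@dataclass(frozen=True)
class Pipeline:
    """Three stages; each can live locally or behind a remote endpoint."""

    encode_text: Callable[[str], Any]
    diffuse: Callable[[Any], Any]
    decode: Callable[[Any], Any]

    def __call__(self, prompt: str) -> Any:
        return self.decode(self.diffuse(self.encode_text(prompt)))


# Stand-in stages for illustration only.
def local_text_encoder(prompt: str) -> dict:
    return {"context": prompt.lower()}


def remote_base_diffusion(encoded: dict) -> dict:
    return {"latent": f"base({encoded['context']})"}


def remote_finetuned_diffusion(encoded: dict) -> dict:
    return {"latent": f"finetuned({encoded['context']})"}


def local_decoder(latents: dict) -> str:
    return f"image<{latents['latent']}>"


base = Pipeline(local_text_encoder, remote_base_diffusion, local_decoder)
# Swap only the diffusion stage; the encoder and decoder stay as they are.
finetuned = replace(base, diffuse=remote_finetuned_diffusion)
```

Because only `diffuse` changes, a fine-tuned or inpainting model can be rolled out without touching the other two stages.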

NOTE: Along with this project, we have developed another project for fine-tuning Keras-based Stable Diffusion: Fine-tuning Stable Diffusion using Keras. We currently provide a model fine-tuned on a Pokemon dataset.

  • Original txt2img generation: [Colab]

  • Original inpainting: [Colab]

4. On-Device Deployment (w/ TFLite) - WIP

We have managed to convert SavedModels into TFLite models, and we are hosting them as below (thanks to @farmaker47):

These TFLite models have the same signatures as the SavedModels, and all the pre/post operations are included inside. All of them were converted with float16 quantization. You can find more about how to convert SavedModels to TFLite models in this repository.
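
For reference, post-training float16 quantization with the TensorFlow Lite converter usually looks like the sketch below. It is not the conversion script from the linked repository: the function name and paths are hypothetical, and the Stable Diffusion SavedModels may need additional converter settings (for example, enabling select TF ops) that are not shown here.

```python
def convert_to_float16_tflite(saved_model_dir: str, output_path: str) -> int:
    """Convert a SavedModel to a float16-quantized TFLite file.

    Returns the size of the written model in bytes.
    """
    # Imported lazily; calling this function requires TensorFlow 2.x.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Enable default optimizations and store weights as float16.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    tflite_model = converter.convert()

    with open(output_path, "wb") as f:
        f.write(tflite_model)
    return len(tflite_model)
```

Float16 quantization roughly halves the size of float32 weights, which matters for on-device deployment.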

TODO

  • Implement SimpleTokenizer in Java and JavaScript
  • Run TFLite models on Android and in a web browser

Timing Tests

Sequential

The figure below shows how long each scenario took from text encoding through diffusion to decoding. It assumes each request (batch_size=4) is handled sequentially, with a single server running on Hugging Face Endpoint for each endpoint. The all-in-one endpoint deployed Stable Diffusion on an A10-equipped server, while the separate endpoints deployed the text encoder on 2 vCPU + 4GB RAM, the diffusion model on an A10-equipped server, and the decoder on a T4-equipped server. Finally, the one-endpoint, two-local setup deployed only the diffusion model on an A10-equipped server while keeping the other two in a Colab environment (w/ T4). See this notebook for how these were measured.
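
One simple way to collect this kind of per-stage timing is to wrap each stage call with a wall-clock timer, as sketched below. This is not the measurement code from the notebook; the helper name and the stage list in the usage note are hypothetical.

```python
import time
from typing import Any, Callable, Dict, List, Tuple

Stage = Tuple[str, Callable[[Any], Any]]


def time_stages(stages: List[Stage], first_input: Any) -> Tuple[Any, Dict[str, float]]:
    """Run stages one after another and record the seconds spent in each."""
    timings: Dict[str, float] = {}
    value = first_input
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)  # includes network latency for remote stages
        timings[name] = time.perf_counter() - start
    timings["total"] = sum(timings.values())
    return value, timings
```

Calling `time_stages([("text_encoder", encode), ("diffusion_model", diffuse), ("decoder", decode)], prompt)` with three callables of your own returns the final images together with the seconds spent in each stage and in total.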

🚨 XLA support

In this notebook, we show how we can XLA-compile the SavedModels to achieve a speed-up of about 52% over the non-XLA variant.

Acknowledgements

Thanks to the ML Developer Programs' team at Google for providing GCP credits.
