  • Stars: 110
  • Rank 316,770 (Top 7 %)
  • Language: Jupyter Notebook
  • License: Apache License 2.0
  • Created almost 2 years ago
  • Updated over 1 year ago

Repository Details

Showing various ways to serve Keras-based Stable Diffusion

Various ways of serving Stable Diffusion

This repository shows various ways to deploy Stable Diffusion. Currently, we focus on the Stable Diffusion implementation from keras-cv, and the target platforms and frameworks include TF Serving, Hugging Face Endpoint, and FastAPI.

As of the keras-cv 0.4.0 release, StableDiffusionV2 is included, and this repository supports both version 1 and version 2 of Stable Diffusion.

1. All in One Endpoint

This method shows how to deploy Stable Diffusion as a whole in a single endpoint. Stable Diffusion consists of three models (text encoder, diffusion model, decoder) and some glue code that handles the inputs and outputs of each model. In this scenario, everything is packaged into a single endpoint.

  • Hugging Face 🤗 Endpoint: To deploy something on Hugging Face Endpoint, we need to create a custom handler. Hugging Face Endpoint lets us easily deploy any machine learning model with pre/post-processing logic in a custom handler [Colab | Standalone Codebase]

  • FastAPI Endpoint: [Colab | Standalone]

    • Docker Image: gcr.io/gcp-ml-172005/sd-fastapi-allinone:latest
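As a rough illustration of the custom-handler idea above, here is a minimal sketch, not the repository's actual handler. It assumes the endpoint runtime instantiates a class named `EndpointHandler` with a `path` argument and calls it with a dict holding the prompt under `"inputs"`. The `batch_size` request key, the `model` injection hook, and the response layout are hypothetical choices made for this example.

```python
import base64
from typing import Any, Dict, Optional


class EndpointHandler:
    """Sketch of a handler that serves the whole pipeline from one endpoint."""

    def __init__(self, path: str = "", model: Optional[Any] = None):
        # `model` is a hypothetical hook for injecting a stub in tests;
        # the endpoint runtime is assumed to pass only `path`.
        if model is None:
            import keras_cv  # imported lazily so the sketch loads without it

            model = keras_cv.models.StableDiffusion(img_width=512, img_height=512)
        self.model = model

    def __call__(self, data: Dict[str, Any]) -> Dict[str, Any]:
        prompt = data["inputs"]
        batch_size = int(data.get("batch_size", 1))  # hypothetical request key
        # text_to_image runs text encoding, diffusion, and decoding in one
        # call and returns a uint8 array shaped (batch, height, width, 3).
        images = self.model.text_to_image(prompt, batch_size=batch_size)
        return {
            "shape": list(images.shape),
            "images_b64": base64.b64encode(images.tobytes()).decode("utf-8"),
        }
```

A client would decode `images_b64` and reshape the bytes using `shape`. Returning encoded image files instead of raw bytes would shrink the response, at the cost of an extra dependency.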

2. Three Endpoints

This method shows how to deploy Stable Diffusion across three separate endpoints. As preliminary work, this notebook was written to demonstrate how to split Stable Diffusion into three separate modules. In this example, you will see how to interact with three different endpoints to generate images from a given text prompt.
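The client-side flow across three endpoints can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the notebook: the `generate` and `http_stage` helpers are hypothetical, and the `{"inputs": ...}` request envelope is an assumption about the endpoint contract.

```python
import json
import urllib.request
from typing import Any, Callable

Stage = Callable[[Any], Any]


def http_stage(url: str, timeout: float = 300.0) -> Stage:
    """Return a stage that POSTs its input as JSON to `url`.

    The {"inputs": ...} envelope is an assumption for this sketch.
    """

    def call(payload: Any) -> Any:
        body = json.dumps({"inputs": payload}).encode("utf-8")
        request = urllib.request.Request(
            url, data=body, headers={"Content-Type": "application/json"}
        )
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read())

    return call


def generate(prompt: str, encode_text: Stage, diffuse: Stage, decode: Stage) -> Any:
    """Chain the three parts: prompt -> context -> latent -> image."""
    context = encode_text(prompt)  # text encoder endpoint
    latent = diffuse(context)      # diffusion model endpoint
    return decode(latent)          # decoder endpoint
```

Because each stage is just a callable, any of the three can be a remote endpoint built with `http_stage` or a local function, which keeps the orchestration code unchanged when a part is moved or replaced.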

  • Hugging Face Endpoint: [Colab | Text Encoder | Diffusion Model | Decoder]

  • FastAPI Endpoint: [Central | Text Encoder | Diffusion Model | Decoder]

    • Docker Image(text-encoder): gcr.io/gcp-ml-172005/sd-fastapi-text-encoder:latest
    • Docker Image(diffusion-model): gcr.io/gcp-ml-172005/sd-fastapi-diffusion-model:latest
    • Docker Image(decoder): gcr.io/gcp-ml-172005/sd-fastapi-decoder:latest
  • TF Serving Endpoint: [Colab | Dockerfiles + k8s Resources]

    • SavedModel: [Colab | Text Encoder | Diffusion Model | Decoder]
      • Wraps the encoder, diffusion model, and decoder, along with some glue code, in separate SavedModels. With them, we can not only deploy each model on the cloud with TF Serving but also embed them in web and mobile applications with TFJS and TFLite. We will explore the embedded use cases in a later phase of this project.
    • Docker Images
      • text-encoder: gcr.io/gcp-ml-172005/tfs-sd-text-encoder:latest
      • text-encoder w/ base64: gcr.io/gcp-ml-172005/tfs-sd-text-encoder-base64:latest
      • text-encoder-v2: gcr.io/gcp-ml-172005/tfs-sd-text-encoder-v2:latest
      • text-encoder-v2 w/ base64: gcr.io/gcp-ml-172005/tfs-sd-text-encoder-v2-base64:latest
      • diffusion-model: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model:latest
      • diffusion-model w/ base64: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model-base64:latest
      • diffusion-model-v2: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model-v2:latest
      • diffusion-model-v2 w/ base64: gcr.io/gcp-ml-172005/tfs-sd-diffusion-model-v2-base64:latest
      • decoder: gcr.io/gcp-ml-172005/tfs-sd-decoder:latest
      • decoder w/ base64: gcr.io/gcp-ml-172005/tfs-sd-decoder-base64:latest

NOTE: Passing intermediate values between models over the network can be costly, and some platforms limit the payload size. For instance, Vertex AI limits the request size to 1.5MB. For this reason, we provide alternative TF Serving Docker images that accept inputs and produce outputs in base64 format.

3. One Endpoint with Two local APIs (w/ 🤗 Endpoint)
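To see why a base64 payload helps, the sketch below compares the size of a float tensor serialized as a JSON list with the same values packed as float32 bytes and base64-encoded. The helper names and the tensor size are illustrative assumptions, and the actual wire format of the base64 Docker images may differ.

```python
import base64
import json
import struct
from typing import List


def encode_floats(values: List[float]) -> str:
    """Pack values as little-endian float32 and base64-encode the bytes."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_floats(payload: str) -> List[float]:
    """Inverse of encode_floats (values come back at float32 precision)."""
    raw = base64.b64decode(payload)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


# Illustrative tensor: 4096 values, not a shape taken from the repository.
values = [i / 7.0 for i in range(4096)]
as_json = json.dumps(values)
as_b64 = encode_floats(values)
print(f"JSON list: {len(as_json)} chars, base64 float32: {len(as_b64)} chars")
```

Base64 adds roughly one third on top of the raw bytes, which is still far smaller than spelling out each float as decimal text.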

With Stable Diffusion separated into parts, we can place each part in any environment. This is especially powerful if we want to deploy specialized diffusion models, such as inpainting or fine-tuned diffusion models. In that case, we only need to replace the currently deployed diffusion model, or deploy a new diffusion model alongside it, while keeping the other two (text encoder and decoder) as is.

It is also worth noting that we can run the text encoder and decoder locally (Python clients, or web/mobile with TF Serving) while keeping the diffusion model on the cloud. In this repository, we currently show an example using Hugging Face 🤗 Endpoint. However, you can easily expand the possibilities.

NOTE: Along with this project, we have developed another project for fine-tuning Keras-based Stable Diffusion: Fine-tuning Stable Diffusion using Keras. We currently provide a model fine-tuned on a Pokemon dataset.

  • Original txt2img generation: [Colab]

  • Original inpainting: [Colab]

4. On-Device Deployment (w/ TFLite) - WIP

We have managed to convert SavedModels into TFLite models, and we are hosting them as below (thanks to @farmaker47):

These TFLite models have the same signatures as the SavedModels, and all the pre/post-processing operations are included inside. All of them are converted with float16 quantization. You can find more about how to convert SavedModels to TFLite models in this repository.
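For orientation, float16 post-training quantization of a SavedModel with the TensorFlow Lite converter generally looks like the sketch below. It is a generic recipe, not the conversion script from the linked repository. The function name and paths are hypothetical, and the Stable Diffusion SavedModels may need additional converter settings that are not shown here.

```python
def convert_to_float16_tflite(saved_model_dir: str, output_path: str) -> int:
    """Convert a SavedModel to a float16-quantized TFLite file.

    Returns the number of bytes written. Requires TensorFlow, which is
    imported lazily so that defining the function has no dependencies.
    """
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    # Enable default optimizations and restrict weights to float16.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.target_spec.supported_types = [tf.float16]
    tflite_model = converter.convert()

    with open(output_path, "wb") as f:
        return f.write(tflite_model)
```

A call such as `convert_to_float16_tflite("text_encoder_savedmodel", "text_encoder.tflite")` would be repeated once per SavedModel; both names here are placeholders.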

TODO

  • Implement SimpleTokenizer in JAVA and JavaScript
  • Run TFLite models on Android and Web browser

Timing Tests

Sequential

The figure below shows how long each scenario took from text encoding through diffusion to decoding. It assumes each request (batch_size=4) is handled sequentially, with a single server running on Hugging Face Endpoint for each endpoint. The all-in-one endpoint deployed Stable Diffusion on an A10-equipped server, while the separate endpoints deployed the text encoder on 2 vCPU + 4GB RAM, the diffusion model on an A10-equipped server, and the decoder on a T4-equipped server. Finally, the one-endpoint, two-local setup deployed only the diffusion model on an A10-equipped server while keeping the other two in a Colab environment (w/ T4). See this notebook for how these were measured.
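A sequential measurement of this kind can be approximated with a small timing helper like the one below. It is a sketch under the assumption that each stage is a plain callable; the `time_stages` helper is hypothetical and is not taken from the notebook.

```python
import time
from typing import Any, Callable, Dict, Sequence, Tuple


def time_stages(
    stages: Sequence[Tuple[str, Callable[[Any], Any]]], first_input: Any
) -> Tuple[Any, Dict[str, float]]:
    """Run stages one after another and record wall-clock seconds per stage."""
    timings: Dict[str, float] = {}
    value = first_input
    for name, stage in stages:
        start = time.perf_counter()
        value = stage(value)  # output of one stage feeds the next
        timings[name] = time.perf_counter() - start
    timings["total"] = sum(timings.values())
    return value, timings
```

For remote endpoints, each per-stage number includes network transfer as well as inference, which is part of what separates the scenarios being compared.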

🚨 XLA support

In this notebook, we show how we can XLA-compile the SavedModels to achieve a speed-up of about 52% over the non-XLA variant.

Acknowledgements

Thanks to the ML Developer Programs' team at Google for providing GCP credits.
