  • Language: Jupyter Notebook
  • License: Creative Commons ...
  • Created almost 2 years ago
  • Updated over 1 year ago

What I Learned About Fine-tuning Stable Diffusion

About

Code for my tutorial What I Learned About Fine-tuning Stable Diffusion.

I copied the training scripts from the following repos and will periodically update them to the latest:

Setup

  • Python version: tested with 3.9.11 and 3.10.9 (3.8.x may run into an error)
  • PyTorch version: tested with the latest 1.13.1+cu117 (I used 1.11.0 for my old 2080 Ti by running pip install torch==1.11.0)
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
accelerate config default

Optional: install xformers and add --enable_xformers_memory_efficient_attention

pip install xformers
  • Log in to HuggingFace using your token: huggingface-cli login
  • Log in to WandB using your API key: wandb login. If you don't want to use WandB, remove --report_to=wandb from all commands below.
  • You may need to run export WANDB_DISABLE_SERVICE=true to solve this issue
  • If you have multiple GPUs, you can set the following environment variable to choose which GPU to use (default is CUDA_VISIBLE_DEVICES=0): export CUDA_VISIBLE_DEVICES=1
  • If you see FileNotFoundError: [Errno 2] No such file or directory: 'git-lfs', install git-lfs: sudo apt install git-lfs
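
The optional switches above (xformers, WandB reporting, GPU selection) can also be toggled from a small Python wrapper instead of editing each command by hand. This is only a sketch: build_launch_command and gpu_env are hypothetical helper names that are not part of this repo, and the option values are copied from the Dreambooth dog example further down.

```python
import os
import shlex


def build_launch_command(script, options, use_wandb=True, use_xformers=False):
    """Return the argv list for `accelerate launch <script> ...` (hypothetical helper)."""
    cmd = ["accelerate", "launch", script]
    for name, value in options.items():
        if value is True:
            cmd.append(f"--{name}")          # boolean switch such as --center_crop
        else:
            cmd.append(f"--{name}={value}")  # regular --key=value option
    if use_xformers:
        cmd.append("--enable_xformers_memory_efficient_attention")
    if use_wandb:
        cmd.append("--report_to=wandb")
    return cmd


def gpu_env(gpu_index, base_env=None):
    """Copy the environment and pin the process to one GPU (hypothetical helper)."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


cmd = build_launch_command(
    "train_dreambooth_lora.py",
    {
        "pretrained_model_name_or_path": "runwayml/stable-diffusion-v1-5",
        "instance_data_dir": "./data/dreambooth/dog",
        "output_dir": "./models/dreambooth-lora/dog",
        "instance_prompt": "a photo of sks dog",
        "resolution": 512,
        "max_train_steps": 500,
    },
    use_wandb=False,    # same effect as removing --report_to=wandb
    use_xformers=True,  # requires `pip install xformers`
)
print(shlex.join(cmd))
# To actually start training on the second GPU:
#   import subprocess; subprocess.run(cmd, env=gpu_env(1), check=True)
```

Passing the environment explicitly keeps the GPU choice local to one run rather than changing it for the whole shell session.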

Full SD Fine-tuning with LoRA

See docs

  • Pokemon dataset (took ~7.5 hours on a 2080 Ti and ~4.5 hours on a Tesla V100)
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export DATASET_NAME="lambdalabs/pokemon-blip-captions"
export OUTPUT_DIR="./models/lora/pokemon"

accelerate launch --mixed_precision="fp16"  train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$DATASET_NAME \
  --dataloader_num_workers=8 \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --max_train_steps=15000 \
  --learning_rate=1e-04 \
  --max_grad_norm=1 \
  --lr_scheduler="cosine" --lr_warmup_steps=0 \
  --output_dir=${OUTPUT_DIR} \
  --checkpointing_steps=500 \
  --validation_prompt="Totoro" \
  --seed=42 \
  --report_to=wandb
  • Custom dataset: a toy example using 15 photos of my cat Miles (took ~40 minutes on a Tesla V100):
export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export DATA_DIR="./data/full-finetune/cat"
export OUTPUT_DIR="./models/lora/miles"

accelerate launch --mixed_precision="fp16"  train_text_to_image_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --train_data_dir=$DATA_DIR \
  --dataloader_num_workers=8 \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --max_train_steps=1500 \
  --learning_rate=1e-04 \
  --max_grad_norm=1 \
  --lr_scheduler="cosine" --lr_warmup_steps=0 \
  --output_dir=${OUTPUT_DIR} \
  --checkpointing_steps=500 \
  --validation_prompt="A photo of a cat in a bucket" \
  --validation_epochs=10 \
  --seed=42 \
  --report_to=wandb
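
To get a feel for how the batch size, gradient accumulation, and step count in the commands above interact, the arithmetic can be sketched as below. This assumes a single GPU and that --max_train_steps counts optimizer updates (one update per gradient_accumulation_steps batches), which is my understanding of the Diffusers example scripts; training_schedule is a hypothetical helper, not something in this repo.

```python
import math


def training_schedule(num_images, train_batch_size,
                      gradient_accumulation_steps, max_train_steps):
    """Rough single-GPU schedule arithmetic (hypothetical helper)."""
    # Number of batches the dataloader yields in one pass over the data.
    batches_per_epoch = math.ceil(num_images / train_batch_size)
    # One optimizer update happens every `gradient_accumulation_steps` batches.
    updates_per_epoch = math.ceil(batches_per_epoch / gradient_accumulation_steps)
    return {
        "effective_batch_size": train_batch_size * gradient_accumulation_steps,
        "updates_per_epoch": updates_per_epoch,
        "num_epochs": math.ceil(max_train_steps / updates_per_epoch),
    }


# Values from the cat example: 15 photos, batch size 1, accumulation 4, 1500 steps.
cat = training_schedule(
    num_images=15,
    train_batch_size=1,
    gradient_accumulation_steps=4,
    max_train_steps=1500,
)
print(cat)
```

Under these assumptions the cat run performs 4 updates per epoch, so --validation_epochs=10 would trigger a validation image roughly every 40 updates.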

Dreambooth with LoRA

Fine-tune using Dreambooth with LoRA and your own dataset (took 4 min 39 sec on a V100).

Dog example (data from the paper):

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/dog"
export OUTPUT_DIR="./models/dreambooth-lora/dog"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=20 \
  --seed=42 \
  --report_to="wandb"

TODO: Dog example with xformers:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/dog"
export OUTPUT_DIR="./models/dreambooth-lora/dog"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks dog in a bucket" \
  --validation_epochs=20 \
  --enable_xformers_memory_efficient_attention \
  --seed=42 \
  --report_to="wandb"

Sunglasses example:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/glasses"
export OUTPUT_DIR="./models/dreambooth-lora/sunglasses"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks sunglasses" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=500 \
  --validation_prompt="A photo of sks sunglasses with eiffel tower" \
  --validation_epochs=20 \
  --seed=42 \
  --report_to="wandb"

My cat Miles example:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/cat"
export OUTPUT_DIR="./models/dreambooth-lora/miles"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks cat" \
  --resolution=512 --center_crop \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1500 \
  --validation_prompt="A photo of a sks cat in a bucket" \
  --validation_epochs=10 \
  --seed=42 \
  --report_to="wandb"

With class prompt (class images generated by the model) and prior preservation (with weight 0.5):

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/cat"
export CLASS_DIR="./data/dreambooth/cat-class"
export OUTPUT_DIR="./models/dreambooth-lora/miles"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks cat" \
  --class_prompt="a photo of a cat" \
  --with_prior_preservation --prior_loss_weight=0.5 \
  --resolution=512 \
  --train_batch_size=2 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --report_to="wandb" \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1500 \
  --validation_prompt="A photo of sks cat in a bucket" \
  --num_class_images=200 \
  --validation_epochs=10 \
  --seed=42
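
The role of --prior_loss_weight can be illustrated with a toy calculation. This is a simplified sketch on plain Python lists, assuming the total loss is the instance loss plus the weighted loss on the generated class images (as described for Dreambooth prior preservation); the function names are hypothetical and the real script works on tensors of noise predictions.

```python
def mse(pred, target):
    """Mean squared error over two equal-length lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(instance_pred, instance_target,
                    class_pred, class_target, prior_loss_weight=0.5):
    """Instance loss plus weighted prior-preservation loss (toy version)."""
    instance_loss = mse(instance_pred, instance_target)  # "sks cat" photos
    prior_loss = mse(class_pred, class_target)           # generated "cat" class images
    return instance_loss + prior_loss_weight * prior_loss


loss = dreambooth_loss(
    instance_pred=[0.0, 2.0], instance_target=[0.0, 0.0],  # instance MSE = 2.0
    class_pred=[1.0, 1.0], class_target=[0.0, 0.0],        # prior MSE = 1.0
    prior_loss_weight=0.5,
)
print(loss)  # 2.0 + 0.5 * 1.0 = 2.5
```

A larger weight pushes the model to keep generating ordinary cats for the class prompt, at the cost of fitting the instance photos a little less tightly.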

Miss Dong example:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/missdong"
export OUTPUT_DIR="./models/dreambooth-lora/missdong"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks lady" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=1200 \
  --validation_prompt="oil painting of sks lady by the ocean" \
  --validation_epochs=20 \
  --seed=42 \
  --report_to="wandb"

Another example:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/david-beckham"
export OUTPUT_DIR="./models/dreambooth-lora/david-beckham"

accelerate launch train_dreambooth_lora.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of dbsks man" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --checkpointing_steps=100 \
  --learning_rate=1e-4 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=700 \
  --validation_prompt="A photo of dbsks man, detailed faces, highres, RAW photo 8k uhd, dslr" \
  --validation_epochs=10 \
  --seed=42 \
  --report_to="wandb"

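To add prior preservation to any of the commands above, include these flags (together with --class_data_dir and --class_prompt, as in the class-prompt example above):

--with_prior_preservation --prior_loss_weight=1.0 \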

Generate images using LoRA weights:

python generate-lora.py --prompt "a dog standing on the great wall" --model_path "./models/dreambooth-lora/dog" --output_folder "./outputs" --steps 50
python generate-lora.py --prompt "a sks dog standing on the great wall" --model_path "./models/dreambooth-lora/dog" --output_folder "./outputs"
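
For readers who want to see roughly what such a generation script involves, here is a minimal sketch. It is not the repo's generate-lora.py: the argument names mirror the commands above, but the default of 50 steps, the output file name, and the base model are assumptions. It loads the LoRA attention weights with unet.load_attn_procs, which matches the Diffusers releases of that period; newer releases favor pipe.load_lora_weights instead.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse the same flags used in the commands above (defaults are assumptions)."""
    parser = argparse.ArgumentParser(description="Generate an image with LoRA weights (sketch).")
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--model_path", required=True)
    parser.add_argument("--output_folder", default="./outputs")
    parser.add_argument("--steps", type=int, default=50)
    return parser.parse_args(argv)


def generate(args, base_model="runwayml/stable-diffusion-v1-5"):
    """Load the base pipeline, attach the LoRA weights, and save one image."""
    # Imported lazily so that argument parsing works without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.float16)
    pipe.unet.load_attn_procs(args.model_path)  # LoRA weights saved by the training script
    pipe.to("cuda")

    image = pipe(args.prompt, num_inference_steps=args.steps).images[0]
    out_dir = Path(args.output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    out_path = out_dir / "lora-sample.png"  # hypothetical file name
    image.save(out_path)
    return out_path


# Example (needs a CUDA GPU and the trained weights):
#   generate(parse_args(["--prompt", "a sks dog standing on the great wall",
#                        "--model_path", "./models/dreambooth-lora/dog"]))
```

Comparing the same prompt with and without the sks token, as the two commands above do, is a quick way to check that the LoRA weights learned the subject rather than changing all dogs.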

Dreambooth

See blog and docs

Dog example without prior-preservation loss (~7 mins on V100):

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/dog"
export OUTPUT_DIR="./models/dreambooth/dog"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks dog" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400 \
  --report_to="wandb"

Miss Dong example:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/missdong"
export OUTPUT_DIR="./models/dreambooth/missdong"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of sks lady" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --max_train_steps=400 \
  --report_to="wandb"

Generate images using Dreambooth models:

python generate-dreambooth.py --prompt "a dog standing on the great wall" --model_path "./models/dreambooth/dog" --output_folder "./outputs" --steps 50
python generate-dreambooth.py --prompt "a sks dog standing on the great wall" --model_path "./models/dreambooth/dog" --output_folder "./outputs/dreambooth"
python generate-dreambooth.py --prompt "a sks dog swimming"

Fine-tuning Stable Diffusion with LoRA PTI

Use this repo: pip install git+https://github.com/cloneofsimo/lora.git

Use LoRA PTI

Try a 3D avatar style:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export INSTANCE_DIR="./data/dreambooth/3d-avatar"
export OUTPUT_DIR="./models/dreambooth/3d-avatar"

lora_pti \
  --pretrained_model_name_or_path=$MODEL_NAME  \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --train_text_encoder \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --scale_lr \
  --learning_rate_unet=1e-4 \
  --learning_rate_text=1e-5 \
  --learning_rate_ti=5e-4 \
  --color_jitter \
  --lr_scheduler="linear" \
  --lr_warmup_steps=0 \
  --placeholder_tokens="<s1>|<s2>" \
  --use_template="style" \
  --save_steps=100 \
  --max_train_steps_ti=1000 \
  --max_train_steps_tuning=1000 \
  --perform_inversion=True \
  --clip_ti_decay \
  --weight_decay_ti=0.000 \
  --weight_decay_lora=0.001 \
  --continue_inversion \
  --continue_inversion_lr=1e-4 \
  --device="cuda:0" \
  --lora_rank=1
#  optional: add --use_face_segmentation_condition to the command above

To use the trained LoRA weights in WebUI, you need to merge them with a base model:

lora_add runwayml/stable-diffusion-v1-5 ./models/dreambooth/3d-avatar/final_lora.safetensors ./output_merged.ckpt 0.7 --mode upl-ckpt-v2
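
Conceptually, merging folds the low-rank update into each affected weight matrix. The sketch below assumes the 0.7 in the command is the merge strength alpha and that merging computes W + alpha * (up @ down); this is my reading of how LoRA merging works rather than a copy of the library's code, and the tiny matrices are made up for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_lora(weight, up, down, alpha):
    """Return weight + alpha * (up @ down) without modifying the inputs."""
    delta = matmul(up, down)
    return [
        [w + alpha * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# A 2x2 base weight with a rank-1 update (matching --lora_rank=1 above).
weight = [[1.0, 0.0], [0.0, 1.0]]
up = [[1.0], [2.0]]    # shape 2x1
down = [[10.0, 20.0]]  # shape 1x2
merged = merge_lora(weight, up, down, alpha=0.7)
print(merged)
```

With rank 1 the update stores only 2 + 2 numbers for a 2x2 matrix; for the much larger attention matrices in the UNet, this is why the LoRA file stays small. Lowering alpha weakens the learned style and raising it strengthens it.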

Convert Diffusers LoRA Weights for Automatic1111 WebUI

The LoRA weights trained using Diffusers are saved in .bin or .pkl format, which must be converted to be used in Automatic1111 WebUI (see here for detailed discussions).

The trained LoRA weights are stored in custom_checkpoint_0.pkl or pytorch_model.bin.

convert-to-safetensors.py can be used to convert .bin or .pkl files into the .safetensors format, which can be used in WebUI (just put the converted file in the WebUI models/Lora folder). The script is adapted from the one written by ignacfetser.

Put this script in the same folder as the .bin or .pkl file and run python convert-to-safetensors.py --file checkpoint_file
