T2I-Adapter-for-Diffusers

Transfer the T2I-Adapter with any basemodel in diffusers🔥

T2I-Adapter is a simple and small (~70M parameters, ~300MB on disk) network that provides extra guidance to pre-trained text-to-image models while keeping the original large text-to-image model frozen. This repository provides the simplest tutorial code for using T2I-Adapter with diverse base models in the diffusers framework. It is very similar to ControlNet.
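Conceptually, the adapter converts the condition image (e.g. a segmentation map or a keypose image) into multi-scale features that are added to the frozen UNet encoder's feature maps. The toy sketch below only illustrates that idea and is not the actual diffusers implementation; ToyAdapter and its layer sizes are made up for the example.

import torch
import torch.nn as nn

# Toy illustration of the T2I-Adapter idea (NOT the real implementation):
# a small trainable network maps a condition image to one feature map per
# UNet encoder scale; each is simply added to the frozen UNet's features.
class ToyAdapter(nn.Module):
    def __init__(self, channels=(320, 640, 1280, 1280)):
        super().__init__()
        convs, cin = [], 3
        for cout in channels:
            convs.append(nn.Conv2d(cin, cout, kernel_size=3, stride=2, padding=1))
            cin = cout
        self.convs = nn.ModuleList(convs)

    def forward(self, cond):  # cond: (B, 3, H, W) condition image
        feats, x = [], cond
        for conv in self.convs:
            x = conv(x)
            feats.append(x)  # one feature map per UNet encoder scale
        return feats

adapter = ToyAdapter()
guidance = adapter(torch.randn(1, 3, 512, 512))
# during denoising, conceptually: unet_encoder_feature[i] += guidance[i]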

We also support Lora-for-Diffusers and ControlNet-for-Diffusers.

T2I-Adapter + Stable-Diffusion-1.5

As T2I-Adapter only trains the adapter layers and keeps the stable-diffusion model frozen, you can flexibly use any stable-diffusion model as the base. Here, stable-diffusion-1.5 is used as an example; a sketch of reusing the adapter with a different base appears after the loading code below.

Download adapter weights

# download the pre-trained adapter weights (keypose, seg, sketch)
mkdir models && cd models
wget https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_keypose_sd14v1.pth
wget https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_seg_sd14v1.pth
wget https://huggingface.co/TencentARC/T2I-Adapter/resolve/main/models/t2iadapter_sketch_sd14v1.pth
cd ..

Install packages

# please use this dev version of diffusers, as it supports the new adapter pipeline
git clone https://github.com/HimariO/diffusers-t2i-adapter.git
cd diffusers-t2i-adapter
git checkout general-adapter

# manually edit ./src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_adapter.py:
# in StableDiffusionAdapterPipeline, comment out the following lines
# (the adapter is attached manually after the pipeline is created):
#     adapter: Adapter,
#     self.register_modules(adapter=adapter)

# then install from source
pip install .
cd ..
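A quick way to confirm the patched fork installed correctly is to try the imports that the loading code below relies on:

python -c "from diffusers import StableDiffusionAdapterPipeline, Adapter; print('ok')"

If this fails, double-check that you checked out the general-adapter branch before running pip install.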

Load adapter weights

import torch
from typing import Dict, Tuple
from diffusers.utils import load_image
from diffusers import StableDiffusionAdapterPipeline, Adapter

model_name = "runwayml/stable-diffusion-v1-5"
pipe = StableDiffusionAdapterPipeline.from_pretrained(model_name, torch_dtype=torch.float32).to('cuda')

# rebuild the adapter with the architecture it was trained with,
# then load the downloaded checkpoint into it
adapter_ckpt = "./models/t2iadapter_seg_sd14v1.pth"
pipe.adapter = Adapter(cin=3 * 64,                      # 3 RGB channels x 8x8 pixel-unshuffle
                       channels=[320, 640, 1280, 1280], # matches the SD UNet encoder widths
                       nums_rb=2,
                       ksize=1,
                       sk=True,
                       use_conv=False)
pipe.adapter.load_state_dict(torch.load(adapter_ckpt))
pipe.adapter = pipe.adapter.to('cuda')
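Since the base model stays frozen, the same adapter checkpoint can be reused with other SD-1.x bases. A minimal sketch (the checkpoint name below is hypothetical; any SD-1.x model on the Hub should work):

# hypothetical SD-1.x fine-tune; replace with any real checkpoint name
model_name = "some-user/my-sd15-finetune"
pipe = StableDiffusionAdapterPipeline.from_pretrained(model_name, torch_dtype=torch.float32).to('cuda')
# then attach and load pipe.adapter exactly as shown above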

Inference

@torch.no_grad()
def get_color_masks(image: torch.Tensor) -> Dict[Tuple[int, ...], torch.Tensor]:
    """Split an (H, W, 3) segmentation image into one binary mask per color."""
    h, w, c = image.shape
    assert c == 3

    img_2d = image.view((-1, 3))
    colors, freqs = torch.unique(img_2d, return_counts=True, dim=0)
    colors = colors[freqs >= h]  # drop colors that cover fewer than h pixels
    color2mask = {}
    for color in colors:
        # a pixel belongs to the mask only if all three channels match
        mask = (image == color).all(dim=-1).float()
        color = color.cpu().numpy().tolist()
        color2mask[tuple(color)] = mask
    return color2mask

# load the segmentation condition image
mask = load_image("./diffusers-t2i-adapter/motor.png")

prompt = ["A black Honda motorcycle parked in front of a garage"]

# the segmentation adapter takes the condition image twice
image = pipe(prompt, [mask, mask]).images[0]
image.save('test.jpg')

You should get results like those below: input segmentation image (left), text-guided generation (right).
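get_color_masks is a helper for splitting a flat segmentation map into per-color binary masks; the pipeline call above does not use it, but you can run it like this to inspect which classes your segmentation image contains (the numpy conversion below is an assumption about the PIL image returned by load_image):

import numpy as np

seg = torch.from_numpy(np.array(mask))  # (H, W, 3) uint8 tensor
for color, m in get_color_masks(seg).items():
    print(color, int(m.sum().item()), "pixels")  # one binary mask per color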

If you want to use pose as input,

mask = load_image("./diffusers-t2i-adapter/pose.png")

prompt = ["A girl"]

# note the difference: the pose adapter takes the condition image once,
# not twice as in the segmentation example above
image = pipe(prompt, [mask]).images[0]
image.save('result.jpg')

Please note that you must use the correct pose format for the generated results to be satisfactory. The pre-trained T2I-Adapter (pose) expects COCO-format poses; MMPose is recommended for extracting them (see the sketch below). The following examples show the difference: OpenPose format (top), COCO format (bottom).
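As a starting point for extracting COCO-format poses, here is a minimal sketch using MMPose's high-level inferencer; this assumes mmpose >= 1.0, where the 'human' alias selects a COCO-keypoint model and vis_out_dir writes a rendered pose image (check the API of your installed version):

from mmpose.apis import MMPoseInferencer

# 'human' selects a default COCO-keypoint pose model (mmpose >= 1.0)
inferencer = MMPoseInferencer('human')

# run on an input photo and save a pose visualization to ./pose_vis/
results = inferencer('person.jpg', vis_out_dir='./pose_vis')
result = next(results)  # keypoint predictions in COCO order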

T2I-Adapter + Stable-Diffusion-1.5 + Inpainting

Coming soon!

Acknowledgement

The diffusers pipeline is provided by HimariO; this repo is largely built on top of it (with several typos fixed) and serves as a handy tutorial.

Contact

The repo is still under active development. If you run into any problems while using it, feel free to open an issue.
