• Stars: 4,404
• Rank: 9,717 (Top 0.2%)
• Language: Python
• License: Apache License 2.0
• Created: about 2 years ago
• Updated: 5 months ago


Repository Details

Create πŸ”₯ videos with Stable Diffusion by exploring the latent space and morphing between text prompts

stable-diffusion-videos

Try it yourself in Colab: Open In Colab

TPU version (~6x faster than standard Colab GPUs): Open In Colab

Example: morphing between "blueberry spaghetti" and "strawberry spaghetti" (video: berry_good_spaghetti.2.mp4)

Installation

pip install stable_diffusion_videos

Usage

Check out the examples folder for example scripts πŸ‘€

Making Videos

Note: On Apple M1 (and other Apple Silicon), use torch.float32 instead, as torch.float16 is not available on the MPS backend (see the sketch after the example below).

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=3,  # Number of frames between each pair of prompts
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    name='animals_test',        # Subdirectory of output_dir where images/videos will be saved
    guidance_scale=8.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is a good default
)
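
Per the note above, the same example can run on Apple Silicon. A minimal sketch, assuming a PyTorch build with MPS support (only the dtype and device change):

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float32,  # float16 is not available on MPS
).to("mps")                     # assumes PyTorch was built with MPS enabled

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=3,
)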

Making Music Videos

New! Music can be added to the video by providing a path to an audio file. The audio will inform the rate of interpolation so the videos move to the beat 🎢

from stable_diffusion_videos import StableDiffusionWalkPipeline
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

# Seconds in the song.
audio_offsets = [146, 148]  # [Start, end]
fps = 30  # Use lower values for testing (5 or 10), higher values for better quality (30 or 60)

# Convert seconds to frames: each (start, end) pair becomes one segment of
# (end - start) * fps interpolation steps. Here: (148 - 146) * 30 = 60 frames.
num_interpolation_steps = [(b-a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]

video_path = pipeline.walk(
    prompts=['a cat', 'a dog'],
    seeds=[42, 1337],
    num_interpolation_steps=num_interpolation_steps,
    audio_filepath='audio.mp3',
    audio_start_sec=audio_offsets[0],
    fps=fps,
    height=512,  # use multiples of 64 if > 512. Multiples of 8 if < 512.
    width=512,   # use multiples of 64 if > 512. Multiples of 8 if < 512.
    output_dir='dreams',        # Where images/videos will be saved
    guidance_scale=7.5,         # Higher adheres to prompt more, lower lets model take the wheel
    num_inference_steps=50,     # Number of diffusion steps per image generated. 50 is a good default
)
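
The zip pattern above generalizes to longer prompt lists: N prompts and N audio offsets yield N-1 segments, each timed to its slice of the song. A sketch with hypothetical prompts and offsets, reusing the pipeline from above:

# Three prompts -> two segments, each synced to its slice of the song
audio_offsets = [7, 9, 11]  # hypothetical timestamps (seconds) in the song
fps = 30

# Segment lengths in frames: (9-7)*30 = 60 and (11-9)*30 = 60
num_interpolation_steps = [(b - a) * fps for a, b in zip(audio_offsets, audio_offsets[1:])]

video_path = pipeline.walk(
    prompts=['a cat', 'a dog', 'a bird'],
    seeds=[42, 1337, 2022],
    num_interpolation_steps=num_interpolation_steps,
    audio_filepath='audio.mp3',
    audio_start_sec=audio_offsets[0],
    fps=fps,
)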

Using the UI

from stable_diffusion_videos import StableDiffusionWalkPipeline, Interface
import torch

pipeline = StableDiffusionWalkPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

interface = Interface(pipeline)
interface.launch()
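
The Interface appears to wrap a Gradio app, so Gradio's standard launch options should apply. For example, to request a temporary public link (an assumption based on Gradio's usual API):

# Assumes launch() forwards kwargs to the underlying Gradio app
interface.launch(share=True)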

Credits

This work builds on a script shared by @karpathy. That script was modified into this gist, which was then updated and expanded into this repo.

Contributing

You can file issues and feature requests here.

Enjoy πŸ€—

Extras

Upsample with Real-ESRGAN

You can also 4x upsample your images with Real-ESRGAN!

It's included when you pip install the latest version of stable-diffusion-videos!

You'll be able to use upsample=True in the walk function, like this:

pipeline.walk(['a cat', 'a dog'], [234, 345], upsample=True)

The above may cause you to run out of VRAM. No problem, you can do upsampling separately.

To upsample an individual image:

from stable_diffusion_videos import RealESRGANModel

model = RealESRGANModel.from_pretrained('nateraw/real-esrgan')
enhanced_image = model('your_file.jpg')
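
Assuming the returned enhanced_image is a PIL image (the variable name suggests it is), you can save it like any other:

# Hypothetical output path; adjust as needed
enhanced_image.save('your_file_upsampled.jpg')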

Or, to do a whole folder:

from stable_diffusion_videos import RealESRGANModel

model = RealESRGANModel.from_pretrained('nateraw/real-esrgan')
model.upsample_imagefolder('path/to/images/', 'path/to/output_dir')

More Repositories

1. huggingpics - 🤗🖼️ HuggingPics: Fine-tune Vision Transformers for anything using images found on the web. (Jupyter Notebook, 275 stars)
2. Lda2vec-Tensorflow - Tensorflow 1.5 implementation of Chris Moody's Lda2vec, adapted from @meereeum (Python, 107 stars)
3. download-musiccaps-dataset - Download the MusicCaps dataset for music captioning (Jupyter Notebook, 96 stars)
4. singing-songstarter - Sing an idea ➡️ AI music sample 🔥🎶 (Python, 86 stars)
5. replicate-examples (Python, 74 stars)
6. huggingface-sync-action - GitHub action that'll sync files from a GitHub repo with the Hugging Face Hub 🤗 (Python, 64 stars)
7. openai-vision-api-for-videos - Extract information, summarize, ask questions, and search videos using OpenAI's Vision API 🚀🎦 (Jupyter Notebook, 61 stars)
8. huggingface-datasets-converter - Scripts to convert datasets from various sources to Hugging Face Datasets. (Python, 57 stars)
9. animegan-v2-for-videos - Apply AnimeGAN-v2 across frames of a video clip (Jupyter Notebook, 42 stars)
10. hf-hub-lightning - A PyTorch Lightning Callback for pushing models to the Hugging Face Hub 🤗⚡️ (Python, 35 stars)
11. spaces-docker-templates - 🚀🤗 A collection of templates for Hugging Face Spaces (Dockerfile, 35 stars)
12. huggingface-hub-examples - Examples using the 🤗 Hub to share and reload machine learning models (Jupyter Notebook, 33 stars)
13. roast-or-toast-bot - A fun (yet toxic) Twitter bot that uses GPT-3 to either roast 😈 or toast 🥂 a tweet if you mention it in the replies (Python, 30 stars)
14. voice-cloning - Make Kanye sing any song ya want 🎤🔥 (Jupyter Notebook, 23 stars)
15. huggingface-vit-finetune - Finetune Google's pre-trained ViT models from Hugging Face's model hub. (Python, 18 stars)
16. modelcards - 📝 Utility to create, edit, and publish model cards on the Hugging Face Hub. [Now lives in huggingface_hub] (Jupyter Notebook, 15 stars)
17. Tensorflow-for-NLP - Files from my Tensorflow for NLP playlist on YouTube (Python, 15 stars)
18. encoded-video - Utilities for working with videos (Python, 13 stars)
19. lambdacloud - An unofficial Python client library for Lambda Lab's Cloud Computing Platform (Python, 13 stars)
20. hf-text-classification (Python, 12 stars)
21. modal-examples - Apps that run on modal.com (Python, 12 stars)
22. spaces-template - A 🔥 cookiecutter template for building Hugging Face Spaces (Shell, 11 stars)
23. azureml-examples - AzureML is fun! 🍻 (Python, 8 stars)
24. aiart-blog (Jupyter Notebook, 7 stars)
25. pytorch-lightning-azureml - Narrow the gap between research and production 😎 (Python, 6 stars)
26. host-a-blog-on-huggingface-spaces - How to host a blog on 🤗 (Python, 6 stars)
27. my-huggingface-repos - A command center for multiple Hugging Face repos. Files are synced with the Hub. (Python, 6 stars)
28. tabular-anomaly-detection (Python, 5 stars)
29. azureml-pipelines - Example pipelines using AzureML SDK v1. 👷‍♀️ WIP (Python, 5 stars)
30. discord-image-captioning-bot - A Discord bot for captioning images (Python, 5 stars)
31. lightning-vision-transformer - 🖼 + 🤖 = 🧠 (Python, 5 stars)
32. background-remover - 🖼️ A Gradio app to remove the background from an image (Python, 5 stars)
33. huggingpics-explorer - A Streamlit app for exploring image search results from HuggingPics (Python, 4 stars)
34. quickdraw-pytorch - Train a simple CNN on the "Quick, Draw!" dataset using Google Colab (Jupyter Notebook, 4 stars)
35. pytorchvideo-classification - A first look at PyTorch for video classification (Python, 4 stars)
36. huggingface-detr-finetune (Python, 3 stars)
37. lightning-pretrain-hf (Python, 3 stars)
38. huggingface-image-datasets - Learn how to share image datasets on Hugging Face's Hub. (Python, 3 stars)
39. spotify-pedalboard-demo - 🚧 WIP Streamlit demo of Spotify's Pedalboard 🚧 (Python, 3 stars)
40. lita-colab - Colab notebook for Nvidia's LITA: Language Instructed Temporal-Localization Assistant (Jupyter Notebook, 3 stars)
41. naterawdotcom - My personal website/blog, made with Quarto (Jupyter Notebook, 3 stars)
42. test-spaces-app - A dummy Hugging Face Spaces app for testing (Python, 2 stars)
43. helpful-snippets - An interactive app with some snippets I've found helpful (Python, 2 stars)
44. pytorchvideo-accelerate - Distributed training of video action recognition models with pytorchvideo and Hugging Face accelerate (Python, 2 stars)
45. azure-web-app-test (2 stars)
46. lightning-cats-and-dogs (Python, 2 stars)
47. spaces-lfs-workflow - Workflow that syncs code from GitHub and stores LFS files on HF (Python, 2 stars)
48. map-vs-generator-issue - Dump of some files (Python, 2 stars)
49. image-generation (Python, 2 stars)
50. auto-anything - Playing with ideas to include/reference code on Hugging Face's Hub. Experimental! (Python, 2 stars)
51. BeautifulSauce - BeautifulSoup's saucy sibling! (Jupyter Notebook, 2 stars)
52. applied-ml-examples - Temporary repo for some applied ML examples (Jupyter Notebook, 2 stars)
53. test-colab-pr-action (Jupyter Notebook, 2 stars)
54. fastpages-blog - Trying out fastpages (Jupyter Notebook, 2 stars)
55. colab-pr-action (Python, 1 star)
56. test_doc_builder - Playground for figuring out the Hugging Face doc builder and related GitHub actions (1 star)
57. github-action-playground - Dummy repo to play with GitHub actions. Ignore me :) (1 star)
58. gradio-guides (Python, 1 star)
59. test-space-lfs (Python, 1 star)
60. azure-devops-flask - A simple template for deploying Flask apps with CI/CD on Azure DevOps (Python, 1 star)
61. speech-to-code - When mom says we have an OpenAI Codex at home (Jupyter Notebook, 1 star)
62. cats_vs_dogs (Python, 1 star)
63. pytorch-lightning-examples - Place for my personal PyTorch Lightning examples/notebooks (1 star)
64. Resume - An overly complicated way to write your resume (HTML, 1 star)
65. vision-datasets-viewer (Python, 1 star)
66. nateraw (1 star)
67. vsc2022-dataset-visualizer - Simple Streamlit app to explore DrivenData's vsc2022 competition dataset (Python, 1 star)
68. vision (Jupyter Notebook, 1 star)