  • Stars: 489
  • Rank: 89,990 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created almost 2 years ago
  • Updated about 1 year ago

StoryTeller

A multimodal AI story teller, built with Stable Diffusion, GPT, and neural text-to-speech (TTS).

Given a prompt as an opening line of a story, GPT writes the rest of the plot; Stable Diffusion draws an image for each sentence; a TTS model narrates each line, resulting in a fully animated video of a short story, replete with audio and visuals.
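The flow described above can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the library's implementation: `Scene`, `split_sentences`, `tell_story`, and the three `fake_*` stages are hypothetical names, the sentence splitter is deliberately naive, and the stand-in stages return placeholder strings where the real stages would call GPT, Stable Diffusion, and a TTS model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """One sentence of the story with its (placeholder) image and audio."""

    sentence: str
    image: str
    audio: str


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tell_story(
    prompt: str,
    write: Callable[[str], str],
    paint: Callable[[str], str],
    speak: Callable[[str], str],
) -> List[Scene]:
    """Write the story from the prompt, then paint and narrate each sentence."""
    story = write(prompt)
    return [Scene(s, paint(s), speak(s)) for s in split_sentences(story)]


# Hypothetical stand-in stages; they only return placeholder strings.
def fake_writer(prompt: str) -> str:
    return prompt + " It was raining. She smiled!"


def fake_painter(sentence: str) -> str:
    return f"<image for: {sentence}>"


def fake_speaker(sentence: str) -> str:
    return f"<audio for: {sentence}>"


scenes = tell_story("Once upon a time.", fake_writer, fake_painter, fake_speaker)
for scene in scenes:
    print(scene.sentence, scene.image, scene.audio, sep=" | ")
```

Passing the stages in as callables keeps the sketch independent of any particular model; the final step of stitching images and audio into a video is left out here.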

Installation

PyPI

Story Teller is available on PyPI.

$ pip install storyteller-core

Source

  1. Clone the repository.
$ git clone https://github.com/jaketae/storyteller.git
$ cd storyteller
  2. Install dependencies.
$ pip install .

Note: For Apple M1/M2 users, mecab-python3 is not available. You need to install mecab before running pip install, which you can do with Homebrew via brew install mecab. For more information, refer to SamuraiT/mecab-python3#84.

  3. (Optional) To develop locally, install the dev dependencies and the pre-commit hooks. The hooks automatically run linting and code quality checks before each commit.
$ pip install -e ".[dev]"
$ pre-commit install

Quickstart

The quickest way to run a demo is through the CLI. Simply run:

$ storyteller

The final video will be saved as /out/out.mp4, alongside other intermediate images, audio files, and subtitles.
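To see what a run produced, one option is to group the files in the output folder by extension. The sketch below assumes nothing about the names of the intermediate files; `group_outputs` is a hypothetical helper, and the relative folder name `out` is an assumption based on the path mentioned above.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def group_outputs(output_dir: str) -> Dict[str, List[str]]:
    """Map each file suffix (e.g. '.mp4') to the sorted file names that carry it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in sorted(Path(output_dir).iterdir()):
        if path.is_file():
            groups[path.suffix.lower()].append(path.name)
    return dict(groups)


if __name__ == "__main__":
    out = Path("out")  # assumed output folder; adjust if --output_dir was set
    if out.is_dir():
        for suffix, names in group_outputs(str(out)).items():
            print(suffix or "(no suffix)", names)
```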

To override the defaults with custom parameters, set the CLI flags as needed.

$ storyteller --help
usage: storyteller [-h] [--writer_prompt WRITER_PROMPT]
                   [--painter_prompt_prefix PAINTER_PROMPT_PREFIX] [--num_images NUM_IMAGES]
                   [--output_dir OUTPUT_DIR] [--seed SEED] [--max_new_tokens MAX_NEW_TOKENS]
                   [--writer WRITER] [--painter PAINTER] [--speaker SPEAKER]
                   [--writer_device WRITER_DEVICE] [--painter_device PAINTER_DEVICE]

optional arguments:
  -h, --help            show this help message and exit
  --writer_prompt WRITER_PROMPT
  --painter_prompt_prefix PAINTER_PROMPT_PREFIX
  --num_images NUM_IMAGES
  --output_dir OUTPUT_DIR
  --seed SEED
  --max_new_tokens MAX_NEW_TOKENS
  --writer WRITER
  --painter PAINTER
  --speaker SPEAKER
  --writer_device WRITER_DEVICE
  --painter_device PAINTER_DEVICE
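When driving the CLI from a script, it can help to assemble the command from a dictionary and reject misspelled flags early. The following is a minimal sketch: `build_command` is a hypothetical helper, the flag names are copied from the help output above, and the example values are illustrative only.

```python
import shlex
from typing import Dict, List, Union

# Flag names taken from the `storyteller --help` output above.
KNOWN_FLAGS = {
    "writer_prompt", "painter_prompt_prefix", "num_images", "output_dir",
    "seed", "max_new_tokens", "writer", "painter", "speaker",
    "writer_device", "painter_device",
}


def build_command(options: Dict[str, Union[str, int]]) -> List[str]:
    """Turn {"seed": 42} into ["storyteller", "--seed", "42"], rejecting unknown flags."""
    command = ["storyteller"]
    for name, value in options.items():
        if name not in KNOWN_FLAGS:
            raise ValueError(f"unknown flag: --{name}")
        command += [f"--{name}", str(value)]
    return command


# Illustrative values; none of these are documented defaults.
command = build_command(
    {"writer_prompt": "Once upon a time, a fox found a lantern.", "num_images": 5, "seed": 42}
)
print(shlex.join(command))  # shell-quoted form, handy for logging
```

The resulting list can be handed to `subprocess.run(command, check=True)`; passing a list rather than a single string avoids shell quoting problems with prompts that contain spaces or punctuation.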

Usage

For more advanced use cases, you can also directly interface with Story Teller in Python code.

  1. Load the model with defaults.
from storyteller import StoryTeller

story_teller = StoryTeller.from_default()
story_teller.generate(...)
  2. Alternatively, configure the model with custom settings.
from storyteller import StoryTeller, StoryTellerConfig

config = StoryTellerConfig(
    writer="gpt2-large",
    painter="CompVis/stable-diffusion-v1-4",
    max_new_tokens=100,
)

story_teller = StoryTeller(config)
story_teller.generate(...)

License

Released under the MIT License.
