  • Stars: 4,936
  • Rank: 8,118 (Top 0.2%)
  • Language: Python
  • License: Other
  • Created: almost 7 years ago
  • Updated: almost 2 years ago

Repository Details

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

textgenrnn

Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly train on a text using a pretrained model.

textgenrnn is a Python 3 module on top of Keras/TensorFlow for creating char-rnns, with many cool features:

  • A modern neural network architecture which utilizes new techniques such as attention-weighting and skip-embedding to accelerate training and improve model quality.
  • Train on and generate text at either the character-level or word-level.
  • Configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs.
  • Train on any generic input text file, including large files.
  • Train models on a GPU and then use them to generate text with a CPU.
  • Utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to typical LSTM implementations.
  • Train the model using contextual labels, allowing it to learn faster and produce better results in some cases.

You can play with textgenrnn and train any text file with a GPU for free in this Colaboratory Notebook! Read this blog post or watch this video for more information!

Examples

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.generate()
[Spoiler] Anyone else find this post and their person that was a little more than I really like the Star Wars in the fire or health and posting a personal house of the 2016 Letter for the game in a report of my backyard.

The included model can easily be trained on new texts, and can generate appropriate text even after a single pass of the input data.

textgen.train_from_file('hacker_news_2000.txt', num_epochs=1)
textgen.generate()
Project State Project Firefox

The model weights are relatively small (2 MB on disk), and they can easily be saved and loaded into a new textgenrnn instance. As a result, you can play with models which have been trained on hundreds of passes through the data. (In fact, textgenrnn learns so well that you have to increase the temperature significantly for creative output!)

textgen_2 = textgenrnn('/weights/hacker_news.hdf5')
textgen_2.generate(3, temperature=1.0)
Why we got money “regular alter”

Urburg to Firefox acquires Nelf Multi Shamn

Kubernetes by Google’s Bern

You can also train a new model, with support for word-level embeddings and bidirectional RNN layers, by adding new_model=True to any train function; a minimal sketch is shown below.
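For example, here is a minimal sketch of training a fresh word-level, bidirectional model from a text file. The parameter names (rnn_size, rnn_layers, rnn_bidirectional, word_level, max_length, etc.) mirror the configuration options used in the demo and Colaboratory notebooks; treat the values as illustrative rather than recommended settings.

from textgenrnn import textgenrnn

textgen = textgenrnn(name='my_word_model')  # 'name' prefixes the saved weights/vocab files

# new_model=True builds a network from scratch instead of fine-tuning the pretrained one
textgen.train_from_file(
    'hacker_news_2000.txt',   # example dataset; substitute your own text file
    new_model=True,
    word_level=True,          # train on words rather than characters
    rnn_bidirectional=True,   # use bidirectional RNN layers
    rnn_size=128,             # cells per RNN layer
    rnn_layers=2,             # number of RNN layers
    max_length=10,            # number of preceding tokens used to predict the next one
    num_epochs=10)

textgen.generate(5, temperature=0.5)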

Interactive Mode

It's also possible to get involved in how the output unfolds, step by step. Interactive mode will suggest the top N options for the next char/word and allow you to pick one.

When running textgenrnn in the terminal, pass interactive=True and top_n=N to generate(). N defaults to 3.

from textgenrnn import textgenrnn

textgen = textgenrnn()
textgen.generate(interactive=True, top_n=5)

This can add a human touch to the output; it feels like you're the writer! (reference)

Usage

textgenrnn can be installed from PyPI via pip:

pip3 install textgenrnn

For the latest textgenrnn, you must have a minimum TensorFlow version of 2.1.0.

You can view a demo of common features and model configuration options in this Jupyter Notebook.

/datasets contains example datasets using Hacker News/Reddit data for training textgenrnn.

/weights contains further-pretrained models on the aforementioned datasets which can be loaded into textgenrnn.

/outputs contains examples of text generated from the above pretrained models.

Neural Network Architecture and Implementation

textgenrnn is based on the char-rnn project by Andrej Karpathy, with a few modern optimizations such as the ability to work with very small text sequences.

default model

The included pretrained model follows a neural network architecture inspired by DeepMoji. For the default model, textgenrnn takes in an input of up to 40 characters, converts each character to a 100-D character embedding vector, and feeds those into a 128-cell long short-term memory (LSTM) recurrent layer. Those outputs are then fed into another 128-cell LSTM. All three layers are then fed into an Attention layer to weight the most important temporal features and average them together (and since the embeddings + 1st LSTM are skip-connected into the attention layer, the model updates can backpropagate to them more easily and prevent vanishing gradients). That output is mapped to probabilities over up to 394 possible next characters, including uppercase characters, lowercase characters, punctuation, and emoji. (If training a new model on a new dataset, all of the numeric parameters above can be configured.)
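For readers who prefer code to prose, here is a simplified Keras functional-API sketch of that topology. It is illustrative only: the actual package uses a custom AttentionWeightedAverage layer, whereas the attention below is a hand-rolled stand-in, and the layer arrangement is an assumption based on the description above.

from tensorflow.keras import layers, Model

MAX_LENGTH = 40    # input window of up to 40 characters
VOCAB_SIZE = 394   # number of possible characters in the output distribution
EMBED_DIM = 100    # character embedding dimensionality
RNN_SIZE = 128     # LSTM cells per recurrent layer

inp = layers.Input(shape=(MAX_LENGTH,))
emb = layers.Embedding(VOCAB_SIZE, EMBED_DIM)(inp)          # 100-D character embeddings
rnn1 = layers.LSTM(RNN_SIZE, return_sequences=True)(emb)    # first 128-cell LSTM
rnn2 = layers.LSTM(RNN_SIZE, return_sequences=True)(rnn1)   # second 128-cell LSTM

# Skip-connect the embeddings and both LSTM outputs into the attention layer,
# so gradients can flow directly to the earlier layers.
seq = layers.concatenate([emb, rnn1, rnn2])

# Attention-weighted average over the time dimension (a stand-in for the
# package's AttentionWeightedAverage layer).
scores = layers.Dense(1, use_bias=False)(seq)     # (batch, time, 1) attention logits
weights = layers.Softmax(axis=1)(scores)          # normalize across time steps
context = layers.Dot(axes=1)([weights, seq])      # weighted sum -> (batch, 1, features)
context = layers.Flatten()(context)

out = layers.Dense(VOCAB_SIZE, activation='softmax')(context)   # next-character probabilities
model = Model(inp, out)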

context model

Alternatively, if context labels are provided with each text document, the model can be trained in a contextual mode, where the model learns the text given the context, so the recurrent layers learn decontextualized language. The text-only path can piggyback off the decontextualized layers; in all, this results in much faster training and better quantitative and qualitative model performance than training the model given the text alone.
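A rough sketch of how contextual training is invoked (train_on_texts accepts a context_labels list aligned one-to-one with the texts; the texts and labels here are made up for illustration):

from textgenrnn import textgenrnn

# One context label per document, e.g. the subreddit each title came from.
texts = ['Never gonna give you up, never gonna let you down',
         'What is the airspeed velocity of an unladen swallow?',
         'S&P 500 closes at a record high after Fed statement']
context_labels = ['music', 'AskReddit', 'investing']

textgen = textgenrnn()
# context_labels must be aligned one-to-one with texts
textgen.train_on_texts(texts, context_labels=context_labels, num_epochs=5)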

The model weights included with the package are trained on hundreds of thousands of text documents from Reddit submissions (via BigQuery), from a very diverse variety of subreddits. The network was also trained using the decontextualized approach noted above in order to both improve training performance and mitigate authorial bias.

When fine-tuning the model on a new dataset of texts using textgenrnn, all layers are retrained. However, since the original pretrained network has a much more robust "knowledge" initially, the new textgenrnn trains faster and more accurately in the end, and can potentially learn new relationships not present in the original dataset (e.g. the pretrained character embeddings include the context for the character for all possible types of modern internet grammar).

Additionally, the retraining is done with a momentum-based optimizer and a linearly decaying learning rate, both of which prevent exploding gradients and make it much less likely that the model diverges after training for a long time.
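As an illustration only (these are not textgenrnn's actual optimizer settings), a momentum-based optimizer paired with a linearly decaying learning rate can be expressed in Keras like this:

import tensorflow as tf

# PolynomialDecay with power=1.0 is a linear decay from the initial rate to zero.
lr_schedule = tf.keras.optimizers.schedules.PolynomialDecay(
    initial_learning_rate=4e-3,   # placeholder value
    decay_steps=10_000,           # placeholder value
    end_learning_rate=0.0,
    power=1.0)

optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule,
                                    momentum=0.9, nesterov=True)
# model.compile(loss='categorical_crossentropy', optimizer=optimizer)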

Notes

  • You will not get quality generated text 100% of the time, even with a heavily-trained neural network. That's the primary reason viral blog posts/Twitter tweets utilizing NN text generation often generate lots of texts and curate/edit the best ones afterward.

  • Results will vary greatly between datasets. Because the pretrained neural network is relatively small, it cannot store as much data as RNNs typically flaunted in blog posts. For best results, use a dataset with at least 2,000-5,000 documents. If a dataset is smaller, you'll need to train it for longer by setting num_epochs higher when calling a training method and/or training a new model from scratch. Even then, there is currently no good heuristic for determining a "good" model.

  • A GPU is not required to retrain textgenrnn, but it will take much longer to train on a CPU. If you do use a GPU, I recommend increasing the batch_size parameter for better hardware utilization (see the example after this list).
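For example (the value is illustrative; the right batch size depends on your GPU memory):

from textgenrnn import textgenrnn

textgen = textgenrnn()
# Larger batches keep a GPU better utilized; tune batch_size to your hardware.
textgen.train_from_file('hacker_news_2000.txt', num_epochs=5, batch_size=1024)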

Future Plans for textgenrnn

  • More formal documentation

  • A web-based implementation using tensorflow.js (works especially well due to the network's small size)

  • A way to visualize the attention-layer outputs to see how the network "learns."

  • A mode to allow the model architecture to be used for chatbot conversations (may be released as a separate project)

  • More depth toward context (positional context + allowing multiple context labels)

  • A larger pretrained network which can accommodate longer character sequences and a more in-depth understanding of language, creating better generated sentences.

  • Hierarchical softmax activation for word-level models (once Keras has good support for it).

  • FP16 for superfast training on Volta/TPUs (once Keras has good support for it).

Articles/Projects using textgenrnn

Articles

Projects

Tweets

Maintainer/Creator

Max Woolf (@minimaxir)

Max's open-source projects are supported by his Patreon. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.

Credits

Andrej Karpathy for the original proposal of the char-rnn via the blog post The Unreasonable Effectiveness of Recurrent Neural Networks.

Daniel Grijalva for contributing an interactive mode.

License

MIT

Attention-layer code used from DeepMoji (MIT Licensed)

More Repositories

1. big-list-of-naughty-strings (Python, 45,765 stars): The Big List of Naughty Strings is a list of strings which have a high probability of causing issues when used as user-input data.
2. hacker-news-undocumented (3,541 stars): Some of the hidden norms about Hacker News not otherwise covered in the Guidelines and the FAQ.
3. gpt-2-simple (Python, 3,367 stars): Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts.
4. simpleaichat (Python, 3,329 stars): Python package for easily interfacing with chat apps, with robust features and minimal code complexity.
5. facebook-page-post-scraper (Python, 2,100 stars): Data scraper for Facebook Pages, and also code accompanying the blog post How to Scrape Data From Facebook Page Posts for Statistical Analysis.
6. person-blocker (Python, 2,024 stars): Automatically "block" people in images (like Black Mirror) using a pretrained neural network.
7. automl-gs (Python, 1,836 stars): Provide an input CSV and a target field to predict, generate a model + code to run it.
8. aitextgen (Python, 1,827 stars): A robust Python tool for text-based AI training and generation using GPT-2.
9. stylecloud (Python, 817 stars): Python package + CLI to generate stylistic wordclouds, including gradients and icon shapes!
10. gpt-3-experiments (Python, 710 stars): Test prompts for OpenAI's GPT-3 API and the resulting AI-generated texts.
11. video-to-gif-osx (Shell, 396 stars): A set of utilities that allow the user to easily convert video files to very-high-quality GIFs on OS X.
12. copy-syntax-highlight-osx (380 stars): Copy Syntax Highlight for OS X is an OS X service which copies the selected text to the clipboard, with proper syntax highlighting for the given language.
13. gpt-2-cloud-run (HTML, 313 stars): Text-generation API via GPT-2 for Cloud Run.
14. reactionrnn (Python, 298 stars): Python module + R package to predict the reactions to a given text using a pretrained recurrent neural network.
15. download-tweets-ai-text-gen (Python, 219 stars): Python script to download public Tweets from a given Twitter account into a format suitable for AI text generation.
16. tweet-generator (Python, 218 stars): Train a neural network optimized for generating tweets based off of any number of Twitter users.
17. char-embeddings (Python, 214 stars): A repository containing 300D character embeddings derived from the GloVe 840B/300D dataset, which uses these embeddings to train a deep learning model to generate Magic: The Gathering cards using Keras.
18. magic-the-gifening (Python, 214 stars): A Twitter bot which tweets Magic: the Gathering cards with appropriate GIFs superimposed onto them.
19. system-dashboard (HTML, 199 stars): Minimalist Win/OSX/Linux System Dashboard using Flask and Freeboard.
20. imgmaker (Python, 168 stars): Create high-quality images programmatically with easily-hackable templates.
21. ctrl-gce (Shell, 154 stars): Set up the CTRL text-generating model on Google Compute Engine with just a few console commands.
22. ai-generated-pokemon-rudalle (Python, 140 stars): Python script to preprocess images of all Pokémon to finetune ruDALL-E.
23. imgbeddings (Python, 122 stars): Python package to generate image embeddings with CLIP without PyTorch/TensorFlow.
24. mtg-gpt-2-cloud-run (HTML, 119 stars): Code and UI for running a Magic card text generator API via GPT-2.
25. get-all-hacker-news-submissions-comments (Python, 116 stars): Simple Python scripts to download all Hacker News submissions and comments and store them in a PostgreSQL database.
26. hacker-news-gpt-2 (114 stars): Dump of generated texts from GPT-2 trained on Hacker News titles.
27. reddit-bigquery (R, 110 stars): Code + Jupyter notebook for analyzing and visualizing Reddit data quickly and easily.
28. facebook-ad-library-scraper (Python, 108 stars): A Python scraper for the Facebook Ad Library, using the official Facebook Ad Library API.
29. optillusion-animation (Python, 99 stars): Python code to submit rotated images to the Cloud Vision API + R code for visualizing it.
30. chatgpt_api_test (Jupyter Notebook, 95 stars): Demos utilizing the ChatGPT API.
31. gpt-3-client (Python, 90 stars): A client for OpenAI's GPT-3 API for ad hoc testing of prompts without using the web interface.
32. stable-diffusion-negative-prompt (Jupyter Notebook, 86 stars): Jupyter Notebooks for experimenting with negative prompting with Stable Diffusion 2.0.
33. stylistic-word-clouds (Python, 85 stars): Python scripts for creating stylistic word clouds.
34. gpt3-blog-title-optimizer (Jupyter Notebook, 84 stars): Python code for building a GPT-3 based technical blog post optimizer.
35. amazon-spark (HTML, 83 stars): R Code + R Notebook for analyzing millions of Amazon reviews using Apache Spark.
36. twcloud (Python, 75 stars): Python package + CLI to generate wordclouds of Twitter tweets.
37. twitter-cloud-run (Python, 75 stars): A (relatively) minimal configuration app to run Twitter bots on a schedule that can scale to unlimited bots.
38. get-profile-data-of-repo-stargazers (Python, 68 stars): This repository contains a script used to get the GitHub profile information of all the people who've starred a given GitHub repository.
39. deep-learning-cpu-gpu-benchmark (HTML, 67 stars): Repository to benchmark the performance of Cloud CPUs vs. Cloud GPUs on TensorFlow and Google Compute Engine.
40. gpt-j-6b-experiments (55 stars): Test prompts for GPT-J-6B and the resulting AI-generated texts.
41. icon-image (Python, 53 stars): Python script to quickly generate a Font Awesome icon imposed on a background for steering AI image generation.
42. hacker-news-download-all-stories (Python, 52 stars): Download *ALL* the submissions from Hacker News.
43. ml-data-generator (Python, 51 stars): Python script to generate fake datasets optimized for testing machine learning/deep learning workflows.
44. clickbait-cluster (HTML, 47 stars): Code + Jupyter Notebooks for Visualizing Clusters of Clickbait Headlines Using Spark, Word2vec, and Plotly.
45. keras-cntk-docker (Python, 42 stars): Docker container for Keras + CNTK intended for nvidia-docker.
46. foursquare-venue-scraper (Python, 39 stars): A Foursquare data scraper that gathers all venues within a specified geographic area.
47. interactive-facebook-reactions (HTML, 38 stars): Jupyter notebook + Code for processing Facebook Reactions data and making interactive charts.
48. youtube-video-scraper (Python, 35 stars): Tools for scraping YouTube video metadata (mostly for training AI on video titles).
49. nyc-taxi-notebook (R, 31 stars): R Code + Jupyter notebook for analyzing and visualizing NYC Taxi data.
50. sdxl-experiments (Jupyter Notebook, 30 stars): Jupyter Notebooks for experimenting with Stable Diffusion XL 1.0.
51. yelp-review-analysis (R, 29 stars): Repository containing scripts on how I processed and charted Yelp data.
52. subreddit-generator (Python, 28 stars): Train a neural network optimized for generating Reddit subreddit posts.
53. predict-reddit-submission-success (HTML, 28 stars): Repository w/ Jupyter + R Notebooks for creating a model to predict the success of Reddit submissions with Keras.
54. tritonize (Python, 28 stars): Convert images to a styled, minimal representation, quickly with NumPy.
55. langchain-problems (Jupyter Notebook, 27 stars): Demos of some issues with LangChain.
56. keras-cntk-benchmark (Python, 26 stars): Code for benchmarking CNTK performance on Keras vs. TensorFlow.
57. autotweet-from-googlesheet (Python, 26 stars): A minimal proof-of-concept Python script to tweet human-curated Tweets on a schedule.
58. frames-to-gif-osx (26 stars): An application that allows the user to easily convert frames to very-high-quality GIFs on OS X.
59. minimaxir.github.io (HTML, 25 stars): Blog posts and theme for https://minimaxir.com.
60. legaladvice-gpt2 (24 stars): Dump of generated texts from GPT-2 trained on /r/legaladvice subreddit titles.
61. ggplot-tutorial (R, 23 stars): Repository for a ggplot2 tutorial.
62. sf-arrests-when-where (R, 21 stars): R Code + Jupyter notebook for replicating analysis of when and where arrests in San Francisco occur.
63. pokemon-3d (HTML, 20 stars): Code + visualizations for processing and visualizing Pokémon data in 3D.
64. reddit-gpt-2-cloud-run (HTML, 20 stars): Reddit title generator API based on GPT-2.
65. facebook-keyword-regression-analysis (R, 20 stars): Regression analysis for Facebook keywords.
66. chatgpt-structured-data (Jupyter Notebook, 20 stars): Demos of ChatGPT's function calling/structured data support.
67. stylecloud-examples (Python, 19 stars): Examples of stylistic word clouds generated via the stylecloud Python package.
68. stack-overflow-survey (Jupyter Notebook, 19 stars): Code + visualizations for processing 2016 Stack Overflow Survey data.
69. get-bars-from-foursquare (Python, 19 stars): A quick pair of Python scripts to retrieve all bars within a given area, then retrieve metadata and process it.
70. subreddit-related (Jupyter Notebook, 18 stars): Code and visualizations for related/similar subreddits.
71. get-heart-rate-csv (Python, 18 stars): A small Python script to get the heart rate data generated from an Apple Watch in CSV form.
72. ai-generated-magic-cards (Python, 17 stars): Tools for encoding Magic: The Gathering cards into a form suitable for AI text generation.
73. tensorflow-multiprocess-ray (Python, 17 stars): Proof of concept on how to use TensorFlow for prediction tasks in a multiprocess setting.
74. pokemon-ai (Python, 17 stars): A text-generating AI to generate Pokémon names.
75. reddit-comment-length (R, 17 stars): R code needed to reproduce the Relationship between Reddit Comment Score and Comment Length for 1.66 Billion Comments visualization.
76. mtg-card-creator-api (Python, 16 stars): Code for running a Magic card image generator API.
77. reddit-graph (Jupyter Notebook, 16 stars): Jupyter notebook + Code for reproducing Reddit subreddit graphs.
78. ncaa-basketball (R, 16 stars): R Code + R Notebook on how to process and visualize NCAA basketball data.
79. automl-gs-examples (Python, 15 stars): Examples + visualizations of datasets modeled using automl-gs.
80. chatgpt-tips-analysis (Jupyter Notebook, 15 stars): Jupyter Notebooks for testing the impact of tip incentives for ChatGPT.
81. sfba-compensation (HTML, 14 stars): Jupyter notebook + Code for scraping AngelList data and making an interactive chart of SFBA salaries/equity.
82. hacker-news-comment-analysis (R, 13 stars): Code used for analysis of Hacker News comments.
83. char-tsne-visualization (HTML, 13 stars): Visualizations of character embeddings from derived character vectors.
84. resetera-gpt-2 (Python, 13 stars): Scraper of ResetEra threads and posts to get them into a format suitable for feeding them into GPT-2.
85. get-data-from-photos-from-instagram-tags (Python, 12 stars): Processes data from images which are tagged with the specified Instagram tag.
86. imdb-data-analysis (12 stars): R Code + R Notebook on how to process and visualize the official IMDb datasets.
87. hn-heatmaps (R, 12 stars): Code and data necessary to reproduce heatmaps relating HN submission time to submission score.
88. sf-crimes-covid (12 stars): Spot-checking the impact of SF shelter-in-place on crime reporting.
89. gpt-2-fanfiction (11 stars): Experiments with generating GPT-2 fanfiction on specified topics.
90. notebooks (HTML, 11 stars): This GitHub repository stores my R Notebooks, allowing GitHub Pages to serve the R Notebooks on my website.
91. imgur-decline (HTML, 11 stars): R Code + R Notebook for analyzing the decline of Imgur on Reddit.
92. all-marvel-comics-characters (Python, 10 stars): Creates a .csv of all Marvel Comics characters + statistics via the Marvel API.
93. movie-gender (Jupyter Notebook, 10 stars): Data and code for analyzing movie lead gender.
94. online-class-charts (R, 9 stars): Code needed to reproduce data analysis and charts for MIT/Harvard online course data.
95. ggplot2-web (HTML, 9 stars): R Code + R Notebook on how to make high-quality data visualizations on the web with ggplot2.
96. reddit-subreddit-keywords (R, 8 stars): Code + Jupyter notebook for analyzing and visualizing means and medians of keywords in the top Reddit subreddits.
97. reddit-mean-score (8 stars): Quick data visualization for Reddit mean submission score by subreddit.
98. sf-arrests-predict (HTML, 8 stars): R Code + R Notebook for predicting arrest types in San Francisco.
99. breach-network (HTML, 8 stars): R Code + R Notebook for creating an interactive graph network of Have I Been Pwned data using R and Plotly.
100. modeling-link-aggregators (8 stars): R Code + R Notebook on how to process and visualize both Reddit and Hacker News data.