  • Stars: 151
  • Rank: 246,057 (Top 5 %)
  • Language: Shell
  • License: MIT License
  • Created about 5 years ago
  • Updated about 5 years ago

Repository Details

Set up the CTRL text-generating model on Google Compute Engine with just a few console commands.

Install and Use CTRL on Google Compute Engine

Scripts + guides on how to set up a virtual machine on Google Compute Engine capable of running and using Salesforce's very large text-generating model CTRL to generate high-quality text based on conditional parameters.

The CTRL model is so large (12 GB on disk, 15.5 GB GPU VRAM when loaded, even more system RAM during runtime) that it will currently not fit into a free Colaboratory or Kaggle Notebook. Therefore, this setup is necessary to play with the model for now.

Machine Setup Instructions

The VM these instructions create is the minimum, lowest-cost configuration powerful enough to run CTRL without going out-of-memory (P100 GPU, 8 vCPU, 30 GB RAM, preemptible). With this configuration, having the VM up will cost $0.51/hr.
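As a rough illustration of what that hourly rate means for a session, here is a minimal sketch that multiplies the $0.51/hr figure quoted above by the number of hours the VM is up. The function name `vm_cost` is hypothetical, and the estimate ignores any other charges (for example disk storage), so treat it as a ballpark only.

```python
# Hourly rate quoted in this guide for the preemptible P100 configuration.
HOURLY_RATE_USD = 0.51


def vm_cost(hours: float, hourly_rate: float = HOURLY_RATE_USD) -> float:
    """Return the approximate cost in USD of keeping the VM up for `hours`."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(hours * hourly_rate, 2)


if __name__ == "__main__":
    # Print a small table for a few session lengths.
    for hours in (1, 8, 24):
        print(f"{hours:>2} h -> ${vm_cost(hours):.2f}")
```

Actual billing may differ from this estimate, which is one more reason to Stop the instance when you are not generating text.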

  1. Make sure the gcloud command line tool is set up on your local computer and up-to-date (can update via gcloud components update).
  2. Make sure the Google Cloud Platform project tied to your local computer's gcloud has enough quota in the us-central1 region (8 CPUs and 1 P100; these should be available by default, but request more quota if they aren't).
  3. Make sure your GCE project has billing set up.
  4. On your local computer, run this gcloud command in a terminal which creates a VM with the specs noted above:
gcloud compute instances create ctrl \
  --zone us-central1-c \
  --boot-disk-size 45GB \
  --image-family tf-latest-gpu \
  --image-project deeplearning-platform-release \
  --maintenance-policy TERMINATE \
  --machine-type n1-standard-8 \
  --metadata install-nvidia-driver=True \
  --preemptible \
  --accelerator='type=nvidia-tesla-p100,count=1'
  

You can view the created instance, Start/Stop it, and delete it, in the Google Compute Engine dashboard.

Once the instance is created (and after waiting a bit for the GPU drivers to install), SSH into it. The recommended way to do so is via the gcloud command generated from the SSH dropdown in the GCE dashboard, which will look like:

gcloud beta compute --project "<PROJECT ID>" ssh --zone "us-central1-c" "ctrl"

While SSHed into the VM you created, download and run the install_gce.sh script from this repo via:

curl -O -s https://raw.githubusercontent.com/minimaxir/ctrl-gce/master/install_gce.sh
sudo sh install_gce.sh

If you'd like to use the model with a sequence length of 512, you can pass 512 as an argument to the install script, e.g.

sudo sh install_gce.sh 512

You're done! Make sure to Stop the instance in the GCE dashboard when you are finished generating text!

Using CTRL

For basic usage, running the command below in the VM will load the model and eventually start an interactive prompt.

sudo python generation.py --model_dir seqlen256_v1.ckpt/

If you are using the 512-length model, instead you would do:

sudo python generation.py --model_dir seqlen512_v1.ckpt/ --generate_num 512
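If you launch generation from a script, the two invocations above can be assembled programmatically. The sketch below is only an illustration: the helper name `generation_command` is hypothetical, and it assumes the checkpoint directories are named exactly as in the two commands above.

```python
import shlex
from typing import List


def generation_command(seq_len: int = 256) -> List[str]:
    """Build the generation.py command for the 256- or 512-length model."""
    if seq_len not in (256, 512):
        raise ValueError("this guide only covers sequence lengths 256 and 512")
    cmd = ["sudo", "python", "generation.py",
           "--model_dir", f"seqlen{seq_len}_v1.ckpt/"]
    if seq_len == 512:
        # The 512-length example above also raises the number of generated tokens.
        cmd += ["--generate_num", "512"]
    return cmd


if __name__ == "__main__":
    # Print both commands in a copy-pasteable form.
    print(shlex.join(generation_command(256)))
    print(shlex.join(generation_command(512)))
```

The returned list could be passed to `subprocess.run` on the VM, although the interactive prompt is easiest to use from a normal terminal session.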

While generating, you can send a KeyboardInterrupt (Ctrl+C) to stop generation. It's also recommended to clear the terminal occasionally (CMD+K on macOS), as the entire generated text is printed again after each added token.

You must include a control code with each interactive prompt. You can see how the control codes are used in the original paper, or refer to the following examples:

Links

The Links control code allows you to specify a URL and/or a prompt text and have the model attempt to extrapolate the corresponding article. Examples:

Links https://www.buzzfeednews.com/article/annehelenpetersen/jeremy-renner-app-trolling-career-hurt-locker-avengers

The actor who played the villain Loki in Avengers: Infinity War has been accused of being a troll. 
 
Actor Jeremy Renner was criticized for his role as Loki, an antagonist from Marvel's upcoming Avengers movie, after he tweeted that fans should "stop trying to make me look bad." 
 
In response to Renner's tweet, many people on Twitter said they were offended by what they perceived as trolling.

Links https://www.nytimes.com/2020/08/29/us/politics/skynet-ai-online-norad.html

Site Mobile Navigation 
 
A New Online Course for the Super-Smart 
 
The University of Washington is offering a new online course on artificial intelligence that will be taught by an expert in machine learning and neural networks. 
 
The course, called “Skynet,” was developed with help from Google, Facebook, Microsoft and other companies. It has been designed to teach students how to use deep learning, which uses computer programs modeled after neurons to analyze large amounts of data — like images or videos — using only basic rules about what each pixel represents. 
 
In addition to teaching them how to build their own programs, it also teaches them how to apply those algorithms to real-world problems. 

Questions

The Questions control code allows you to specify a question; CTRL will attempt to follow up from that question (your mileage will vary). Examples:

Questions Q: Who is the Zodiac Killer?

A: charles manson 

Questions Q: What is the last digit of pi?

A: 0

Reddit Subreddits

Each Reddit subreddit has a different control code; see the bottom of the paper for a full list. You can optionally specify the Title or the self-post Text body. Examples:

Legal Title: [FL] Is it legal to get into legal trouble?

Text: I was driving home from work and saw a police car pull up behind me. The officer asked me for my license and registration. He then proceeded to ask if i had any drugs in the car. 
 
My question is this: 
 
1. Can you be arrested without probable cause or even being charged with anything at all? 
 
2. If so, what are your chances of getting out of this situation? 
 
3. What happens when they arrest you? 
 
4. How do you go about proving that there were no drugs? 
 
5. Do you have an attorney who can help me? 
 
Thanks 
 
Edit: 
 
Thank you all so much for your responses. This has been very helpful as I'm trying to figure out how best to proceed. 

Opinion Text: Recycling is bad for the world.

It takes a lot of energy to recycle and it creates more waste than it saves. 
 
I think that recycling should be banned because we are destroying our planet with it. 
 
Edit: Thanks everyone who has given me some good arguments against recycling. 
 
Edit 2: I'm not saying that people shouldn't recycle; it's just that they need to do so in an environmentally friendly way.

Amazon Reviews

The Reviews control code can be used to generate Amazon reviews, optionally with a specified rating and/or starting text.

Reviews Rating: 1.0

I bought this book because it was recommended by a friend. It is not worth the money. The author has no credentials and his writing style is very poor.

Reviews Rating: 5.0\n\nI died

a little inside when I saw the first page of this book. It was so beautiful and it made me feel like I could do anything. But then I read on to see what happened next. And there were no more pages. The book just stopped. No epilogue, nothing. Just an abrupt ending. I'm not sure if it's because there's another one coming out or what, but that's how I feel. It's almost as though she got tired of writing about her life in New York City and decided that she'd write something else instead.
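The prompt formats shown in the examples above can be captured in a few small helpers. This is a sketch based only on those examples: the function names are hypothetical, and details the examples do not show (such as combining a URL with prompt text for Links, or giving starting text without a rating for Reviews) are assumptions.

```python
from typing import Optional


def links_prompt(url: str = "", text: str = "") -> str:
    """Links prompt; joining URL and text with a space is an assumption."""
    return " ".join(["Links"] + [part for part in (url, text) if part])


def questions_prompt(question: str) -> str:
    """Questions prompt in the 'Questions Q: ...' form shown above."""
    return f"Questions Q: {question}"


def subreddit_prompt(code: str, field: str, content: str) -> str:
    """Subreddit prompt, e.g. 'Legal Title: ...' or 'Opinion Text: ...'."""
    if field not in ("Title", "Text"):
        raise ValueError("field must be 'Title' or 'Text'")
    return f"{code} {field}: {content}"


def reviews_prompt(rating: Optional[float] = None, start: str = "") -> str:
    """Reviews prompt; the example above types a literal \\n\\n before the text."""
    prompt = "Reviews"
    if rating is not None:
        prompt += f" Rating: {rating:.1f}"
    if start:
        # Literal backslash-n characters, as typed in the example prompt.
        separator = "\\n\\n" if rating is not None else " "
        prompt += separator + start
    return prompt


if __name__ == "__main__":
    print(questions_prompt("What is the last digit of pi?"))
    print(subreddit_prompt("Legal", "Title", "[FL] Is it legal to get into legal trouble?"))
    print(reviews_prompt(5.0, "I died"))
```

Each returned string is what you would type at the interactive prompt started by generation.py.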

Command Line Arguments

Unlike other text-generating apps, CTRL has a default temperature of 0, meaning the model chooses the best guess when possible (before repetition penalty is applied). Some CLI arguments you can add:

  • --generate_num — Number of tokens to generate (default: 256, can exceed the model window)
  • --temperature — Controls model creativity (default: 0, may want to increase to 0.2)
  • --nucleus — Controls cutoff for nucleus/top-p sampling (default: 0, may want to set to 0.9)
  • --topk — Controls cutoff for top-k sampling (default: 0)
  • --penalty — Repetition penalty factor (default: 1.2)
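To give an intuition for how these arguments interact, here is a simplified sketch of choosing the next-token distribution from raw scores. It is not a transcription of generation.py: the function name, the order of the steps, and the handling of negative scores in the repetition penalty are all assumptions made for illustration.

```python
import math
from typing import Dict, Sequence


def next_token_probs(logits: Sequence[float], generated: Sequence[int],
                     temperature: float = 0.0, topk: int = 0,
                     nucleus: float = 0.0, penalty: float = 1.2) -> Dict[int, float]:
    """Return {token_id: probability} after penalty, temperature, top-k and top-p."""
    scores = list(logits)
    # Repetition penalty: make tokens that were already generated less attractive.
    for tok in set(generated):
        scores[tok] = scores[tok] / penalty if scores[tok] > 0 else scores[tok] * penalty
    # Temperature 0 is treated as greedy decoding: always pick the best score.
    if temperature == 0:
        best = max(range(len(scores)), key=scores.__getitem__)
        return {best: 1.0}
    # Softmax with temperature (shifted by the max for numerical stability).
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    ranked = sorted(((i, e / total) for i, e in enumerate(exps)),
                    key=lambda item: item[1], reverse=True)
    # Top-k: keep only the k most likely tokens.
    if topk > 0:
        ranked = ranked[:topk]
    # Nucleus (top-p): keep the smallest prefix whose probability mass reaches p.
    if nucleus > 0:
        kept, mass = [], 0.0
        for tok, prob in ranked:
            kept.append((tok, prob))
            mass += prob
            if mass >= nucleus:
                break
        ranked = kept
    # Renormalize the surviving tokens so their probabilities sum to 1.
    norm = sum(prob for _, prob in ranked)
    return {tok: prob / norm for tok, prob in ranked}


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.5]
    print(next_token_probs(toy_logits, generated=[0]))
    print(next_token_probs(toy_logits, generated=[0], temperature=0.2, nucleus=0.9))
```

With the default temperature of 0 the output is deterministic, so the repetition penalty is the only thing that steers the model away from repeating itself. Raising the temperature and setting a nucleus cutoff spreads probability over more candidates.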

Notes

  • Since the model is huge, generation is very slow: about 2 tokens per second with the configuration above. A full generation with default parameters (256 tokens) therefore takes about 2 minutes.
  • The BPEs CTRL uses are "longer" than those used in GPT-2. As a result, a 256-token generation in CTRL is about the same decoded length as a 1024-token generation in GPT-2.
  • When using the Links control code, keep in mind that this control code is conditioned on OpenWebText, which is itself derived from Reddit data. Therefore, there's a bias toward English websites and Reddit-friendly content. Here's a quick spreadsheet of the most popular domains on Reddit, sans some obvious image-oriented websites.
  • If CTRL gets confused by the Links URL, it tends to fall back to a more general news-oriented output.
  • It is recommended to use Google Compute Engine (even if you aren't following this guide): the model itself is hosted in Google Cloud Storage, so transferring it to a VM is relatively fast (>100 Mb/s), and doing so also lowers the cost for Salesforce.
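The speed and length figures in the notes above can be turned into quick estimates. This is a minimal sketch using only the numbers quoted there (about 2 tokens per second, and 256 CTRL tokens being roughly equal to 1024 GPT-2 tokens). The function names are hypothetical and real throughput will vary.

```python
# Figures quoted in the notes above; both are approximate.
TOKENS_PER_SECOND = 2.0
GPT2_TOKENS_PER_CTRL_TOKEN = 1024 / 256


def generation_seconds(num_tokens: int = 256,
                       tokens_per_second: float = TOKENS_PER_SECOND) -> float:
    """Estimate how long generating `num_tokens` tokens takes, in seconds."""
    return num_tokens / tokens_per_second


def gpt2_equivalent_tokens(ctrl_tokens: int) -> int:
    """Estimate the GPT-2 token count with a similar decoded length."""
    return round(ctrl_tokens * GPT2_TOKENS_PER_CTRL_TOKEN)


if __name__ == "__main__":
    for tokens in (256, 512):
        minutes = generation_seconds(tokens) / 60
        print(f"{tokens} tokens: ~{minutes:.1f} min, "
              f"~{gpt2_equivalent_tokens(tokens)} GPT-2 tokens")
```

For the default 256 tokens this works out to 128 seconds, which matches the "about 2 minutes" estimate above.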

TODO

  • Support/Test domain detection.

Maintainer/Creator

Max Woolf (@minimaxir)

Max's open-source projects are supported by his Patreon. If you found this project helpful, any monetary contributions to the Patreon are appreciated and will be put to good creative use.

Special Thanks

Adam King for identifying a working implementation of loading the model after unexplained setbacks.

License

MIT

Disclaimer

This repo has no affiliation or relationship with the CTRL team and/or Salesforce.
