
Generating paper titles (and more!) with GPT trained on data scraped from arXiv.

Forecasting the progress of research is an elusive and important goal. Here, we take a toy step towards this goal by exploring generating new scientific paper titles given past titles on arXiv:

  1. We generate titles conditioned on a specific author (using GPT-3 without finetuning)
  2. We generate titles conditioned on their publication year (using GPT-Neo with finetuning)
  3. We evaluate the generated titles to see how well they match new, recent paper titles

1   Author-specific paper titles (prompting gpt3)

To generate author-specific titles, we take the five most recent titles from each author with at least 3 arXiv AI papers (cs.ML, cs.LG, stat.ML). We then format the titles using the following template and query GPT-3 for a new title:

Here is a list of related machine-learning papers:

> [title 1]
> [title 2]
...
> [title 5]
> ____

See the results in the demo above or the full results in this json file.

Here's a concrete example -- when prompting with these 5 recent titles:

> Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods
> Fast Interpretable Greedy-Tree Sums (FIGS)
> Adaptive wavelet distillation from neural networks through interpretations
> Emb-GAM: an Interpretable and Efficient Predictor using Pre-trained Language Models
> Explaining Patterns in Data with Language Models via Interpretable Autoprompting
> ____

We get these 5 (independent) random generations for the blank:

1. Towards Interpretable Natural Language Processing: A Survey
2. A Unified Framework for Interpretable Machine Learning
3. Compositional Attention Networks for Machine Reasoning
4. Achieving Open Vocabulary Neural Machine Translation
5. A Deep Understanding of Neural Networks through Deep Visualization
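As a rough sketch, the prompt for this example could be assembled as follows. The function name `build_prompt` is hypothetical, and the way the blank is represented (a trailing "> " left for the model to complete) is an assumption rather than something the README specifies.

```python
def build_prompt(titles):
    # Hypothetical helper: formats an author's recent titles into the template.
    # Assumption: the blank is a trailing "> " that the model completes.
    header = "Here is a list of related machine-learning papers:\n\n"
    body = "".join(f"> {title.strip()}\n" for title in titles)
    return header + body + "> "


example_titles = [
    "Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods",
    "Fast Interpretable Greedy-Tree Sums (FIGS)",
    "Adaptive wavelet distillation from neural networks through interpretations",
    "Emb-GAM: an Interpretable and Efficient Predictor using Pre-trained Language Models",
    "Explaining Patterns in Data with Language Models via Interpretable Autoprompting",
]
prompt = build_prompt(example_titles)
print(prompt)
```

The resulting string would then be sent to a text-completion model, sampling several times to obtain independent generations.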

The results are often interesting, but they fall into failure modes in which the model generates titles irrelevant to the author, often leaning towards popular topics such as deep learning, multi-task learning, and reinforcement learning.

Note: the model used was GPT-3 text-davinci-002 from the OpenAI API on Oct 14, 2022. It was likely not up to date with the most recent advances and could be improved by finetuning on more recent titles. During preprocessing, paper titles with irregular formatting were removed, and distinct authors with exactly the same name were not differentiated.
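The author-level selection described at the start of this section (authors with at least 3 papers, five most recent titles each) might look roughly like the sketch below. The record layout (`title`, `authors`, `date` keys) and the function name are assumptions for illustration, and the toy records are made up. Authors are keyed by exact name, which mirrors the caveat that same-name authors are not differentiated.

```python
from collections import defaultdict


def recent_titles_by_author(papers, min_papers=3, n_recent=5):
    # `papers` is a hypothetical list of dicts with keys "title", "authors"
    # (list of name strings) and "date" (ISO string, so string order = date order).
    by_author = defaultdict(list)
    for paper in papers:
        for name in paper["authors"]:
            by_author[name].append((paper["date"], paper["title"]))

    selected = {}
    for name, entries in by_author.items():
        if len(entries) < min_papers:
            continue  # skip authors with too few papers
        entries.sort()  # oldest first
        selected[name] = [title for _, title in entries[-n_recent:]]
    return selected


# Made-up toy records, only to show the expected input shape
toy_papers = [
    {"title": f"Toy title {i}", "authors": ["A. Author"], "date": f"2022-01-0{i}"}
    for i in range(1, 8)
] + [{"title": "Lone paper", "authors": ["B. Author"], "date": "2022-02-01"}]

selected = recent_titles_by_author(toy_papers)
print(selected)
```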

2   Finetuned paper title generation (gptneo)

To improve the model's ability to generate cogent titles, we finetune it on a large corpus of titles. We start from the gpt-neo-2.7B checkpoint (see our training script for hyperparameters). We finetune on all paper titles on arXiv in the categories cs.AI, cs.LG, stat.ML up to Oct 13, 2022. We exclude all papers after Apr 1, 2022 (to test the ability to forecast new papers) and an additional random 5% of titles. We also exclude titles with a length of less than 6 words or greater than 20 words. This results in 98,388 papers for finetuning.
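The filtering and splitting described above could be sketched as follows. This is a minimal illustration, not the actual training script: the input layout (a list of `(title, date)` pairs), the function name, whitespace-based word counting, and applying the length filter before the split are all assumptions.

```python
import random
from datetime import date

CUTOFF = date(2022, 4, 1)  # papers after this date are held out as the forecasting test set


def split_titles(papers, holdout_frac=0.05, min_words=6, max_words=20, seed=0):
    # `papers` is a hypothetical list of (title, publication_date) pairs.
    rng = random.Random(seed)
    train, random_holdout, future_test = [], [], []
    for title, published in papers:
        n_words = len(title.split())  # assumption: words are whitespace-separated
        if n_words < min_words or n_words > max_words:
            continue  # drop very short or very long titles
        if published > CUTOFF:
            future_test.append(title)
        elif rng.random() < holdout_frac:
            random_holdout.append(title)
        else:
            train.append(title)
    return train, random_holdout, future_test
```

For example, `train, random_holdout, future_test = split_titles(papers)` would leave `train` as the set used for finetuning and `future_test` as the titles used for the forecasting evaluation.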

Samples

After finetuning, here are some example titles generated by the model, conditioned on different years (see a large dump in this folder):

2022

  • Diverse Datasets for Learning to Rank via Knowledge Graph Embedding
  • Machine learning-driven method for high-throughput single-cell analysis of differentiation and lineage commitment
  • On the Sample Complexity of Differentially Private Learning
  • Data-Dependent Weight Normalization for Improved Image Resolution
  • Adaptive Densely Connected Networks for the Generation and Visualization of Object Deformations
  • Exploring the Implicit Bias in Transfer Learning using Imitation Learning

2023 (These samples tend to just be similar to 2021/2022 where the majority of the training data lies)

  • An Interpretable Dynamic Network for Spatiotemporal Pattern Prediction in High-Dimensional Time Series Data
  • Multimodal Deep Learning for Automated Cancer Histopathology Analysis
  • Reinforced Learning for Robust and Accurate Object Detection
  • A Machine Learning Approach to High Sensitivity Data Processing
  • Adversarial Robustness for Graph Neural Networks in Network Intrusion Detection
  • Reinforcement Learning via Exploration and Rehearsal for Learning from Demonstrations

2010 (Seems to properly generate older titles)

  • Learning in a Dynamic, Clustered and Homogeneous Heterogeneous Markov Decision Process
  • An Empirical Analysis of the Regularization of the Gaussian Process Regression
  • A Scalable Clustering Algorithm under Heterogeneous Data
  • A Unified Representation for Probabilistic Time Series Forecasting
  • A Hybrid Approach to Automatic Alignment and Localization of Digital Object Platforms
  • Bayesian nonparametric modeling of random fields

Inference example

We've released our finetuned model if you want to play with it:

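from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

# Load the finetuned weights; the tokenizer is the original GPT-Neo tokenizer
model = AutoModelForCausalLM.from_pretrained("csinva/gpt-neo-2.7B-titles")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-2.7B")
pipe = pipeline('text-generation', model=model, tokenizer=tokenizer)

# Prompt with a year followed by two newlines; the model continues with a title
pipe('2022\n\n')
# Example generated title (sampling is random, so yours will differ):
# Automating the Visualization of Convolutional Neural Networks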

During finetuning, each paper title was given in the format <year>\n\n <title>\n (e.g. 2020\n\n Interpretations are useful: penalizing explanations to align neural networks with prior knowledge\n). The same format should be used for inference. These samples are considerably better than the samples we made with GPT-2 back in 2019 (the good old days).
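To make the format concrete, here is a small sketch of helpers that build a training example, build an inference prompt, and pull the first title out of generated text. The helper names are hypothetical; only the `<year>\n\n <title>\n` layout comes from the description above.

```python
def format_example(year, title):
    # Training format: the year, two newlines, a space, the title, one newline
    return f"{year}\n\n {title}\n"


def make_prompt(year):
    # Inference prompt: the year followed by two newlines
    return f"{year}\n\n"


def extract_title(generated_text, year):
    # Remove the prompt if the output echoes it, then keep the first line
    prompt = make_prompt(year)
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip().split("\n")[0].strip()


example = format_example(2020, "Interpretations are useful: penalizing explanations "
                               "to align neural networks with prior knowledge")
print(repr(example))
```

With the text-generation pipeline, the returned `generated_text` typically includes the prompt, which is why `extract_title` strips it before taking the first line.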

3   Generated paper evaluation

We now evaluate whether the generated titles for 2022 match the real paper titles from the test set (April 1 - Oct 13, 2022). Note that the model has never seen any papers from this time period, and its pre-training corpus also only contained text from before 2022. We generate 5,000 titles and find the closest match for each of them in the test set (which contains ~15,000 titles). The resulting BLEU scores are shown in this figure:
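The matching step can be illustrated with a simplified, pure-Python BLEU. This is a sketch under stated assumptions, not the scoring code used for the figure: it uses lowercase whitespace tokens, only unigrams and bigrams, no smoothing, and a single reference per comparison. The function names are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    # Simplified sentence-level BLEU: geometric mean of clipped n-gram
    # precisions (n = 1..max_n) times a brevity penalty. No smoothing.
    cand, ref = candidate.lower().split(), reference.lower().split()
    if not cand or not ref:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(sum(log_precisions) / max_n)


def closest_match(generated, test_titles):
    # Return the test-set title with the highest score against the generated title
    return max(test_titles, key=lambda real: simple_bleu(generated, real))


test_titles = [
    "Sparse vicious attacks on graph neural networks",
    "Distributed contrastive learning for medical image segmentation",
]
best = closest_match("Adversarial attacks on graph neural networks", test_titles)
print(best)
```

Comparing 5,000 generated titles against ~15,000 test titles this way is a brute-force search over about 75 million pairs, so a faster implementation would be preferable in practice.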

Here's a table of the first 5 matches. See if you can guess which are the real titles and which are generated (answers below):

| A | B |
| ---- | ---- |
| Understanding the effect of data augmentation in generative adversarial networks | Understanding the effect of data augmentation in self-supervised anomaly detection |
| Adversarial attacks on graph neural networks | Sparse vicious attacks on graph neural networks |
| Differentiable reinforcement learning for continuous control | Normality-guided distributional reinforcement learning for continuous control |
| Multilevel representation learning for time series forecasting | Out-of-distribution representation learning for time series classification |
| Unsupervised feature learning for medical image segmentation | Distributed contrastive learning for medical image segmentation |

Answers: sǝlʇᴉʇ lɐǝɹ ǝɥʇ suᴉɐʇuoɔ ꓭ uɯnloƆ

The generated titles often seem to be overly general, missing the detailed specificity of the real titles (e.g. "Adversarial attacks" rather than "Sparse vicious attacks").

Some possible followups

This post was very limited, but there are a bunch of directions to explore to see how well language models can really forecast scientific titles (and scientific progress in general). Here are some straightforward followups:

  • Use information about abstracts instead of just titles
  • Get the language model to explain why it generated a particular title (probably grounded in abstract)
  • Build a model to classify year given paper title and then use iPrompt to describe the year-to-year differences
  • Improve author-specific title generation with finetuning (some authors have a lot of papers)

Reference

  • Code here
  • arXiv dataset from here
  • Adorable robot from here
