gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
lm-evaluation-harness
A framework for few-shot evaluation of language models.
pythia
The hub for EleutherAI's work on interpretability and learning dynamics
the-pile
math-lm
cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
polyglot
Polyglot: Large Language Models of Well-balanced Competence in Multi-languages
DALLE-mtf
OpenAI's DALL-E for large scale training in mesh-tensorflow.
vqgan-clip
sae
Sparse autoencoders
concept-erasure
Erasing concepts from neural representations with provable guarantees
elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
oslo
OSLO: Open Source for Large-scale Optimization
lm_perplexity
knowledge-neurons
A library for finding knowledge neurons in pretrained transformer models.
pyfra
Python Research Framework
dps
Data processing system for polyglot
openwebtext2
info
(Deprecated) A hub for onboarding & other information.
improved-t5
Experiments for efforts to train a new and improved T5
stackexchange-dataset
Python tools for processing the Stack Exchange data dumps into a text dataset for language models
project-menu
See the issue board for the current status of active and prospective projects!
magiCARP
One-stop shop for all things CARP
sae-auto-interp
semantic-memorization
tqdm-multiprocess
Using queues, tqdm-multiprocess supports multiple worker processes, each with multiple tqdm progress bars, displaying them cleanly through the main process. It offers similar functionality for Python logging.
aria
rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
features-across-time
Understanding how features learned by neural networks evolve throughout training
mp_nerf
Massively-Parallel Natural Extension of Reference Frame
elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard ones
pile-pubmedcentral
A script for collecting the PubMed Central dataset in a language modelling friendly format.
best-download
URL downloader supporting checkpointing and continuous checksumming.
polyglot-data
Data-related codebase for the polyglot project
aria-amt
Efficient and robust implementation of seq-to-seq automatic piano transcription.
text-generation-testing-ui
Web app for demoing the EAI models
exploring-contrastive-topology
mdl
Minimum Description Length probing for neural network representations
pile_dedupe
Pile Deduplication Code
w2s
pilev2
distilling
Experiments with distilling large language models.
tokengrams
Efficiently computing & storing token n-grams from large corpora
lm-eval2
equivariance
A framework for implementing equivariant DL
radioactive-lab
Adapting the "Radioactive Data" paper to work for text models
pile-literotica
Download, parse, and filter data from Literotica. Data-ready for The-Pile.
hn-scraper
tagged-pile
Part-of-Speech Tagging for the Pile and RedPajama
multimodal-fid
pile-uspto
A script for collecting the USPTO Backgrounds dataset in a language modelling friendly format.
pile-cc-filtering
The code used to filter CC data for The Pile
minetest-baselines
Baseline agents for Minetest tasks.
CodeCARP
Data collection pipeline for CodeCARP. Includes PyCharm plugins.
pile-enron-emails
A script for collecting the Enron Emails dataset in a language modelling friendly format.
pile-explorer
For exploring the data and documenting its limitations
minetest-interpretabilty-notebook
Jupyter notebook for the interpretability section of the minetester blog post
thonkenizers
yes
eleutherai.github.io
This is the Hugo-generated website for eleuther.ai. The source of this build is the new-website repo.
visual-grounding
Visually ground GPT-Neo 1.3B and 2.7B
LLM-Markov-Chains
Project GitHub for the LLM Markov Chains project
architecture-experiments
Repository to host architecture experiments and development using Paxml and Praxis
llemma-sample-explorer
Sample explorer tool for the Llemma models.
lm-scope
latent-video-diffusion
Latent video diffusion
megatron-3d
website
New website for EleutherAI based on Hugo static site generator
Unpaired-Image-Generation
Project Repo for Unpaired Image Generation project
ccs
isaac-mchorse
EleutherAI's Discord bot
pile-allpoetry
Scraper to gather poems from allpoetry.com
EvilModel
A replication of "EvilModel 2.0: Bringing Neural Network Models into Malware Attacks"
eai-prompt-gallery
Library of interesting prompt generations
variance-across-time
Studying the variance in neural net predictions across training time
pile-ubuntu-irc
A script for collecting the Ubuntu IRC dataset in a language modelling friendly format.
reddit-comment-processing
eleutherai-instruct-dataset
A large instruct dataset for open-source models (WIP).
bucket-cleaner
A small utility to clear out old model checkpoints in Google Cloud Buckets whilst keeping tensorboard event files
groupoid-rl
equinox-llama
Equinox implementation of llama3 and llama3.1
optax-galore
Adds GaLore-style projection wrappers to optax optimizers
lang-filter
Filter text files or archives by language
eleuther-blog
Here is the generated content for the EleutherAI blog. The source is from the new-website repo
prefix-free-tokenizer
A prefix-free tokenizer
alignment-reader
Search and filter through alignment literature
grouch
language-adaptation
perceptors
Central location for access to pretrained models for CLIP and variants, with a common API and an out-of-the-box differentiable weighted multi-perceptor
pd-books
classifier-latent-diffusion
common-llm-settings
Common LLM Settings App
bayesian-adam
Exactly what it says on the tin
pile-cord19
A script for collecting the CORD-19 dataset in a language modelling friendly format.
conceptual-constraints
Applying LEACE to models during training
ngrams-across-time
steering-llama3
truncated-gaussian
Method-of-moments estimation and sampling for truncated multivariate Gaussian distributions