gpt-neo
An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.
gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.
lm-evaluation-harness
A framework for few-shot evaluation of language models.
pythia
The hub for EleutherAI's work on interpretability and learning dynamics
the-pile
math-lm
polyglot
Polyglot: Large Language Models of Well-balanced Competence in Multi-languages
DALLE-mtf
OpenAI's DALL-E for large scale training in mesh-tensorflow.
vqgan-clip
concept-erasure
Erasing concepts from neural representations with provable guarantees
elk
Keeping language models honest by directly eliciting knowledge encoded in their activations.
oslo
OSLO: Open Source for Large-scale Optimization
lm_perplexity
knowledge-neurons
A library for finding knowledge neurons in pretrained transformer models.
cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
pyfra
Python Research Framework
dps
Data processing system for polyglot
openwebtext2
info
(Deprecated) A hub for onboarding & other information.
project-menu
See the issue board for the current status of active and prospective projects!
stackexchange-dataset
Python tools for processing the stackexchange data dumps into a text dataset for Language Models
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.
magiCARP
One stop shop for all things carp
tqdm-multiprocess
Using queues, tqdm-multiprocess supports multiple worker processes, each with multiple tqdm progress bars, displaying them cleanly through the main process. It offers similar functionality for python logging.
aria
semantic-memorization
hae-rae
improved-t5
Experiments for efforts to train a new and improved T5
features-across-time
Understanding how features learned by neural networks evolve throughout training
mp_nerf
Massively-Parallel Natural Extension of Reference Frame
pile-pubmedcentral
A script for collecting the PubMed Central dataset in a language modelling friendly format.
best-download
URL downloader supporting checkpointing and continuous checksumming.
polyglot-data
Data-related codebase for the polyglot project
elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard
text-generation-testing-ui
Web app for demoing the EAI models
exploring-contrastive-topology
rnngineering
Engineering the state of RNN language models (Mamba, RWKV, etc.)
mdl
Minimum Description Length probing for neural network representations
pile_dedupe
Pile Deduplication Code
pilev2
distilling
Experiments with distilling large language models.
lm-eval2
equivariance
A framework for implementing equivariant DL
radioactive-lab
Adapting the "Radioactive Data" paper to work for text models
tagged-pile
Part-of-Speech Tagging for the Pile and RedPajama
pile-literotica
Download, parse, and filter data from Literotica. Data-ready for The-Pile.
hn-scraper
multimodal-fid
pile-cc-filtering
The code used to filter CC data for The Pile
minetest-baselines
Baseline agents for Minetest tasks.
CodeCARP
Data collection pipeline for CodeCARP. Includes PyCharm plugins.
LLM-Markov-Chains
Project GitHub for LLM Markov Chains Project
pile-uspto
A script for collecting the USPTO Backgrounds dataset in a language modelling friendly format.
thonkenizers
yes
minetest-interpretabilty-notebook
Jupyter notebook for the interpretability section of the minetester blog post
visual-grounding
Visually ground GPT-Neo 1.3b and 2.7b
Unpaired-Image-Generation
Project Repo for Unpaired Image Generation project
pile-enron-emails
A script for collecting the Enron Emails dataset in a language modelling friendly format.
architecture-experiments
Repository to host architecture experiments and development using Paxml and Praxis
llemma-sample-explorer
Sample explorer tool for the Llemma models.
pile-explorer
For exploring the data and documenting its limitations
lm-scope
megatron-3d
tokengrams
Efficiently computing & storing token n-grams from large corpora
ccs
latent-video-diffusion
Latent video diffusion
eleutherai-instruct-dataset
A large instruct dataset for open-source models (WIP).
isaac-mchorse
EleutherAI's discord bot
pile-allpoetry
Scraper to gather poems from allpoetry.com
eai-prompt-gallery
Library of interesting prompt generations
eleutherai.github.io
This is the Hugo generated website for eleuther.ai. The source of this build is new-website repo.
website
New website for EleutherAI based on Hugo static site generator
variance-across-time
Studying the variance in neural net predictions across training time
pile-ubuntu-irc
A script for collecting the Ubuntu IRC dataset in a language modelling friendly format.
aria-amt
MIDI conditioned automatic music transcription
reddit-comment-processing
language-adaptation
EvilModel
A replication of "EvilModel 2.0: Bringing Neural Network Models into Malware Attacks"
bucket-cleaner
A small utility to clear out old model checkpoints in Google Cloud Buckets whilst keeping tensorboard event files
groupoid-rl
lang-filter
Filter text files or archives by language
eleuther-blog
Here is the generated content for the EleutherAI blog. Source is from new-website repo
prefix-free-tokenizer
A prefix free tokenizer
alignment-reader
Search and filter through alignment literature
grouch
perceptors
Central location for access to pretrained models for CLIP and variants, with common API and out-of-the-box differentiable weighted multi-perceptor
classifier-latent-diffusion
common-llm-settings
Common LLM Settings App
bayesian-adam
Exactly what it says on the tin
conceptual-constraints
Applying LEACE to models during training
truncated-gaussian
Method-of-moments estimation and sampling for truncated multivariate Gaussian distributions