SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It solves 12.47% of bugs in the SWE-bench evaluation set and takes just 1 minute to run.
tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
SWE-bench
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
PURE
[NAACL 2021] A Frustratingly Easy Approach for Entity and Relation Extraction https://arxiv.org/abs/2010.12812
LM-BFF
[ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723
DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624
SimPO
SimPO: Simple Preference Optimization with a Reference-Free Reward
LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
AutoCompressors
[EMNLP 2023] Adapting Language Models to Compress Long Contexts
WebShop
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
TRIME
[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674
CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
intercode
[NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
OptiPrompt
[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240
TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
EntityQuestions
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers https://arxiv.org/abs/2109.08535
QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
DinkyTrain
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration 🚃
LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
MQuAKE
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
USACO
Can Language Models Solve Olympiad Programming?
NLProofS
EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443
MADE
EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering
LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
calm-textgame
[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games
CharXiv
CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
c-sts
[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity
DataMUX
[NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks
ShortcutGrammar
EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560
LitSearch
A Retrieval Benchmark for Scientific Literature Search
Collie
[ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks
EvalConvQA
[ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering
MABEL
EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975
LM-Science-Tutor
rationale-robustness
NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790
PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
InstructEval
[NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods.
WhatICLLearns
[ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning
Cognac
Repo for paper: Controllable Text Generation with Language Constraints
corpus-poisoning
[EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156
semsup
Semantic Supervision: Enabling Generalization over Output Spaces
ELIZA-Transformer
Representing Rule-based Chatbots with Transformers
SRL-NLC
Safe Reinforcement Learning with Natural Language Constraints
Edge-Pruning
Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
datamux-pretraining
MUX-PLMs: Pretraining LMs with Data Multiplexing
XTX
[ICLR 2022 Spotlight] Multi-Stage Episodic Control for Strategic Exploration in Text Games
MultilingualAnalysis
Repository for the paper titled: "When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer"
blindfold-textgame
[NAACL 2021] Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents
align-mlm
dyck-transformer
[ACL 2021] Self-Attention Networks Can Process Bounded Hierarchical Languages
metric-wsd
NAACL'2021: Non-Parametric Few-Shot Learning for Word Sense Disambiguation
semsup-xc
SemSup-XC: Semantic Supervision for Extreme Classification
lwm
We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effectively control these agents through verbal communication.
benign-data-breaks-safety
CopyCat
Heuristic-Core
[ACL 2024] The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models - https://arxiv.org/abs/2403.03942
CARETS
attribute-tagging
[LaReL 2022] Towards an Enhanced, Faithful, and Adaptable Web Interaction Environment
NegotiationToM
Code release for Improving Dialog Systems for Negotiation with Personality Modeling.
il-scaling-in-games
Official code repo of "Scaling Laws for Imitation Learning in NetHack"
MoQA