SWE-agent
SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges. [NeurIPS 2024]
tree-of-thought-llm
[NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models
SimCSE
[EMNLP 2021] SimCSE: Simple Contrastive Learning of Sentence Embeddings https://arxiv.org/abs/2104.08821
SWE-bench
[ICLR 2024] SWE-Bench: Can Language Models Resolve Real-world Github Issues?
MeZO
[NeurIPS 2023] MeZO: Fine-Tuning Language Models with Just Forward Passes. https://arxiv.org/abs/2305.17333
PURE
[NAACL 2021] A Frustratingly Easy Approach for Entity and Relation Extraction https://arxiv.org/abs/2010.12812
LM-BFF
[ACL 2021] LM-BFF: Better Few-shot Fine-tuning of Language Models https://arxiv.org/abs/2012.15723
SimPO
SimPO: Simple Preference Optimization with a Reference-Free Reward
DensePhrases
[ACL 2021] Learning Dense Representations of Phrases at Scale; EMNLP'2021: Phrase Retrieval Learns Passage Retrieval, Too https://arxiv.org/abs/2012.12624
LLM-Shearing
[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning
ALCE
[EMNLP 2023] Enabling Large Language Models to Generate Text with Citations. Paper: https://arxiv.org/abs/2305.14627
LESS
[ICML 2024] LESS: Selecting Influential Data for Targeted Instruction Tuning
AutoCompressors
[EMNLP 2023] Adapting Language Models to Compress Long Contexts
WebShop
[NeurIPS 2022] WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
TRIME
[EMNLP 2022] Training Language Models with Memory Augmentation https://arxiv.org/abs/2205.12674
intercode
[NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898
CoFiPruning
[ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408
OptiPrompt
[NAACL 2021] Factual Probing Is [MASK]: Learning vs. Learning to Recall https://arxiv.org/abs/2104.05240
TransformerPrograms
[NeurIPS 2023] Learning Transformer Programs
EntityQuestions
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers https://arxiv.org/abs/2109.08535
QuRating
[ICML 2024] Selecting High-Quality Data for Training Language Models
CEPE
[ACL 2024] Long-Context Language Modeling with Parallel Encodings
DinkyTrain
Princeton NLP's pre-training library based on fairseq with DeepSpeed kernel integration
LLMBar
[ICLR 2024] Evaluating Large Language Models at Evaluating Instruction Following
MQuAKE
[EMNLP 2023] MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions
USACO
Can Language Models Solve Olympiad Programming?
ProLong
Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)"
NLProofS
EMNLP 2022: Generating Natural Language Proofs with Verifier-Guided Search https://arxiv.org/abs/2205.12443
CharXiv
[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs
MADE
EMNLP 2021: Single-dataset Experts for Multi-dataset Question-Answering
LM-Kernel-FT
A Kernel-Based View of Language Model Fine-Tuning https://arxiv.org/abs/2210.05643
c-sts
[EMNLP 2023] C-STS: Conditional Semantic Textual Similarity
DataMUX
[NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks
ShortcutGrammar
EMNLP 2022: Finding Dataset Shortcuts with Grammar Induction https://arxiv.org/abs/2210.11560
LitSearch
A Retrieval Benchmark for Scientific Literature Search
Collie
[ICLR 2024] COLLIE: Systematic Construction of Constrained Text Generation Tasks
EvalConvQA
[ACL 2022] Ditch the Gold Standard: Re-evaluating Conversational Question Answering
HELMET
The HELMET Benchmark
MABEL
EMNLP 2022: "MABEL: Attenuating Gender Bias using Textual Entailment Data" https://arxiv.org/abs/2210.14975
LM-Science-Tutor
rationale-robustness
NAACL 2022: Can Rationalization Improve Robustness? https://arxiv.org/abs/2204.11790
PTP
Improving Language Understanding from Screenshots. Paper: https://arxiv.org/abs/2402.14073
corpus-poisoning
[EMNLP 2023] Poisoning Retrieval Corpora by Injecting Adversarial Passages https://arxiv.org/abs/2310.19156
InstructEval
[NAACL 2024 Findings] Evaluation suite for the systematic evaluation of instruction selection methods.
Edge-Pruning
Code and data for the paper "Finding Transformer Circuits with Edge Pruning".
WhatICLLearns
[ACL 2023 Findings] What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning
Cognac
Repo for paper: Controllable Text Generation with Language Constraints
lwm
We develop world models that can be adapted with natural language. Integrating these models into artificial agents allows humans to effectively control these agents through verbal communication.
ELIZA-Transformer
Representing Rule-based Chatbots with Transformers
semsup
Semantic Supervision: Enabling Generalization over Output Spaces
benign-data-breaks-safety
SRL-NLC
Safe Reinforcement Learning with Natural Language Constraints
datamux-pretraining
MUX-PLMs: Pretraining LMs with Data Multiplexing
XTX
[ICLR 2022 Spotlight] Multi-Stage Episodic Control for Strategic Exploration in Text Games
MultilingualAnalysis
Repository for the paper titled: "When is BERT Multilingual? Isolating Crucial Ingredients for Cross-lingual Transfer"
dyck-transformer
[ACL 2021] Self-Attention Networks Can Process Bounded Hierarchical Languages
blindfold-textgame
[NAACL 2021] Reading and Acting while Blindfolded: The Need for Semantics in Text Game Agents
align-mlm
metric-wsd
NAACL'2021: Non-Parametric Few-Shot Learning for Word Sense Disambiguation
semsup-xc
SemSup-XC: Semantic Supervision for Extreme Classification
Heuristic-Core
[ACL 2024] The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models - https://arxiv.org/abs/2403.03942
CopyCat
NegotiationToM
Code release for Improving Dialog Systems for Negotiation with Personality Modeling.
CARETS
SPARTAN
SPARTAN: Sparse Hierarchical Memory for Parameter-Efficient Transformers
il-scaling-in-games
Official code repo of "Scaling Laws for Imitation Learning in Single-Agent Games"
attribute-tagging
[LaReL 2022] Towards an Enhanced, Faithful, and Adaptable Web Interaction Environment
MoQA