Focusing on: understanding the internal mechanisms of large language models (LLMs).
(continuously updated as I read good papers ...)
A Comprehensive Overview of Large Language Models. [pdf] [2023.12] [LLM]
A Survey of Large Language Models. [pdf] [2023.11] [LLM]
Explainability for Large Language Models: A Survey. [pdf] [2023.11] [interpretability]
A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future. [pdf] [2023.10] [chain of thought]
Instruction Tuning for Large Language Models: A Survey. [pdf] [2023.10] [instruction tuning]
Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models. [pdf] [2023.9] [hallucination]
Reasoning with Language Model Prompting: A Survey. [pdf] [2023.9] [reasoning]
Toward Transparent AI: A Survey on Interpreting the Inner Structures of Deep Neural Networks. [pdf] [2023.8] [interpretability]
A Survey on In-context Learning. [pdf] [2023.6] [in-context learning]
Scaling Down to Scale Up: A Guide to Parameter-Efficient Fine-Tuning. [pdf] [2023.3] [parameter-efficient fine-tuning]
Successor Heads: Recurring, Interpretable Attention Heads In The Wild. [pdf] [ICLR 2024 poster] [2023.12]
Impact of Co-occurrence on Factual Knowledge of Large Language Models. [pdf] [EMNLP 2023 findings] [2023.10]
Can Large Language Models Explain Themselves? [pdf] [2023.10]
Neurons in Large Language Models: Dead, N-gram, Positional. [pdf] [2023.9]
Do Machine Learning Models Memorize or Generalize? [blog] [2023.8]
Overthinking the Truth: Understanding how Language Models Process False Demonstrations. [pdf] [2023.7]
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning. [pdf] [EMNLP 2023 best paper] [2023.5]
Let's Verify Step by Step. [pdf] [ICLR 2024 poster] [2023.5]
What In-Context Learning "Learns" In-Context: Disentangling Task Recognition and Task Learning. [pdf] [ACL 2023 findings] [2023.5]
Language models can explain neurons in language models. [blog] [2023.5]
Dissecting Recall of Factual Associations in Auto-Regressive Language Models. [pdf] [EMNLP 2023 main] [2023.4]
Are Emergent Abilities of Large Language Models a Mirage? [pdf] [NeurIPS 2023 best paper] [2023.4]
The Closeness of In-Context Learning and Weight Shifting for Softmax Regression. [pdf] [2023.4]
How does GPT-2 compute greater-than?: Interpreting mathematical abilities in a pre-trained language model. [pdf] [NeurIPS 2023 poster] [2023.4]
A Theory of Emergent In-Context Learning as Implicit Structure Induction. [pdf] [2023.3]
Larger language models do in-context learning differently. [pdf] [2023.3]
Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models. [pdf] [NeurIPS 2023 spotlight] [2023.1]
Transformers as Algorithms: Generalization and Stability in In-context Learning. [pdf] [ICML 2023 poster] [2023.1]
Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers. [pdf] [ACL 2023 findings] [2022.12]
How does GPT Obtain its Ability? Tracing Emergent Abilities of Language Models to their Sources. [blog] [2022.12]
Towards Understanding Chain-of-Thought Prompting: An Empirical Study of What Matters. [pdf] [ACL 2023 long] [2022.12]
Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small. [pdf] [ICLR 2023 poster] [2022.11]
Inverse scaling can become U-shaped. [pdf] [EMNLP 2023 main] [2022.11]
What learning algorithm is in-context learning? Investigations with linear models. [pdf] [ICLR 2023 notable] [2022.11]
Mass-Editing Memory in a Transformer. [pdf] [ICLR 2023 notable] [2022.10]
Polysemanticity and Capacity in Neural Networks. [pdf] [2022.10]
Analyzing Transformers in Embedding Space. [pdf] [ACL 2023 long] [2022.9]
Toy Models of Superposition. [blog] [2022.9]
Text and Patterns: For Effective Chain of Thought, It Takes Two to Tango. [pdf] [2022.9]
Emergent Abilities of Large Language Models. [pdf] [2022.6]
Mechanistic Interpretability, Variables, and the Importance of Interpretable Bases. [blog] [2022.6]
Towards Tracing Factual Knowledge in Language Models Back to the Training Data. [pdf] [EMNLP 2022 findings] [2022.5]
Ground-Truth Labels Matter: A Deeper Look into Input-Label Demonstrations. [pdf] [EMNLP 2022 main] [2022.5]
Large Language Models are Zero-Shot Reasoners. [pdf] [NeurIPS 2022] [2022.5]
Scaling Laws and Interpretability of Learning from Repeated Data. [pdf] [2022.5]
Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space. [pdf] [EMNLP 2022 main] [2022.3]
In-context Learning and Induction Heads. [blog] [2022.3]
Locating and Editing Factual Associations in GPT. [pdf] [NeurIPS 2022] [2022.2]
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? [pdf] [EMNLP 2022 main] [2022.2]
Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets. [pdf] [2022.1]
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. [pdf] [2022.1]
A Mathematical Framework for Transformer Circuits. [blog] [2021.12]
An Explanation of In-context Learning as Implicit Bayesian Inference. [pdf] [ICLR 2022 poster] [2021.11]
Towards a Unified View of Parameter-Efficient Transfer Learning. [pdf] [ICLR 2022 spotlight] [2021.10]
Do Prompt-Based Models Really Understand the Meaning of their Prompts? [pdf] [NAACL 2022] [2021.9]
Deduplicating Training Data Makes Language Models Better. [pdf] [ACL 2022 long] [2021.7]
LoRA: Low-Rank Adaptation of Large Language Models. [pdf] [ICLR 2022 poster] [2021.6]
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity. [pdf] [ACL 2022 long] [2021.4]
The Power of Scale for Parameter-Efficient Prompt Tuning. [pdf] [EMNLP 2021 main] [2021.4]
Calibrate Before Use: Improving Few-Shot Performance of Language Models. [pdf] [ICML 2021] [2021.2]
Prefix-Tuning: Optimizing Continuous Prompts for Generation. [pdf] [ACL 2021 long] [2021.1]
Transformer Feed-Forward Layers Are Key-Value Memories. [pdf] [EMNLP 2021 main] [2020.12]
Scaling Laws for Neural Language Models. [pdf] [2020.1]