# Awesome Transformer Architecture Search
To keep track of the growing number of papers at the intersection of Transformers and Neural Architecture Search (NAS), we have created this curated list of papers and resources, inspired by awesome-autodl, awesome-architecture-search, and awesome-computer-vision. Papers are divided into the following categories:
- General Transformer search
- Domain-specific, applied Transformer search (divided into NLP, Vision, ASR)
- Transformers Knowledge: Insights / Searchable parameters / Attention
- Transformer Surveys
- Foundation Models
- Misc Resources
This repository is maintained by Yash Mehta; feel free to reach out, create a pull request, or open an issue to add papers. Please see this Google Doc for a comprehensive list of papers at ICML 2023 on foundation models / large language models.
## General Transformer Search
## Domain Specific Transformer Search
### Vision
Title | Venue | Group |
---|---|---|
𝛼NAS: Neural Architecture Search using Property Guided Synthesis | ACM Programming Languages'22 | MIT, Google |
NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training | ICLR'22 | Meta Reality Labs |
AutoFormer: Searching Transformers for Visual Recognition | ICCV'21 | MSR |
GLiT: Neural Architecture Search for Global and Local Image Transformer | ICCV'21 | University of Sydney |
Searching for Efficient Multi-Stage Vision Transformers | ICCV'21 workshop | MIT |
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers | CVPR'21 | Bytedance Inc. |
### Natural Language Processing
Title | Venue | Group |
---|---|---|
AutoBERT-Zero: Evolving the BERT backbone from scratch | AAAI'22 | Huawei Noah’s Ark Lab |
Primer: Searching for Efficient Transformers for Language Modeling | NeurIPS'21 | |
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models | ACL'21 | Tsinghua University, Huawei Noah's Ark Lab |
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search | KDD'21 | MSR, Tsinghua University |
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | ACL'20 | MIT |
### Automatic Speech Recognition
Title | Venue | Group |
---|---|---|
SFA: Searching faster architectures for end-to-end automatic speech recognition models | Computer Speech and Language'23 | Chinese Academy of Sciences |
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | ICASSP'21 | MSR |
Efficient Gradient-Based Neural Architecture Search For End-to-End ASR | ICMI-MLMI'21 | NPU, Xi'an |
Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition | INTERSPEECH'20 | VUNO Inc. |
## Transformers Knowledge: Insights, Searchable Parameters, Attention
## Transformer Surveys
Title | Venue | Group |
---|---|---|
Transformers in Vision: A Survey | ACM Computing Surveys'22 | MBZ University of AI |
A Survey of Vision Transformers | TPAMI'22 | Chinese Academy of Sciences |
Efficient Transformers: A Survey | ACM Computing Surveys'22 | Google Research |
Neural Architecture Search for Transformers: A Survey | IEEE Xplore [Sep'22] | Iowa State University |
## Foundation Models
Title | Venue | Group |
---|---|---|
Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models | arXiv'23 | Amazon Alexa AI |