  • Stars: 252
  • Rank: 156,585 (Top 4%)
  • Created: over 2 years ago
  • Updated: 12 months ago

Repository Details

A curated list of awesome resources combining Transformers with Neural Architecture Search

Awesome Transformer Architecture Search

To keep track of the large number of recent papers that look at the intersection of Transformers and Neural Architecture Search (NAS), we have created this awesome list of curated papers and resources, inspired by awesome-autodl, awesome-architecture-search, and awesome-computer-vision. Papers are divided into the following categories:

  1. General Transformer search
  2. Domain-specific, applied Transformer search (divided into Vision, NLP, and ASR)
  3. Transformers Knowledge: Insights / Searchable parameters / Attention
  4. Transformer Surveys
  5. Foundation Models
  6. Misc Resources

This repository is maintained by Yash Mehta; please feel free to reach out, create a pull request, or open an issue to add papers. Please see this Google Doc for a comprehensive list of ICML 2023 papers on foundation models and large language models.

General Transformer Search

Title | Venue | Group
Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models | NeurIPS'22 | MSR
Training Free Transformer Architecture Search | CVPR'22 | Tencent & Xiamen University
LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models | AutoML Conference 2022 (Workshop Track) | MSR
Searching the Search Space of Vision Transformer | NeurIPS'21 | MSRA, Stony Brook University
UniNet: Unified Architecture Search with Convolutions, Transformer and MLP | ECCV'22 | SenseTime
Analyzing and Mitigating Interference in Neural Architecture Search | ICML'22 | Tsinghua, MSR
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search | ICCV'21 | Sun Yat-sen University
Memory-Efficient Differentiable Transformer Architecture Search | ACL-IJCNLP'21 | MSR, Peking University
Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition | arxiv [Aug'20] | Google Research
AutoTrans: Automating Transformer Design via Reinforced Architecture Search | NLPCC'21 | Fudan University
NASABN: A Neural Architecture Search Framework for Attention-Based Networks | IJCNN'20 | Chinese Academy of Sciences
NAT: Neural Architecture Transformer for Accurate and Compact Architectures | NeurIPS'19 | Tencent AI
The Evolved Transformer | ICML'19 | Google Brain
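
In practice, most of the works above reduce Transformer search to choosing a handful of structural hyperparameters (depth, heads, widths) and ranking sampled candidates. The sketch below illustrates that loop with plain random search; the search space, the parameter-count proxy, and every name in it are hypothetical stand-ins, not code from any listed paper.

```python
# Illustrative sketch only: toy random search over Transformer
# hyperparameters. All names and the scoring stub are hypothetical.
import random

SEARCH_SPACE = {
    "num_layers": [6, 8, 10, 12],
    "num_heads": [4, 8, 12, 16],
    "embed_dim": [256, 384, 512, 768],
    "mlp_ratio": [2.0, 3.0, 4.0],
}

def sample_architecture(rng: random.Random) -> dict:
    """Draw one candidate Transformer configuration from the space."""
    return {name: rng.choice(choices) for name, choices in SEARCH_SPACE.items()}

def score(arch: dict) -> float:
    """Stand-in for a real evaluator (training-free proxy, supernet
    inference, or full training in the papers above)."""
    # Penalize approximate parameter count as a crude efficiency proxy.
    params = arch["num_layers"] * arch["embed_dim"] ** 2 * arch["mlp_ratio"]
    return -params

def random_search(n_trials: int = 100, seed: int = 0) -> dict:
    rng = random.Random(seed)
    candidates = (sample_architecture(rng) for _ in range(n_trials))
    return max(candidates, key=score)

if __name__ == "__main__":
    print(random_search())
```

Real methods differ mainly in how `score` is computed (zero-cost proxies, weight-sharing supernets, distillation) and in replacing random sampling with evolutionary or gradient-based search.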

Domain Specific Transformer Search

Vision

Title | Venue | Group
𝛼NAS: Neural Architecture Search using Property Guided Synthesis | ACM Programming Languages'22 | MIT, Google
NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training | ICLR'22 | Meta Reality Labs
AutoFormer: Searching Transformers for Visual Recognition | ICCV'21 | MSR
GLiT: Neural Architecture Search for Global and Local Image Transformer | ICCV'21 | University of Sydney
Searching for Efficient Multi-Stage Vision Transformers | ICCV'21 Workshop | MIT
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers | CVPR'21 | Bytedance Inc.

Natural Language Processing

Title | Venue | Group
AutoBERT-Zero: Evolving the BERT backbone from scratch | AAAI'22 | Huawei Noah's Ark Lab
Primer: Searching for Efficient Transformers for Language Modeling | NeurIPS'21 | Google
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models | ACL'21 | Tsinghua, Huawei Noah's Ark
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search | KDD'21 | MSR, Tsinghua University
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | ACL'20 | MIT

Automatic Speech Recognition

Title | Venue | Group
SFA: Searching faster architectures for end-to-end automatic speech recognition models | Computer Speech and Language'23 | Chinese Academy of Sciences
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search | ICASSP'21 | MSR
Efficient Gradient-Based Neural Architecture Search For End-to-End ASR | ICMI-MLMI'21 | NPU, Xi'an
Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition | INTERSPEECH'20 | VUNO Inc.

Transformers Knowledge: Insights, Searchable parameters, Attention

Title | Venue | Group
RWKV: Reinventing RNNs for the Transformer Era | arxiv [May'23] | EleutherAI
Patches Are All You Need? | TMLR'23 | CMU
Separable Self-Attention for Mobile Vision Transformers | TMLR'23 | Apple
Parameter-efficient Fine-tuning for Vision Transformers | AAAI'23 | MSR & UCSC
EfficientFormer: Vision Transformers at MobileNet Speed | NeurIPS'22 | Snap Inc
Neighborhood Attention Transformer | CVPR'23 | Meta AI
Training Compute Optimal Large Language Models | NeurIPS'22 | DeepMind
CMT: Convolutional Neural Networks meet Vision Transformers | CVPR'22 | Huawei Noah's Ark Lab
Patch Slimming for Efficient Vision Transformers | CVPR'22 | Huawei Noah's Ark Lab
Lite Vision Transformer with Enhanced Self-Attention | CVPR'22 | Johns Hopkins University, Adobe
TubeDETR: Spatio-Temporal Video Grounding with Transformers | CVPR'22 (Oral) | CNRS & Inria
Beyond Fixation: Dynamic Window Visual Transformer | CVPR'22 | UT Sydney & RMIT University
BEiT: BERT Pre-Training of Image Transformers | ICLR'22 (Oral) | MSR
How Do Vision Transformers Work? | ICLR'22 (Spotlight) | NAVER AI
Scale Efficiently: Insights from Pretraining and FineTuning Transformers | ICLR'22 | Google Research
Tuformer: Data-Driven Design of Expressive Transformer by Tucker Tensor Representation | ICLR'22 | University of Maryland
DictFormer: Tiny Transformer with Shared Dictionary | ICLR'22 | Samsung Research
QuadTree Attention for Vision Transformers | ICLR'22 | Alibaba AI Lab
Expediting Vision Transformers via Token Reorganization | ICLR'22 (Spotlight) | UC San Diego & Tencent AI Lab
UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning | ICLR'22 | SIAT-SenseTime
Hierarchical Transformers Are More Efficient Language Models | NAACL'22 | Google Research, University of Warsaw
Transformer in Transformer | NeurIPS'21 | Huawei Noah's Ark
Long-Short Transformer: Efficient Transformers for Language and Vision | NeurIPS'21 | NVIDIA
Memory-efficient Transformers via Top-k Attention | EMNLP Workshop '21 | Allen AI
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | ICCV'21 (Best Paper) | MSR
Rethinking Spatial Dimensions of Vision Transformers | ICCV'21 | NAVER AI
What Makes for Hierarchical Vision Transformers | arxiv [Sept'21] | HUST
AutoAttend: Automated Attention Representation Search | ICML'21 | Tsinghua University
Rethinking Attention with Performers | ICLR'21 (Oral) | Google
LambdaNetworks: Modeling long-range Interactions without Attention | ICLR'21 | Google Research
HyperGrid Transformers | ICLR'21 | Google Research
LocalViT: Bringing Locality to Vision Transformers | arxiv [April'21] | ETH Zurich
Compressive Transformers for Long Range Sequence Modelling | ICLR'20 | DeepMind
Improving Transformer Models by Reordering their Sublayers | ACL'20 | FAIR, Allen AI
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned | ACL'19 | Yandex
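
A recurring insight in this category (e.g., the ACL'19 head-pruning study above) is that individual attention heads can be gated off with little loss, which is also what makes the number of heads a searchable parameter. Below is a minimal NumPy sketch of per-head gating; the shapes, names, and gating scheme are illustrative assumptions, not any paper's implementation.

```python
# Illustrative sketch only: multi-head self-attention with a per-head
# gate, the mechanism behind head-pruning analyses. Hypothetical shapes.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def gated_mha(x, wq, wk, wv, head_gate):
    """x: (seq, dim); wq/wk/wv: (heads, dim, d_head); head_gate: (heads,) in {0, 1}."""
    heads = []
    for h in range(wq.shape[0]):
        q, k, v = x @ wq[h], x @ wk[h], x @ wv[h]
        attn = softmax(q @ k.T / np.sqrt(k.shape[-1]))
        heads.append(head_gate[h] * (attn @ v))  # gate zeroes out pruned heads
    return np.concatenate(heads, axis=-1)

rng = np.random.default_rng(0)
seq, dim, n_heads, d_head = 5, 16, 4, 4
x = rng.normal(size=(seq, dim))
wq, wk, wv = (rng.normal(size=(n_heads, dim, d_head)) for _ in range(3))
gate = np.array([1, 0, 1, 0])  # "prune" half the heads
print(gated_mha(x, wq, wk, wv, gate).shape)  # (5, 16)
```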

Transformer Surveys

Title | Venue | Group
Transformers in Vision: A Survey | ACM Computing Surveys'22 | MBZ University of AI
A Survey of Vision Transformers | TPAMI'22 | CAS
Efficient Transformers: A Survey | ACM Computing Surveys'22 | Google Research
Neural Architecture Search for Transformers: A Survey | IEEE Xplore [Sep'22] | Iowa State University

Foundation Models

Title | Venue | Group
Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models | arxiv'23 | Amazon Alexa AI

Misc Resources

More Repositories

1. auto-sklearn: Automated Machine Learning with scikit-learn (Python, 7,389 stars)
2. Auto-PyTorch: Automatic architecture search and hyperparameter optimization for PyTorch (Python, 2,271 stars)
3. TabPFN: Official implementation of the TabPFN paper (https://arxiv.org/abs/2207.01848) and the tabpfn package (Python, 1,102 stars)
4. SMAC3: A versatile Bayesian optimization package for hyperparameter optimization (Python, 1,003 stars)
5. HpBandSter: A distributed Hyperband implementation on steroids (Python, 603 stars)
6. NASLib: A Neural Architecture Search (NAS) library facilitating NAS research by providing interfaces to several state-of-the-art NAS search spaces and optimizers (Python, 497 stars)
7. RoBO: A robust Bayesian optimization framework (Python, 479 stars)
8. autoweka: Auto-WEKA (Java, 326 stars)
9. ConfigSpace: Domain-specific language for configuration spaces in Python/Cython; useful for hyperparameter optimization and algorithm configuration (Python, 186 stars)
10. HPOlib: A hyperparameter optimization library providing a common interface to three state-of-the-art hyperparameter optimization packages: SMAC, Spearmint, and Hyperopt. This package is discontinued (Python, 167 stars)
11. TransformersCanDoBayesianInference: Official implementation of "Transformers Can Do Bayesian Inference", the PFN paper (Python, 162 stars)
12. RobustDARTS: Understanding and Robustifying DARTS (Python, 153 stars)
13. trivialaugment: Official implementation of TrivialAugment and a mini-library for applying multiple image augmentation strategies, including RandAugment and TrivialAugment (Python, 137 stars)
14. pybnn: Bayesian neural network package (Jupyter Notebook, 131 stars)
15. HPOBench: Collection of hyperparameter optimization benchmark problems (Python, 125 stars)
16. CARL: Benchmarking RL generalization in an interpretable way (Python, 120 stars)
17. CAAFE: Semi-automatic feature engineering using language models and your dataset descriptions; based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023) (Python, 100 stars)
18. nas_benchmarks (Python, 91 stars)
19. ParameterImportance: Parameter importance analysis tool (Python, 75 stars)
20. nasbench301 (Python, 71 stars)
21. HPOlib1.5 (Python, 70 stars)
22. nasbench-1shot1 (Python, 68 stars)
23. BOAH: Bayesian Optimization & Analysis of Hyperparameters (Python, 68 stars)
24. DEHB (Python, 66 stars)
25. DeepCAVE: An interactive framework to visualize and analyze your AutoML process in real time (Python, 63 stars)
26. labwatch: An extension to Sacred for automated hyperparameter optimization (Python, 60 stars)
27. learna: End-to-end RNA design using deep reinforcement learning (Python, 55 stars)
28. amltk: A build-it-yourself AutoML framework (Python, 53 stars)
29. CAVE: [deprecated] Configuration Assessment, Visualization and Evaluation (Python, 45 stars)
30. zero-shot-automl-with-pretrained-models: Official repository for the paper "Zero-Shot AutoML with Pretrained Models" (Python, 41 stars)
31. random_forest_run (C++, 36 stars)
32. neps: Neural Pipeline Search (NePS) helps deep learning experts find the best neural pipeline (Python, 35 stars)
33. SEARL: Sample-Efficient Automated Deep Reinforcement Learning (Python, 34 stars)
34. AutoFolio: Automated algorithm selection with hyperparameter optimization (Python, 34 stars)
35. LCBench: A learning curve benchmark on OpenML data (Jupyter Notebook, 29 stars)
36. nes: Neural Ensemble Search for Uncertainty Estimation and Dataset Shift (Python, 29 stars)
37. DACBench: A benchmark library for dynamic algorithm configuration (PDDL, 26 stars)
38. mdp-playground: A Python package to design and debug RL agents (Python, 24 stars)
39. PFNs: Our maintained PFN repository; come here to train SOTA PFNs (Python, 23 stars)
40. multi-obj-baselines (Python, 22 stars)
41. RNAformer: Scalable deep learning for RNA secondary structure prediction (Python, 22 stars)
42. auto-sklearn-talks: Presentations on auto-sklearn (Jupyter Notebook, 22 stars)
43. learning_environments (Python, 20 stars)
44. DAC: Dynamic Algorithm Configuration (Jupyter Notebook, 20 stars)
45. DE-NAS (Jupyter Notebook, 19 stars)
46. nas-bench-x11 (Python, 18 stars)
47. pynisher (Python, 18 stars)
48. PFNs4BO: The official implementation of PFNs4BO: In-Context Learning for Bayesian Optimization (Jupyter Notebook, 16 stars)
49. ProbTransformer: Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design (Python, 16 stars)
50. TempoRL (Python, 15 stars)
51. jahs_bench_201: The first collection of surrogate benchmarks for joint architecture and hyperparameter search (Python, 15 stars)
52. tabpfn-client (Python, 14 stars)
53. HPO_for_RL: Code reproducing the results of the paper "On the Importance of Hyperparameter Optimization for Model-Based Reinforcement Learning" (Python, 14 stars)
54. Squirrel-Optimizer-BBO-NeurIPS20-automlorg (Python, 13 stars)
55. hierarchical_nas_construction: Official repository for "Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars" (NeurIPS 2023) (Python, 12 stars)
56. GenericWrapper4AC (C++, 12 stars)
57. SVGe: Smooth Variational Graph Embeddings for Efficient Neural Architecture Search (Python, 12 stars)
58. ChaLearn_Automatic_Machine_Learning_Challenge_2015 (Python, 11 stars)
59. transfer-hpo-framework: Code accompanying https://arxiv.org/abs/1802.02219 (Python, 11 stars)
60. ASKL2.0_experiments (Jupyter Notebook, 9 stars)
61. EfficientNAS (Python, 9 stars)
62. TabularTempoRL: Code for the paper "Towards TempoRL: Learning When to Act" (Python, 8 stars)
63. LTO-CMA: Code for the paper "Learning Step-Size Adaptation in CMA-ES" (Python, 8 stars)
64. HPOlibConfigSpace (Python, 8 stars)
65. paramsklearn (Python, 8 stars)
66. multibeep: A multi-armed bandit library written in C++ with Python bindings (C, 8 stars)
67. HPOBenchExperimentUtils: Experiment code to run large-scale experiments with HPOBench (Python, 7 stars)
68. lcpfn (Python, 7 stars)
69. mf-prior-bench: A collection of multi-fidelity benchmarks with first-class support for user priors (Python, 6 stars)
70. HPOlib-hpconvnet: A wrapper for James Bergstra's hyperopt convnet (Python, 5 stars)
71. dac4automlcomp: DAC4AutoML competition (HTML, 5 stars)
72. automl_common: Shared utilities that AutoML frameworks may benefit from (Python, 5 stars)
73. IMFAS: Implicit Multi-Fidelity Algorithm Selection (Python, 5 stars)
74. ParameterConfigSpace: Parameter configuration space parser for the SMAC format (Python, 5 stars)
75. DAC4SGD (Python, 5 stars)
76. automl_template: A template providing the tools to ensure the same project setup across all AutoML packages (Python, 5 stars)
77. AutoRL-Landscape (Python, 4 stars)
78. SAWEI (Jupyter Notebook, 4 stars)
79. SPaCE (Jupyter Notebook, 4 stars)
80. masif: MASIF: Meta-learned Algorithm Selection using Implicit Fidelity Information (Python, 4 stars)
81. hydra-smac-sweeper: Sweeper plugin based on SMAC for Hydra (Python, 4 stars)
82. DAC4RL: DAC4RL track of the DAC4AutoML competition at AutoML Conf (Python, 4 stars)
83. BO-AFS: For BO: select the acquisition function (schedule) with a meta-learned per-run model (Jupyter Notebook, 4 stars)
84. HPOlib-AutoWEKA (Python, 3 stars)
85. AutoDLComp19: AutoDL competition scripts 2019 (Python, 3 stars)
86. mf-prior-exp (Python, 3 stars)
87. ICGen: Image classification dataset generator (Python, 3 stars)
88. 2022_JAIR_DAC_experiments (Python, 2 stars)
89. HPOlib-hpnnet (Python, 2 stars)
90. SPaCE_BIG: Code for the experiments in "Towards Self-Paced Context Evaluation for Contextual Reinforcement Learning" (Python, 2 stars)
91. plotting_scripts (Python, 2 stars)
92. DontWasteYourTime-early-stopping: Experiments for pipelines (Python, 2 stars)
93. pi_is_back: Repo for "PI is back! Switching Acquisition Functions in Bayesian Optimization" (NeurIPS Gaussian Process Workshop '22) (Python, 2 stars)
94. automl_sphinx_theme: Write easy documentation with the AutoML Sphinx theme; no Sphinx knowledge necessary. See the documentation for a preview (Python, 2 stars)
95. SAFS: Repository for Sparse Activation Function Search (Python, 2 stars)
96. bibtex-cleaner (Python, 2 stars)
97. AutomlCup2023: Code for the AutoML Cup 2023 (Python, 2 stars)
98. naslib-fall-school: Repository for the NASLib hands-on session at the AutoML Fall School 2022 (2 stars)
99. hydra_tutorial: AutoML Fall School 23 (Jupyter Notebook, 2 stars)
100. autorl-org: The AutoRL.org site (Ruby, 2 stars)