• Stars
    star
    254
  • Rank 160,264 (Top 4 %)
  • Language
  • Created about 3 years ago
  • Updated over 1 year ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A curated list of awesome resources combining Transformers with Neural Architecture Search

Awesome Transformer Architecture Search: Awesome

To keep track of the large number of recent papers that look at the intersection of Transformers and Neural Architecture Search (NAS), we have created this awesome list of curated papers and resources, inspired by awesome-autodl, awesome-architecture-search, and awesome-computer-vision. Papers are divided into the following categories:

  1. General Transformer search
  2. Domain Specific, applied Transformer search (divided into NLP, Vision, ASR)
  3. Transformers Knowledge: Insights / Searchable parameters / Attention
  4. Transformer Surveys
  5. Foundation Models
  6. Misc Resources

This repository is maintained by Yash Mehta, please feel free to reach out, create pull requests or open an issue to add papers. Please see this Google Doc for a comprehensive list of papers at ICML 2023 on foundation models/large language models.

General Transformer Search

Title Venue Group
Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models NeurIPS'22 MSR
Training Free Transformer Architecture Search CVPR'22 Tencent & Xiamen University
LiteTransformerSearch: Training-free On-device Search for Efficient Autoregressive Language Models AutoML Conference 2022 Workshop Track MSR
Searching the Search Space of Vision Transformer NeurIPS'21 MSRA, Stony Brook University
UniNet: Unified Architecture Search with Convolutions, Transformer and MLP ECCV'22 SenseTime
Analyzing and Mitigating Interference in Neural Architecture Search ICML'22 Tsinghua, MSR
BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search ICCV'21 Sun Yat-sen University
Memory-Efficient Differentiable Transformer Architecture Search ACL-IJCNLP'21 MSR, Peking University
Finding Fast Transformers: One-Shot Neural Architecture Search by Component Composition arxiv [Aug'20] Google Research
AutoTrans: Automating Transformer Design via Reinforced Architecture Search NLPCC'21 Fudan University
NASABN: A Neural Architecture Search Framework for Attention-Based Networks IJCNN'20 Chinese Academy of Sciences
NAT: Neural Architecture Transformer for Accurate and Compact Architectures NeurIPS'19 Tencent AI
The Evolved Transformer ICML'19 Google Brain

Domain Specific Transformer Search

Vision

Title Venue Group
𝛼NAS: Neural Architecture Search using Property Guided Synthesis ACM Programming Languages'22 MIT, Google
NASViT: Neural Architecture Search for Efficient Vision Transformers with Gradient Conflict aware Supernet Training ICLR'22 Meta Reality Labs
AutoFormer: Searching Transformers for Visual Recognition ICCV'21 MSR
GLiT: Neural Architecture Search for Global and Local Image Transformer ICCV'21 University of Sydney
Searching for Efficient Multi-Stage Vision Transformers ICCV'21 workshop MIT
HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers CVPR'21 Bytedance Inc.

Natural Language Processing

Title Venue Group
AutoBERT-Zero: Evolving the BERT backbone from scratch AAAI'22 Huawei Noah’s Ark Lab
Primer: Searching for Efficient Transformers for Language Modeling NeurIPS'21 Google
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models ACL'21 Tsinghua, Huawei Naoh's Ark
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search KDD'21 MSR, Tsinghua University
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing ACL'20 MIT

Automatic Speech Recognition

Title Venue Group
SFA: Searching faster architectures for end-to-end automatic speech recognition models Computer Speech and Language'23 Chinese Academy of Sciences
LightSpeech: Lightweight and Fast Text to Speech with Neural Architecture Search ICASSP'21 MSR
Efficient Gradient-Based Neural Architecture Search For End-to-End ASR ICMI-MLMI'21 NPU, Xi'an
Evolved Speech-Transformer: Applying Neural Architecture Search to End-to-End Automatic Speech Recognition INTERSPEECH'20 VUNO Inc.

Transformers Knowledge: Insights, Searchable parameters, Attention

Title Venue Group
RWKV: Reinventing RNNs for the Transformer Era arxiv [May'23] EleutherAI
Patches are All You Need ? TMLR'23 CMU
Seperable Self Attention for Mobile Vision Transformers TMLR'23 Apple
Parameter-efficient Fine-tuning for Vision Transformers AAAI'23 MSR & UCSC
EfficientFormer: Vision Transformers at MobileNet Speed NeurIPS'22 Snap Inc
Neighborhood Attention Transformer CVPR'23 Meta AI
Training Compute Optimal Large Language Models NeurIPS'22 DeepMind
CMT: Convolutional Neural Networks meet Vision Transformers CVPR'22 Huawei Noah’s Ark Lab
Patch Slimming for Efficient Vision Transformers CVPR'22 Huawei Noah’s Ark Lab
Lite Vision Transformer with Enhanced Self-Attention CVPR'22 Johns Hopkins University, Adobe
TubeDETR: Spatio-Temporal Video Grounding with Transformers CVPR'22 (Oral) CNRS & Inria
Beyond Fixation: Dynamic Window Visual Transformer CVPR'22 UT Sydney & RMIT University
BEiT: BERT Pre-Training of Image Transformers ICLR'22 (Oral) MSR
How Do Vision Transformers Work? ICLR'22 (Spotlight) NAVER AI
Scale Efficiently: Insights from Pretraining and FineTuning Transformers ICLR'22 Google Research
Tuformer: Data-Driven Design of Expressive Transformer by Tucker Tensor Representation ICLR'22 UoMaryland
DictFormer: Tiny Transformer with Shared Dictionary ICLR'22 Samsung Research
QuadTree Attention for Vision Transformers ICLR'22 Alibaba AI Lab
Expediting Vision Transformers via Token Reorganization ICLR'22 (Spotlight) UC San Diego & Tencent AI Lab
UniFormer: Unified Transformer for Efficient Spatial-Temporal Representation Learning ICLR'22 SIAT-SenseTime
Hierarchical Transformers Are More Efficient Language Models NAACL'22 Google Research, UoWarsaw
Transformer in Transformer NeurIPS'21 Huawei Noah's Ark
Long-Short Transformer: Efficient Transformers for Language and Vision NeurIPS'21 NVIDIA
Memory-efficient Transformers via Top-k Attention EMNLP Workshop '21 Allen AI
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows ICCV'21 best paper MSR
Rethinking Spatial Dimensions of Vision Transformers ICCV'21 NAVER AI
What makes for hierarchical vision transformers arxiv [Sept'21] HUST
AutoAttend: Automated Attention Representation Search ICML'21 Tsinghua University
Rethinking Attention with Performers ICLR'21 Oral Google
LambdaNetworks: Modeling long-range Interactions without Attention ICLR'21 Google Research
HyperGrid Transformers ICLR'21 Google Research
LocalViT: Bringing Locality to Vision Transformers arxiv [April'21] ETH Zurich
Compressive Transformers for Long Range Sequence Modelling ICLR'20 DeepMind
Improving Transformer Models by Reordering their Sublayers ACL'20 FAIR, Allen AI
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned ACL'19 Yandex

Transformer Surveys

Title Venue Group
Transformers in Vision: A Survey ACM Computing Surveys'22 MBZ University of AI
A Survey of Vision Transformers TPAMI'22 CAS
Efficient Transformers: A Survey ACM Computing Surveys'22 Google Research
Neural Architecture Search for Transformers: A Survey IEEE xplore [Sep'22] Iowa State Uni

Foundation Models

Title Venue Group
Neural Architecture Search for Parameter-Efficient Fine-tuning of Large Pre-trained Language Models arxiv'23 Amazon Alexa AI

Misc resources

More Repositories

1

auto-sklearn

Automated Machine Learning with scikit-learn
Python
7,574
star
2

Auto-PyTorch

Automatic architecture search and hyperparameter optimization for PyTorch
Python
2,358
star
3

TabPFN

Official implementation of the TabPFN paper (https://arxiv.org/abs/2207.01848) and the tabpfn package.
Python
1,187
star
4

SMAC3

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization
Python
1,067
star
5

HpBandSter

a distributed Hyperband implementation on Steroids
Python
609
star
6

NASLib

NASLib is a Neural Architecture Search (NAS) library for facilitating NAS research for the community by providing interfaces to several state-of-the-art NAS search spaces and optimizers.
Python
521
star
7

RoBO

RoBO: a Robust Bayesian Optimization framework
Python
480
star
8

autoweka

Auto-WEKA
Java
330
star
9

ConfigSpace

Domain specific language for configuration spaces in Python. Useful for hyperparameter optimization and algorithm configuration.
Python
202
star
10

TransformersCanDoBayesianInference

Official Implementation of "Transformers Can Do Bayesian Inference", the PFN paper
Python
183
star
11

HPOlib

HPOlib is a hyperparameter optimization library. It provides a common interface to three state of the art hyperparameter optimization packages: SMAC, spearmint and hyperopt. This package is discontinued, please read the longer note in the info box below.
Python
167
star
12

RobustDARTS

Understanding and Robustifying DARTS
Python
153
star
13

trivialaugment

This is the official implementation of TrivialAugment and a mini-library for the application of multiple image augmentation strategies including RandAugment and TrivialAugment.
Python
147
star
14

HPOBench

Collection of hyperparameter optimization benchmark problems
Python
137
star
15

pybnn

Bayesian neural network package
Jupyter Notebook
135
star
16

CARL

Benchmarking RL generalization in an interpretable way.
Python
129
star
17

CAAFE

Semi-automatic feature engineering process using Language Models and your dataset descriptions. Based on the paper "LLMs for Semi-Automated Data Science: Introducing CAAFE for Context-Aware Automated Feature Engineering" by Hollmann, Müller, and Hutter (2023).
Python
124
star
18

nas_benchmarks

Python
91
star
19

ParameterImportance

Parameter Importance Analysis Tool
Python
75
star
20

nasbench301

Python
74
star
21

DEHB

Python
71
star
22

DeepCAVE

An interactive framework to visualize and analyze your AutoML process in real-time.
Python
70
star
23

HPOlib1.5

Python
69
star
24

nasbench-1shot1

Python
67
star
25

BOAH

BOAH: Bayesian Optimization & Analysis of Hyperparameters
Python
67
star
26

amltk

A build-it-yourself AutoML Framework
Python
62
star
27

labwatch

An extension to Sacred for automated hyperparameter optimization.
Python
59
star
28

learna

End-to-end RNA Design using deep reinforcement learning
Python
57
star
29

neps

Neural Pipeline Search (NePS): Helps deep learning experts find the best neural pipeline.
Python
54
star
30

CAVE

[deprecated] Configuration Assessment, Visualization and Evaluation
Python
46
star
31

PFNs

Our maintained PFN repository. Come here to train SOTA PFNs.
Python
44
star
32

zero-shot-automl-with-pretrained-models

Official repository for the paper "Zero-Shot AutoML with Pretrained Models"
Python
41
star
33

random_forest_run

C++
35
star
34

AutoFolio

Automated Algorithm Selection with Hyperparameter Optimization
Python
35
star
35

SEARL

Sample-Efficient Automated Deep Reinforcement Learning
Python
34
star
36

nes

Neural Ensemble Search for Uncertainty Estimation and Dataset Shift
Python
31
star
37

LCBench

A learning curve benchmark on OpenML data
Jupyter Notebook
29
star
38

DACBench

A benchmark library for Dynamic Algorithm Configuration.
PDDL
28
star
39

mdp-playground

A python package to design and debug RL agents.
Python
28
star
40

RNAformer

Scalable Deep Learning for RNA Secondary Structure Prediction
Python
27
star
41

auto-sklearn-talks

Presentations on Auto-sklearn
Jupyter Notebook
24
star
42

multi-obj-baselines

Python
22
star
43

PFNs4BO

The official implementation of PFNs4BO: In-Context Learning for Bayesian Optimization
Jupyter Notebook
22
star
44

CVPR24-MedSAM-on-Laptop

Data-aware Fine-Tuning (DAFT) Code related to the CPVR24 Competition for Image Segmentation on a Laptop.
Python
22
star
45

tabpfn-client

Python
21
star
46

learning_environments

Python
20
star
47

pynisher

Python
20
star
48

DE-NAS

Jupyter Notebook
19
star
49

DAC

Dynamic Algorithm Configuration
Jupyter Notebook
19
star
50

nas-bench-x11

Python
18
star
51

ProbTransformer

Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design
Python
17
star
52

Mighty

The Mighty cRL library you've been looking for! 💪
Python
17
star
53

TempoRL

Python
15
star
54

HPO_for_RL

This is the code of reproducing the results of our paper: On the importance of Hyperparameter Optimization for Model-based Reinforcement Learning
Python
15
star
55

hierarchical_nas_construction

Official repository for "Construction of Hierarchical Neural Architecture Search Spaces based on Context-free Grammars" (NeurIPS 2023)
Python
14
star
56

transfer-hpo-framework

Code accompanying https://arxiv.org/abs/1802.02219
Python
14
star
57

jahs_bench_201

The first collection of surrogate benchmarks for Joint Architecture and Hyperparameter Search.
Python
14
star
58

Squirrel-Optimizer-BBO-NeurIPS20-automlorg

Python
13
star
59

SVGe

Smooth Variational Graph Embeddings for Efficient Neural Architecture Search
Python
13
star
60

is_mamba_capable_of_icl

Jupyter Notebook
12
star
61

ChaLearn_Automatic_Machine_Learning_Challenge_2015

Python
11
star
62

GenericWrapper4AC

C++
11
star
63

HW-GPT-Bench

HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models
Python
11
star
64

QTT

A framework for QuickTune
Jupyter Notebook
11
star
65

ifBO

In-context Bayesian Optimization
Python
11
star
66

arlbench

HPO and Architecture Benchmarking for RL: Dynamically, Reactive and Efficient
Python
10
star
67

lcpfn

Jupyter Notebook
10
star
68

ASKL2.0_experiments

Jupyter Notebook
9
star
69

mf-prior-bench

A collection of multi-fidelity benchmarks with first class support for user priors
Python
8
star
70

LTO-CMA

Code for the paper "Learning Step-Size Adaptation in CMA-ES"
Python
8
star
71

EfficientNAS

Python
8
star
72

HPOlibConfigSpace

Python
8
star
73

paramsklearn

Python
8
star
74

multibeep

A Multi Armed Bandit library written in C++ with Python bindings
C
8
star
75

MODNAS

Official Repo for "Multi-objective Differentiable Neural Architecture Search"
Python
8
star
76

TabularTempoRL

Code for the paper "Towards TempoRL: Learning When to Act"
Python
8
star
77

CARP-S

A Framework for Comparing N Hyperparameter Optimizers on M Benchmarks.
Jupyter Notebook
8
star
78

HPOBenchExperimentUtils

Experiment code to run large-scale experimente with HPOBench
Python
7
star
79

hypersweeper

Hydra sweeper integration of our favorite optimization packages, utilizing ask-and-tell interfaces.
Python
6
star
80

HPOlib-hpconvnet

A wrapper for James Bergstras hyperopt convnet
Python
5
star
81

SPaCE

Jupyter Notebook
5
star
82

dac4automlcomp

DAC4AutoML Competition
HTML
5
star
83

IMFAS

Implicit Multi-Fidelity Algorithm Selection
Python
5
star
84

automl_common

This repository holds shared utilities that AutoML frameworks may benefit from.
Python
5
star
85

ParameterConfigSpace

parameter configuration space parser for SMAC format
Python
5
star
86

masif

MASIF: Meta-learned Algorithm Selection using Implicit Fidelity Information
Python
5
star
87

DAC4SGD

Python
5
star
88

automl_template

A template that provides all the tools to ensure the same project setup across all AutoML packages.
Python
5
star
89

AutoRL-Landscape

Python
4
star
90

SAWEI

Jupyter Notebook
4
star
91

hydra-smac-sweeper

Sweeper plugin based on SMAC for Hydra.
Python
4
star
92

DAC4RL

DAC4RL track of DAC4AutoML competition at AutoML Conf
Python
4
star
93

pi_is_back

Repo for "PI is back! Switching Acquisition Functions in Bayesian Optimization" (NeurIPS: Gaussian Process Workshop '22)
Python
4
star
94

BO-AFS

For BO: Select Acquisition Function (Schedule) with Meta-Learned Model Per-Run
Jupyter Notebook
4
star
95

HPOlib-AutoWEKA

Python
3
star
96

AutoDLComp19

AutoDL Competition Scripts 2019
Python
3
star
97

mf-prior-exp

Python
3
star
98

ICGen

Image Classification Dataset Generator
Python
3
star
99

AutomlCup2023

Code for the AutoMLCup 2023
Python
3
star
100

FOB

Python
3
star