Survey on End-To-End Machine Learning Automation

In this repository, we present the references mentioned in a comprehensive survey of state-of-the-art efforts to automate machine learning (AutoML), whether by fully automating the role of the data scientist or by using aiding tools that minimize the role of the human in the loop. First, we focus on the Combined Algorithm Selection and Hyperparameter Tuning (CASH) problem. In addition, we highlight research on automating the other steps of the full machine learning pipeline, from data understanding to model deployment. Furthermore, we provide comprehensive coverage of the various tools and frameworks that have been introduced in this domain.

Table of Contents & Organization:

This repository is organized into six sections:

Meta-Learning Techniques for AutoML search problem
Neural Architecture Search Problem
Hyper-Parameter Optimization
- Black Box Optimization
- Multi-Fidelity Optimization
  - Modeling Learning Curve
  - Bandit Based
AutoML Tools and Frameworks
Pre-Modeling and Post-Modeling Aiding Tools
- Pre-Modeling
- Post-Modeling
AutoML Challenges

Meta-Learning Techniques for AutoML search problem:

Meta-learning can be described as the process of learning from previous experience gained by applying various learning algorithms to different kinds of data, thereby reducing the time needed to learn new tasks.
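As a rough illustration of reusing previous experience, the sketch below warm-starts a search on a new dataset with the configurations that worked best on the most similar prior datasets. It is a minimal sketch, not the method of any paper listed here: the meta-features, the `PRIOR_TASKS` table, the configuration names, and the `warm_start_configs` helper are all hypothetical.

```python
import math

# Hypothetical meta-knowledge base: meta-features of previously seen datasets
# (log10 of #instances, log10 of #features, minority-class ratio) and the
# configuration that performed best on each. All values are made up.
PRIOR_TASKS = [
    {"meta": (3.0, 1.0, 0.50),
     "best_config": {"algorithm": "random_forest", "n_estimators": 200}},
    {"meta": (5.0, 2.0, 0.10),
     "best_config": {"algorithm": "gradient_boosting", "learning_rate": 0.05}},
    {"meta": (2.0, 3.0, 0.45),
     "best_config": {"algorithm": "linear_svm", "C": 0.1}},
]


def warm_start_configs(new_meta, prior_tasks, k=2):
    """Return the best configurations of the k prior tasks closest to new_meta."""
    # Euclidean distance in meta-feature space decides which tasks are "similar".
    ranked = sorted(prior_tasks, key=lambda task: math.dist(task["meta"], new_meta))
    return [task["best_config"] for task in ranked[:k]]


if __name__ == "__main__":
    # A new dataset with roughly 1,500 rows, 12 features, and balanced classes.
    for config in warm_start_configs((3.2, 1.1, 0.48), PRIOR_TASKS):
        print(config)
```

The returned configurations would be evaluated first on the new task, before any further search, which is where the saving in time is expected to come from.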

2018 | Meta-Learning: A Survey. | Vanschoren | CoRR | PDF
2008 | Metalearning: Applications to data mining | Brazdil et al. | Springer Science & Business Media | PDF

Learning From Model Evaluation

Surrogate Models
- 2018 | Scalable Gaussian process-based transfer surrogates for hyperparameter optimization. | Wistuba et al. | Journal of ML | PDF
Warm-Started Multi-task Learning
- 2017 | Multiple adaptive Bayesian linear regression for scalable Bayesian optimization with warm start. | Perrone et al. | PDF
Relative Landmarks
- 2001 | An evaluation of landmarking variants. | Furnkranz and Petrak | ECML/PKDD | PDF

Learning From Task Properties

Using Meta-Features
- 2019 | SmartML: A Meta Learning-Based Framework for Automated Selection and Hyperparameter Tuning for Machine Learning Algorithms. | Maher and Sakr | EDBT | PDF
- 2017 | On the predictive power of meta-features in OpenML. | Bilalli et al. | IJAMC | PDF
- 2013 | Collaborative hyperparameter tuning. | Bardenet et al. | ICML | PDF
Using Meta-Models
- 2018 | Predicting hyperparameters from meta-features in binary classification problems. | Nisioti et al. | ICML | PDF
- 2014 | Automatic classifier selection for non-experts. | Reif et al. | Pattern Analysis and Applications | PDF
- 2012 | Imagenet classification with deep convolutional neural networks. | Krizhevsky et al. | NIPS | PDF
- 2008 | Predicting the performance of learning algorithms using support vector machines as meta-regressors. | Guerra et al. | ICANN | PDF
- 2008 | Metalearning-a tutorial. | Giraud-Carrier | ICMLA | PDF
- 2004 | Metalearning: Applications to data mining. | Soares et al. | Springer Science & Business Media | PDF
- 2004 | Selection of time series forecasting models based on performance information. | dos Santos et al. | HIS | PDF
- 2003 | Ranking learning algorithms: Using IBL and meta-learning on accuracy and time results. | Brazdil et al. | Journal of ML | PDF
- 2002 | Combination of task description strategies and case base properties for meta-learning. | Kopf and Iglezakis | PDF

Learning From Prior Models

Transfer Learning
- 2014 | How transferable are features in deep neural networks? | Yosinski et al. | NIPS | PDF
- 2014 | CNN features off-the-shelf: an astounding baseline for recognition. | Sharif Razavian et al. | IEEE CVPR | PDF
- 2014 | Decaf: A deep convolutional activation feature for generic visual recognition. | Donahue et al. | ICML | PDF
- 2012 | Imagenet classification with deep convolutional neural networks. | Krizhevsky et al. | NIPS | PDF
- 2012 | Deep learning of representations for unsupervised and transfer learning. | Bengio | ICML | PDF
- 2010 | A survey on transfer learning. | Pan and Yang | IEEE TKDE | PDF
- 1995 | Learning many related tasks at the same time with backpropagation. | Caruana | NIPS | PDF
- 1995 | Learning internal representations. | Baxter | PDF
Few-Shot Learning
- 2017 | Prototypical networks for few-shot learning. | Snell et al. | NIPS | PDF
- 2017 | Meta-Learning: A Survey. | Vanschoren | CoRR | PDF
- 2016 | Optimization as a model for few-shot learning. | Ravi and Larochelle | PDF

Neural Architecture Search Problem

Neural Architecture Search (NAS) is a fundamental step in automating the machine learning process and has been successfully used to design the model architecture for image and language tasks.

2018 | Progressive neural architecture search. | Liu et al. | ECCV | PDF
2018 | Efficient architecture search by network transformation. | Cai et al. | AAAI | PDF
2018 | Learning transferable architectures for scalable image recognition. | Zoph et al. | IEEE CVPR | PDF
2017 | Hierarchical representations for efficient architecture search. | Liu et al. | PDF
2016 | Neural architecture search with reinforcement learning. | Zoph and Le | PDF
2009 | Learning deep architectures for AI. | Bengio et al. | PDF

Random Search
- 2019 | Random Search and Reproducibility for Neural Architecture Search. | Li and Talwalkar | PDF
- 2017 | Train Longer, Generalize Better: Closing the Generalization Gap in Large Batch Training of Neural Networks. | Hoffer et al. | NIPS | PDF
Reinforcement Learning
- 2019 | Neural architecture search with reinforcement learning. | Zoph and Le | PDF
- 2019 | Designing neural network architectures using reinforcement learning. | Baker et al. | PDF
Evolutionary Methods
- 2019 | Evolutionary Neural AutoML for Deep Learning. | Liang et al. | PDF
- 2019 | Evolving deep neural networks. | Miikkulainen et al. | PDF
- 2018 | NSGA-Net: A multi-objective genetic algorithm for neural architecture search. | Lu et al. | PDF
- 2018 | Efficient multi-objective neural architecture search via lamarckian evolution. | Elsken et al. | PDF
- 2018 | Regularized evolution for image classifier architecture search. | Real et al. | PDF
- 2017 | Large-scale evolution of image classifiers | Real et al. | ICML | PDF
- 2017 | Hierarchical representations for efficient architecture search. | Liu et al. | PDF
- 2009 | A hypercube-based encoding for evolving large-scale neural networks. | Stanley et al. | Artificial Life | PDF
- 2002 | Evolving neural networks through augmenting topologies. | Stanley and Miikkulainen | Evolutionary Computation | PDF
Gradient Based Methods
- 2018 | Differentiable neural network architecture search. | Shin et al. | PDF
- 2018 | Darts: Differentiable architecture search. | Liu et al. | PDF
- 2018 | MaskConnect: Connectivity Learning by Gradient Descent. | Ahmed and Torresani | PDF
Bayesian Optimization
- 2018 | Towards reproducible neural architecture and hyperparameter search. | Klein et al. | PDF
- 2018 | Neural Architecture Search with Bayesian Optimisation and Optimal Transport | Kandasamy et al. | NIPS | PDF
- 2016 | Towards automatically-tuned neural networks. | Mendoza et al. | PMLR | PDF
- 2015 | Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. | Domhan et al. | IJCAI | PDF
- 2014 | Raiders of the lost architecture: Kernels for Bayesian optimization in conditional parameter spaces. | Swersky et al. | PDF
- 2013 | Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. | Bergstra et al. | PDF Github (Hyperopt)
- 2011 | Algorithms for hyper-parameter optimization. | Bergstra et al. | NIPS | PDF

Hyper-Parameter Optimization

After choosing the model pipeline algorithm(s) with the highest potential for top performance on the input dataset, the next step is to tune the hyper-parameters of that model to further optimize its performance. It is worth mentioning that some tools discretize the space of learning algorithms into a finite number of model pipelines, so model selection itself can be treated as a categorical parameter that must be tuned first, before the hyper-parameters of the selected model.
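Treating the algorithm choice as a categorical parameter can be sketched as a joint search over algorithms and their conditional hyper-parameters. The example below is a minimal sketch using plain random search. The search space, the algorithm names, and `toy_loss` (a stand-in for a cross-validated loss) are hypothetical and do not reproduce any tool listed in this repository.

```python
import math
import random

# Hypothetical joint search space: the algorithm is a categorical choice, and
# each algorithm carries only the hyper-parameters that apply to it.
SEARCH_SPACE = {
    "knn": {"n_neighbors": lambda rng: rng.randint(1, 30)},
    "svm": {"C": lambda rng: 10 ** rng.uniform(-3, 3)},
    "random_forest": {"n_estimators": lambda rng: rng.randint(10, 500)},
}


def sample_configuration(rng):
    """Draw the algorithm first, then its conditional hyper-parameters."""
    algorithm = rng.choice(sorted(SEARCH_SPACE))
    params = {name: draw(rng) for name, draw in SEARCH_SPACE[algorithm].items()}
    return {"algorithm": algorithm, **params}


def random_search(evaluate, n_trials=50, seed=0):
    """Return the (config, loss) pair with the lowest loss over n_trials samples."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        config = sample_configuration(rng)
        loss = evaluate(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss


def toy_loss(config):
    """Made-up loss surface; a real system would train and validate a model here."""
    if config["algorithm"] == "knn":
        return 0.30 + abs(config["n_neighbors"] - 7) / 100
    if config["algorithm"] == "svm":
        return 0.25 + abs(math.log10(config["C"])) / 10
    return 0.20 + abs(config["n_estimators"] - 300) / 1000


if __name__ == "__main__":
    config, loss = random_search(toy_loss, n_trials=200)
    print(config, round(loss, 3))
```

Random search is used here only because it is the simplest optimizer to show; the Bayesian and bandit-based methods referenced below would replace the sampling loop while keeping the same conditional search space.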

Black Box Optimization

Grid and Random Search
- 2017 | Design and analysis of experiments. | Montgomery | PDF
- 2015 | Adaptive control processes: a guided tour. | Bellman | PDF
- 2012 | Random search for hyper-parameter optimization. | Bergstra and Bengio | JMLR | PDF
Bayesian Optimization
- 2018 | Bohb: Robust and efficient hyperparameter optimization at scale. | Falkner et al. | JMLR | PDF
- 2017 | On the state of the art of evaluation in neural language models. | Melis et al. | PDF
- 2015 | Automating model search for large scale machine learning. | Sparks et al. | ACM-SCC | PDF
- 2015 | Scalable bayesian optimization using deep neural networks. | Snoek et al. | ICML | PDF
- 2014 | Bayesopt: A bayesian optimization library for nonlinear optimization, experimental design and bandits. | Martinez-Cantin | JMLR | PDF
- 2013 | Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures. | Bergstra et al. | PDF
- 2013 | Towards an empirical foundation for assessing bayesian optimization of hyperparameters. | Eggensperger et al. | NIPS | PDF
- 2013 | Improving deep neural networks for LVCSR using rectified linear units and dropout. | Dahl et al. | IEEE-ICASSP | PDF
- 2012 | Practical bayesian optimization of machine learning algorithms. | Snoek et al. | NIPS | PDF Github (Spearmint)
- 2011 | Sequential model-based optimization for general algorithm configuration. | Hutter et al. | LION | PDF Github
- 2011 | Algorithms for hyper-parameter optimization. | Bergstra et al. | NIPS | PDF
- 1998 | Efficient global optimization of expensive black-box functions. | Jones et al. | PDF
- 1978 | The application of Bayesian methods for seeking the extremum. | Mockus et al. | PDF
- 1975 | Single-step Bayesian search method for an extremum of functions of a single variable. | Zhilinskas | PDF
- 1964 | A new method of locating the maximum point of an arbitrary multipeak curve in the presence of noise. | Kushner | PDF
Simulated Annealing
- 1983 | Optimization by simulated annealing. | Kirkpatrick et al. | Science | PDF
Genetic Algorithms
- 1992 | Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence. | Holland et al. | PDF

Multi-Fidelity Optimization

2019 | Practical Multi-fidelity Bayesian Optimization for Hyperparameter Tuning. | Wu et al. | PDF
2019 | Multi-Fidelity Automatic Hyper-Parameter Tuning via Transfer Series Expansion. | Hu et al. | PDF
2016 | Review of multi-fidelity models. | Fernandez-Godino | PDF
2012 | Provably convergent multifidelity optimization algorithm not requiring high-fidelity derivatives. | March and Willcox | AIAA | PDF

Modeling Learning Curve
- 2017 | Learning curve prediction with Bayesian neural networks. | Klein et al. | ICLR | PDF
- 2015 | Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. | Domhan et al. | IJCAI | PDF
- 1998 | Efficient global optimization of expensive black-box functions. | Jones et al. | JGO | PDF
Bandit Based
- 2018 | Massively parallel hyperparameter tuning. | Li et al. | AISTATS | PDF
- 2016 | Non-stochastic Best Arm Identification and Hyperparameter Optimization. | Jamieson and Talwalkar | AISTATS | PDF
- 2016 | Hyperband: A novel bandit-based approach to hyperparameter optimization. | Li et al. | JMLR | PDF Github Github (Distributed Hyperband - BOHB)

AutoML Tools and Frameworks

Centralized Frameworks

| Name | Date | Language | Training Framework | Optimization Method | ML Tasks | Meta-Learning | UI | Open Source |
|---|---|---|---|---|---|---|---|---|
| `AutoWeka` | 2013 | Java | Weka | Bayesian Optimization | Single-label classification, regression | × | √ | `Github` 'Tool' |
| `HyperOpt-Sklearn` | 2014 | Python | Scikit-Learn | Bayesian Optimization, Simulated Annealing, and Random Search | Single-label classification, regression | × | × | `Github` |
| `AutoSklearn` | 2015 | Python | Scikit-Learn | Bayesian Optimization | Single-label classification, regression | √ | × | `Github` 'Tool' |
| `TPOT` | 2016 | Python | Scikit-Learn | Genetic Algorithm | Single-label classification, regression | × | × | `Github` |
| `Recipe` | 2017 | Python | Scikit-Learn | Grammar-Based Genetic Algorithm | Single-label classification | √ | × | `Github` |
| `Auto-Meka` | 2018 | Java | Meka | Grammar-Based Genetic Algorithm | Multi-label classification | √ | × | `Github` |
| `ML-Plan` | 2018 | Java | Weka / Scikit-Learn | Hierarchical Task Planning | Single-label classification | × | × | `Github` |
| `AutoStacker` | 2018 | - | - | Genetic Algorithm | Single-label classification | × | × | × |
| `PMF` | 2018 | Python | Scikit-Learn | Collaborative Filtering and Bayesian Optimization | Single-label classification | √ | × | `Github` |
| `AlphaD3M` | 2018 | - | - | Reinforcement Learning | Single-label classification, regression | √ | × | × |
| `SmartML` | 2019 | R | Different R Packages | Bayesian Optimization | Single-label classification | √ | √ | `Github` |
| `VDS` | 2019 | - | - | Cost-Based Multi-Armed Bandits and Bayesian Optimization | Single-label classification, regression, image classification, audio classification, graph matching | √ | √ | × |
| `OBOE` | 2019 | Python | Scikit-Learn | Collaborative Filtering | Single-label classification | √ | × | `Github` |
| `Auptimizer` | 2019 | - | - | Random, Grid, Hyperband, Hyperopt, Spearmint | Single-label classification | × | × | `Github` |
| `iSmartML` | 2019 | Python | Scikit-Learn | Bayesian Optimization | Single-label classification, regression | √ | √ | `Github` 'Tool' |

Distributed Frameworks

| Name | Date | Language | Training Framework | Optimization Method | Meta-Learning | UI | Open Source | PDF |
|---|---|---|---|---|---|---|---|---|
| MLBase | 2013 | Scala | Spark MLlib | Cost-Based Multi-Armed Bandits | × | × | × `Website` | `PDF` |
| ATM | 2017 | Python | Scikit-Learn | Hybrid Bayesian and Multi-Armed Bandits Optimization | √ | × | `Github` | `PDF` |
| MLBox | 2017 | Python | Scikit-Learn, Keras | Distributed Random Search and Tree-Parzen Estimators | × | × | `Github` | × |
| Rafiki | 2018 | Python | Scikit-Learn, TensorFlow | Distributed Random Search, Bayesian Optimization | × | √ | `Github` | `PDF` |
| TransmogrifAI | 2018 | Scala | SparkML | Bayesian Optimization and Random Search | × | × | `Github` `Website` | × |
| ATMSeer | 2019 | Python | Scikit-Learn on top of ATM | Hybrid Bayesian and Multi-Armed Bandits Optimization | √ | √ | `Github` | `PDF` |
| D-SmartML | 2019 | Scala | Spark MLlib | Grid Search, Random Search, Hyperband | √ | × | `Github` | × |
| Databricks | 2019 | Python | Spark MLlib | Hyperopt | × | √ | × `Website` | × |

Cloud-Based Frameworks
- Google AutoML | URL
- Azure AutoML | URL
- Amazon SageMaker | URL

NAS Frameworks

| Name | Date | Supported Architectures | Optimization Method | Supported Frameworks | UI | Open Source | PDF |
|---|---|---|---|---|---|---|---|
| AutoNet | 2016 | FCN | SMAC | PyTorch | × | `Github` | `PDF` |
| Auto-Keras | 2018 | No Restrictions | Network Morphism | Keras | √ | `Github` | `PDF` |
| ENAS | 2018 | CNN, RNN | Reinforcement Learning | TensorFlow | × | `Github` | `PDF` |
| NAO | 2018 | CNN, RNN | Gradient-Based Optimization | TensorFlow, PyTorch | × | `Github` | `PDF` |
| DARTS | 2019 | No Restrictions | Gradient-Based Optimization | PyTorch | × | `Github` | `PDF` |
| NNI | 2019 | No Restrictions | Random and Grid Search, Different Bayesian Optimizations, Annealing, Network Morphism, Hyperband, Naive Evolution | PyTorch, TensorFlow, Keras, Caffe2, CNTK, Chainer, Theano | √ | `Github` | × |

Pre-Modeling and Post-Modeling Aiding Tools

While current AutoML tools and frameworks have minimized the role of the data scientist in the modeling part and saved much effort, there are still several aspects that need human intervention and interpretability in order to make the correct decisions that can enhance and affect the modeling steps. These aspects belong to two main building blocks of the machine learning production pipeline: Pre-Modeling and Post-Modeling.

These two building blocks can help cover what current AutoML tools miss, and help data scientists do their job in an easier, more organized, and more informative way.

Pre-Modeling

Data Understanding
- Sanity Checking
  - 2017 | Controlling False Discoveries During Interactive Data Exploration. | Zhao et al. | SIGMOD | PDF
  - 2016 | Data Exploration with Zenvisage: An Expressive and Interactive Visual Analytics System. | Siddiqui et al. | VLDB | PDF | TOOL
  - 2015 | SEEDB: Efficient Data-Driven Visualization Recommendations to Support Visual Analytics. | Vartak et al. | PVLDB | PDF | TOOL
- Feature Based Analysis
  - 2016 | Visual Exploration of Machine Learning Results Using Data Cube Analysis. | Kahng et al. | HILDA | PDF
  - 2015 | Smart Drill-down: A New Data Exploration Operator. | Joglekar et al. | VLDB | PDF
- Data Life-Cycle Analysis
  - 2017 | Ground: A Data Context Service | Hellerstein et al. | CIDR | PDF | URL
  - 2016 | ProvDB: A System for Lifecycle Management of Collaborative Analysis Workflows. | Miao et al. | CoRR | PDF | Github
  - 2016 | Goods: Organizing Google’s Datasets. | Halevy et al. | SIGMOD | PDF
Data Validation
- Automatic Correction
  - 2017 | MacroBase: Prioritizing Attention in Fast Data. | Bailis et al. | SIGMOD | PDF | Github
  - 2015 | Data X-Ray: A Diagnostic Tool for Data Errors. | Wang et al. | SIGMOD | PDF
- Automatic Alerting
  - 2009 | On Approximating Optimum Repairs for Functional Dependency Violations. | Kolahi and Lakshmanan | ICDT | PDF
  - 2005 | A Cost-based Model and Effective Heuristic for Repairing Constraints by Value Modification. | Bohannon et al. | SIGMOD | PDF
Data Preparation
- Feature Addition
  - 2018 | Google Search Engine for Datasets | URL
  - 2014 | DataHub: Collaborative Data Science & Dataset Version Management at Scale. | Bhardwaj et al. | CoRR | PDF | URL
  - 2013 | OpenML: Networked Science in Machine Learning. | Vanschoren et al. | SIGKDD | PDF | URL
  - 2007 | UCI: Machine Learning Repository. | Dua, D. and Graff, C. | URL
- Feature Synthesis
  - 2015 | Deep feature synthesis: Towards automating data science endeavors. | Kanter and Veeramachaneni | DSAA | PDF | Github

Post-Modeling

2019 | Model Chimp | URL
2018 | ML-Flow | URL
2017 | Datmo | URL

AutoML Challenges

2019 | Third AutoML Challenge | URL
2018 | Second AutoML Challenge | URL
2017 | First AutoML Challenge | URL

Contribute:

To contribute additional references to this repository, follow these steps:

1. Create a branch in git and make your changes.
2. Push the branch to GitHub and open a pull request (PR).
3. Discuss the pull request.
4. We will review the request and merge it into the repository.

Citation:

For more details, please refer to our Survey Paper PDF

Radwa El-Shawi, Mohamed Maher, Sherif Sakr. Automated Machine Learning: State-of-The-Art and Open Challenges (2019).
