Automated machine learning (AutoML) is attracting more and more attention, and many frameworks exist that cover certain aspects of it. Many comparisons are made on “standardized” benchmarks. In this series I am going to test these frameworks on benchmarks as well as on lesser-known datasets, with minimal manual intervention. Furthermore, I am going to compare them with “manual machine learning” solutions, look under the hood, and explain how they work.
This site and its content are under reconstruction! (most links are broken)
Contents
- Introduction
- General Topics on AutoML
- Introduction to Auto-Keras Series
- Introduction to Auto-Sklearn series
- Introduction to TPOT series
- Introduction to Google AdaNet series
- Interesting publications and tutorials on automated machine learning
- Interesting software packages that are not (yet) covered by any of my blog posts
Introduction
I am going to limit the run time of AutoML frameworks to 1 hour (most of the time). There are three reasons for this:
- 1 hour is enough to see whether a framework moves in a useful direction; there is no need to spend 24+x hours to get an initial idea of how it performs. Furthermore, it highlights the difference between theory (pure mathematics, computational theory), which proves mathematically that the solution will be optimal given an infinite amount of time (I do not know about you, but I am restricted to a finite amount of time and have other things to do as well ;) ), and practical applications, where a framework should at least outperform a “stupid/brute-force” manual machine learning approach.
- Most of my examples use datasets from the physical world, which are much smaller than standard deep learning datasets, so 1 hour should be enough. Moreover, I want to understand how the frameworks perform outside of benchmark datasets that are (carefully?) selected for publications (to make the framework look good?).
- Most importantly: my “brute-force/stupid” approach consists of a very simple pipeline with very basic preprocessing that mainly tests a few machine learning algorithms with a small set of hyperparameters, tuned using scikit-learn’s GridSearchCV. In most cases it outperforms the neural networks and other algorithms described in the original publication of a dataset. This pipeline usually requires 15 - 30 minutes. (SVMs may cause extreme runtimes but are kicked out if I detect that.)
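For concreteness, the “brute-force/stupid” baseline described above can be sketched roughly as follows. The dataset, the candidate models, and the hyperparameter grids here are illustrative choices for the sketch, not the exact configuration used in the posts:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Any tabular dataset works; breast_cancer is just a stand-in here.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# A few standard algorithms, each with a small hyperparameter grid.
candidates = {
    "logreg": (
        LogisticRegression(max_iter=5000),
        {"clf__C": [0.1, 1.0, 10.0]},
    ),
    "rf": (
        RandomForestClassifier(random_state=42),
        {"clf__n_estimators": [100, 300], "clf__max_depth": [None, 10]},
    ),
}

best_score, best_model = -1.0, None
for name, (estimator, grid) in candidates.items():
    # Very basic preprocessing (scaling) plus the classifier.
    pipe = Pipeline([("scale", StandardScaler()), ("clf", estimator)])
    search = GridSearchCV(pipe, grid, cv=5, n_jobs=-1)
    search.fit(X_train, y_train)
    if search.best_score_ > best_score:
        best_score, best_model = search.best_score_, search.best_estimator_

print(f"best CV accuracy:  {best_score:.3f}")
print(f"test set accuracy: {best_model.score(X_test, y_test):.3f}")
```

Such a loop over a handful of models with tiny grids typically finishes in well under an hour on small tabular datasets, which is what makes it a useful lower bar for the AutoML frameworks.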
FYI: I am working on a book on AutoML covering practical aspects, which I hope to publish in Q4/2019.
General Topics on AutoML
- Automated Machine Learning (AutoML)
- Does AutoML (Automated Machine Learning) lead to better models and fulfill legal requirements?
Introduction to Auto-Keras Series
- Introduction to Auto-Keras
- Auto-Keras Toy example - MNIST
- Auto-Keras Toy example - Fashion-MNIST
- More Adventures with Auto-Keras (Please start reading here since many of the previously encountered problems have been solved.)
- Auto-Keras Toy example - CIFAR 10 and CIFAR 100
- Auto-Keras - 2D Image Augmentation Explained
- Auto-Keras - Constants Explained
- Auto-Keras Toy example - German Traffic Sign Recognition Benchmark
- Auto-Keras (Image) Regression
- Auto-Keras Toy example - German Traffic Sign Recognition Benchmark - Part 2
- Auto-Keras for Detecting Hills and Valleys
- Exploring the Structure of Auto-Keras
- Auto-Keras for Land Classification
- First Impressions of Auto-Keras v.0.3.x
- Auto-Keras for Aircraft RADAR Signature Classification
- Auto-Keras on Kuzushiji-MNIST
- Auto-Keras on Kuzushiji-49
- Save and load models with autokeras
- Auto-Keras - a part of keras now?
- Auto-Keras for EMNIST
- coming soon: Auto-Keras for self-driving cars <- once a train/valid generator is implemented…
Introduction to Auto-Sklearn series
- Introduction to Auto-Sklearn
- Auto-sklearn for predicting Concrete compressive strength
- Auto-sklearn for predicting Yacht hydrodynamics
- Auto-sklearn for predicting seismic bumps for coal mining hazard assessment
- Auto-Sklearn for Predicting Concrete Slump Testing
- Auto-sklearn - Sonar: Mines vs. Rocks
- Auto-sklearn - Preprocessing Pipeline (NOT?) Explained Part 0 - Meta-Learning
- Auto-sklearn - Preprocessing Pipeline Explained Part 1 - Data Preprocessing
- Auto-sklearn - Preprocessing Pipeline Explained Part 2 - Feature Preprocessing
- Exploring the Structure of Auto-sklearn
- Auto-sklearn for Hill - Valley Detection
- Auto-sklearn for Prediction of Scania APS Failure
- Auto-sklearn for Steel Plates
- Auto-sklearn for Airfoil Noise
- TPOT and AutoSklearn on Kuzushiji-KMNIST and Kuzushiji-49
- Auto-Sklearn for CT Scan Slice Localization
- Auto-Sklearn for Banknote Authentication
- Auto-Sklearn for Website Phishing
- Auto-Sklearn for Superconductivity
- Auto-Sklearn for Glass Identification
- Auto-Sklearn 2.0 released
Introduction to TPOT series
- Introduction to TPOT
- TPOT Toy Example - MNIST
- TPOT for Predicting Yacht Hydrodynamics
- TPOT for Prediction of Seismic Bumps for Coal Mine Hazard Assessment
- TPOT for Classification of Sonar Readings
- Exploring the Structure of TPOT
- TPOT for Hill - Valley Detection
- TPOT for Prediction of Scania APS Failure
- TPOT for Sensorless Drive Diagnosis
- TPOT for Steel Plates
- TPOT for Forest Type Classification
- TPOT for concrete compressive strength
- TPOT for Airfoil Noise
- TPOT for urban land cover classification
- TPOT and AutoSklearn on Kuzushiji-KMNIST and Kuzushiji-49
- TPOT for CT Scan Slice Localization
- TPOT for Banknote Authentication
- TPOT for Website Phishing
- TPOT for Superconductivity
- TPOT for Glass Identification
- TPOT for Arcene
- TPOT for Naval Propulsion Maintenance Prediction
- TPOT for Forest Fire prediction
- Using TPOT in a Kaggle competition - how to generate a complete failure submission on the Instant Gratification Challenge
- TPOT for Metro Interstate Traffic Volume
- TPOT for EMNIST
- TPOT for ionosphere radar signal classification
- TPOT for energy efficiency
- TPOT for QSAR Aquatic Toxicity
- TPOT for QSAR Fish Toxicity
Introduction to Google AdaNet series
- Introduction to Google AdaNet
- Structure of Google AdaNet
- AdaNet for Aircraft RADAR Signature Classification
- AdaNet on Kuzushiji-MNIST
- AdaNet on Kuzushiji-49
Interesting publications and tutorials on automated machine learning
General
- Awad et al. (2020): Differential Evolution for Neural Architecture Search. arXiv:2012.06400
- Feurer et al. (2020): Auto-Sklearn 2.0: The Next Generation. arXiv:2007.04074
- Hu et al. (2020): Multi-objective Neural Architecture Search with Almost No Training. arXiv:2011.13591
- Kedziora et al. (2020): AutonoML: Towards an Integrated Framework for Autonomous Machine Learning. arXiv:2012.12600
- Xie et al. (2020): Skillearn: Machine Learning Inspired by Humans’ Learning Skills. arXiv:2012.04863
Distributed and decentralized machine learning
- Pramod (2018): Elastic Gossip: Distributing Neural Network Training Using Gossip-like Protocols. arXiv:1812.02407
Feature engineering
- Kanter and Veeramachaneni (2015): Deep Feature Synthesis: Towards Automating Data Science Endeavors. IEEE DSAA 2015. Available online: https://dai.lids.mit.edu/wp-content/uploads/2017/10/DSAA_DSM_2015.pdf
Meta learning
- Shaw et al. (2018): Bayesian Meta-network Architecture Learning. arXiv:1812.09584
Neural architecture search
- Anderson et al. (2020): Performance-Oriented Neural Architecture Search. arXiv:2001.02976
- Bulat et al. (2020): BATS: Binary ArchitecTure Search. arXiv:2003.01711
- Cai et al. (2018): ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware. arXiv:1812.00332
- Chen et al. (2019): Binarized Neural Architecture Search. arXiv:1911.10862
- de Laroussilhe et al. (2018): Neural Architecture Search Over a Graph Search Space. arXiv:1812.10666
- Geifman and El-Yaniv (2018): Deep Active Learning with a Neural Architecture Search. arXiv:1811.07579
- Huang and Chu (2020): PONAS: Progressive One-shot Neural Architecture Search for Very Efficient Deployment. arXiv:2003.05112
- Lindauer and Hutter (2020): Best Practices for Scientific Research on Neural Architecture Search. JMLR 21(243): 1-18
- Liu et al. (2017): Progressive Neural Architecture Search. arXiv:1712.00559
- Liu et al. (2018): DARTS: Differentiable Architecture Search. arXiv:1806.09055
- Liu et al. (2020): Are Labels Necessary for Neural Architecture Search? arXiv:2003.12056
- Luo et al. (2020): Semi-Supervised Neural Architecture Search. arXiv:2002.10389
- Tan et al. (2018): MnasNet: Platform-Aware Neural Architecture Search for Mobile. arXiv:1807.11626
- van Wyk and Bosman (2018): Evolutionary Neural Architecture Search for Image Restoration. arXiv:1812.05866
- White et al. (2019): BANANAS: Bayesian Optimization with Neural Architectures for Neural Architecture Search. arXiv:1910.11858
- Wu et al. (2018): Mixed Precision Quantization of ConvNets via Differentiable Neural Architecture Search. arXiv:1812.00090
- Xie et al. (2018): SNAS: Stochastic Neural Architecture Search. arXiv:1812.09926
- Zela et al. (2020): NAS-Bench-1Shot1: Benchmarking and Dissecting One-shot Neural Architecture Search. arXiv:2001.10422
- Zhang et al. (2019): Memory-Efficient Hierarchical Neural Architecture Search for Image Denoising. arXiv:1909.08228
Miscellaneous AutoML and optimization related
- Drori et al. (2019): Automatic Machine Learning by Pipeline Synthesis using Model-Based Reinforcement Learning and a Grammar. arXiv:1905.10345
- Jones et al. (1998): Efficient Global Optimization of Expensive Black-Box Functions. Journal of Global Optimization 13, 455-492. doi:10.1023/A:1008306431147