Automated machine learning is a big topic on this blog. Therefore, we will look at another AutoML framework: Auto-PyTorch.
Introducing Auto-PyTorch
This software is an early alpha version. However, I expect at least some functionality from it and, to be honest, Auto-Keras was more usable in its early alpha than it is now.
Auto-PyTorch originates from work by Mendoza et al. (2018) and uses BOHB to do neural architecture search with PyTorch.
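BOHB combines Bayesian optimization with Hyperband. To give a rough idea of the Hyperband half, here is a minimal sketch of successive halving: many configurations are tried on a small budget, and only the best fraction is re-run on a larger one. This is not the actual code of Auto-PyTorch or BOHB. The names `successive_halving` and `toy_loss`, the toy objective, and the reduction factor `eta=3` are assumptions made purely for illustration, and random sampling stands in for the model-based proposals.

```python
import random


def successive_halving(configs, evaluate, min_budget, max_budget, eta=3):
    """Toy successive halving: keep the best 1/eta configs, grow the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    history = []  # (budget, number of configs evaluated) per round
    while True:
        # Rank all surviving configurations at the current budget (lower loss is better).
        ranked = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        history.append((budget, len(ranked)))
        # Stop once the next budget would exceed the maximum or one config is left.
        if budget * eta > max_budget or len(ranked) == 1:
            return ranked[0], history
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta


def toy_loss(cfg, budget):
    """Hypothetical objective: best near lr=0.01, and improving with more budget."""
    return (cfg["lr"] - 0.01) ** 2 + 1.0 / budget


rng = random.Random(0)
# Random sampling stands in for BOHB's model-based proposals (an assumption of this sketch).
configs = [{"lr": 10 ** rng.uniform(-4, -1)} for _ in range(9)]
best, history = successive_halving(configs, toy_loss, min_budget=30, max_budget=90)
print(history)  # one (budget, n_configs) pair per round
print(best)
```

With a minimum budget of 30 and a maximum of 90, the sketch runs two rounds: nine configurations at budget 30, then the best three at budget 90.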
Installation
The installation is not straightforward. However, let’s start with something simple and create a new conda environment for Auto-PyTorch:
(base) $ conda create -n autopytorch
(base) $ conda activate autopytorch
NB: Installing PyTorch via conda doesn’t work without running into errors. Unfortunately, I was not able to track the error down (with a reasonable amount of work), but it was so fatal that the error message was a core dump. Using pip seems to solve this problem.
(autopytorch) $ pip install torch torchvision
(autopytorch) $ git clone https://github.com/automl/Auto-PyTorch
(autopytorch) $ cd Auto-PyTorch
(autopytorch) $ cat requirements.txt | xargs -n 1 -L 1 pip install
(autopytorch) $ python setup.py install
That’s it. Now we can have a look at it.
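Before moving on, a quick sanity check can confirm that the packages actually ended up in the active environment. The following is a small sketch using only the standard library. The helper name `check_installed` is hypothetical, and the module names are assumptions based on the imports used in this post (`autoPyTorch`) plus `torch` and `torchvision`.

```python
import importlib.util


def check_installed(modules=("torch", "torchvision", "autoPyTorch")):
    """Return a dict mapping each top-level module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in modules}


for name, found in check_installed().items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

If one of the entries is reported as missing, the corresponding install step above most likely ran in a different environment.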
First look: Kuzushiji-MNIST
Instead of starting with the same boring MNIST example, I decided to try it on Kuzushiji-MNIST:
import numpy as np
import matplotlib.pyplot as plt
from autoPyTorch import AutoNetClassification
import sklearn.metrics
def load(f):
    return np.load(f)['arr_0']
X_train = load("./data/kmnist-train-imgs.npz")
X_test = load("./data/kmnist-test-imgs.npz")
y_train = load("./data/kmnist-train-labels.npz")
y_test = load("./data/kmnist-test-labels.npz")
# running Auto-PyTorch
autoPyTorch = AutoNetClassification("tiny_cs",  # config preset
                                    log_level='info',
                                    max_runtime=300,
                                    min_budget=30,
                                    max_budget=90)
autoPyTorch.fit(X_train, y_train, validation_split=0.3)
y_pred = autoPyTorch.predict(X_test)
print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_pred))
And the result? Well, I ended up with a log that constantly looked like this:
WORKER: start processing job (0, 0, 1)
Fit optimization pipeline
[AutoNet] No validation set given and either no cross validator given or budget too low for CV. Continue by splitting 0.3 of training data.
[AutoNet] CV split 0 of 1
Further, I want to add that Auto-PyTorch uses Pyro4 and is therefore susceptible to network connectivity problems (Pyro4.errors.CommunicationError: cannot connect to ('192.168.xxx.xxx', xxxxx): [Errno 101] Network is unreachable). Hence, you may run into trouble running it on a laptop while traveling and enjoying areas without wifi or with changing connectivity (please thank NetworkManager and probably systemd for this ;)).
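A cheap pre-flight check can tell whether the machine currently has a routable network interface before a long run is started. This is only a sketch under assumptions: the helper name `local_ip_or_none` is hypothetical, the probe address is a reserved documentation address, and a successful check does not guarantee that Pyro4 will pick the same interface.

```python
import socket


def local_ip_or_none(probe_host="192.0.2.1", probe_port=80):
    """Return the local IP used for outgoing traffic, or None if no route exists."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket only selects a route; no packet is sent.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # Raised, for example, as "[Errno 101] Network is unreachable".
        return None
    finally:
        sock.close()


ip = local_ip_or_none()
print("no usable network route" if ip is None else f"outgoing interface address: {ip}")
```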
Well, I retried it a few times and didn’t receive any results, only this error message:
RuntimeError: No models fit during training, please retry with a larger max_runtime.
Even one hour was not enough to produce something useful. I tried up to two hours before deciding to stop my efforts here. From my understanding, it uses some form of ResNet as its base model. Therefore, I would have expected at least a simple result after a single iteration (similar to fastai).
Second First Look: MNIST
Let’s see if at least the example provided in the README on GitHub works. Note that it uses scikit-learn’s small digits dataset (8x8 images) rather than the full MNIST.
from autoPyTorch import AutoNetClassification
# data and metric imports
import sklearn.model_selection
import sklearn.datasets
import sklearn.metrics
X, y = sklearn.datasets.load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)
# running Auto-PyTorch
autoPyTorch = AutoNetClassification("tiny_cs",  # config preset
                                    log_level='info',
                                    max_runtime=300,
                                    min_budget=30,
                                    max_budget=90)
autoPyTorch.fit(X_train, y_train, validation_split=0.3)
y_pred = autoPyTorch.predict(X_test)
print("Accuracy score", sklearn.metrics.accuracy_score(y_test, y_pred))
Done Refitting
Accuracy score 0.98
Sounds good, right? Well, considering that it ran for 5 minutes (300 seconds), it probably performed on the level of a fastai one-liner.
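One difference between the two experiments may be worth noting: `load_digits` returns a flat matrix with 64 features per sample, whereas the Kuzushiji-MNIST arrays hold 28x28 images. Whether this explains the failed first attempt was not tested here. Assuming that `AutoNetClassification` expects a 2-D feature matrix like the one in the digits example, the images could be flattened first. The helper `flatten_images` below is hypothetical, and the scaling to [0, 1] is an optional choice, not something the library is known to require.

```python
import numpy as np


def flatten_images(images):
    """Turn (n_samples, height, width) images into (n_samples, height*width) floats in [0, 1]."""
    images = np.asarray(images)
    flat = images.reshape(len(images), -1)
    return flat.astype(np.float32) / 255.0


# Stand-in data with the Kuzushiji-MNIST image shape (all zeros, not the real dataset).
fake_images = np.zeros((10, 28, 28), dtype=np.uint8)
X_flat = flatten_images(fake_images)
print(X_flat.shape)
```

The same call could be applied to `X_train` and `X_test` from the Kuzushiji-MNIST listing before passing them to `fit` and `predict`.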