Quick Start Chapter 2: Hyperparameter Search¶
For the TensorFlow MNIST example discussed on the previous page, a random hyperparameter search can be started with the command:
pedl e create random.yaml .
To run our proprietary search algorithm on the same example, use adaptive.yaml instead of random.yaml.
The type of hyperparameter search is specified within the experiment configuration file under the searcher field. For further documentation, see the hyperparameter search page. We'll give a brief overview here.
Experiment Configuration File¶
The experiment configuration file specifies metadata for the experiment, including how the experiment should be deployed on a cluster and how data storage should be handled. For further documentation see the experiment configuration page.
The main fields include:
description: A human-readable string describing the experiment.
data: Where to find the dataset and how to process it.
tensorboard_storage: For saving the progress of an experiment.
hyperparameters: Defines the hyperparameters of the model, either as a set value or a range of values for the searcher to consider.
searcher: Specifies the hyperparameter search algorithm and searcher-related configurations.
resources: Specifies constraints on the cluster resources that this experiment is allowed to use.
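As an illustrative sketch of how these fields fit together (the field values and the exact schema below are assumptions; consult the experiment configuration page for the authoritative format), an experiment configuration file might look like:

```yaml
description: mnist_tf_random_search   # human-readable experiment name
hyperparameters:
  learning_rate:          # a range for the searcher to explore (illustrative syntax)
    type: double
    minval: 0.0001
    maxval: 0.1
  hidden_size: 128        # a fixed, user-specified value
searcher:
  name: random            # which search algorithm to run
  metric: validation_error
resources:
  slots: 4                # illustrative constraint on cluster resources
```

Note that hyperparameters may be given either as fixed values or as ranges, depending on whether the searcher should explore them.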
A hyperparameter search algorithm works on a model, defined ranges of the model's hyperparameters, and a dataset; its goal is to find the set of hyperparameter values within the hyperparameter space that provides the best validation performance for the model. We call this algorithm the searcher.
A searcher makes many decisions about the course of the experiment, including how many trials to spawn and with which hyperparameter values, when to compute validation metrics on the current state of each trial, whether to continue training certain trials, and which trials to terminate. The searcher collects validation metrics to gauge how well each trial is performing.
The searchers provided by PEDL are:
single: training one fixed model
random: random search
grid: grid search
pbt: population-based training
adaptive: our search algorithm, see the documentation
The const.yaml file uses the single searcher mode, which runs only one trial with user-specified hyperparameters for a user-specified number of steps (so it is quite a trivial searcher). The random searcher mode runs a specified number of trials for a specified number of steps; the trials are randomly generated by sampling from the configured hyperparameter space. The grid searcher mode runs a specified "grid" of hyperparameter values for a specified number of steps. The adaptive searcher mode runs a proprietary search algorithm based on Asynchronous Successive Halving and Hyperband; the adaptive searcher compares favorably to vanilla versions of both of these algorithms.
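For illustration only (the parameter names below are assumptions, not PEDL's exact schema; see the hyperparameter search page), switching from the random searcher to the adaptive searcher amounts to changing the searcher section of the configuration:

```yaml
# random.yaml (illustrative): run a fixed number of randomly sampled trials,
# each for a fixed number of steps
searcher:
  name: random
  metric: validation_error
  max_trials: 16
  max_steps: 100

# adaptive.yaml (illustrative): the ASHA/Hyperband-based searcher decides
# on its own which trials to continue and which to terminate early
searcher:
  name: adaptive
  metric: validation_error
```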
Hyperparameter versus Searcher Fields¶
Of particular note is that the
hyperparameters field defines the space of hyperparameters within which the searcher tries to find the set of hyperparameter values with optimal validation performance. The hyperparameter space for a given experiment is user-defined; importantly, hyperparameters can control aspects of the network architecture (dropout, number of layers, etc.), the training procedure (batch size, learning rate, etc.), and data preprocessing/augmentation.
On the other hand, the searcher mode (single, random, grid, pbt, adaptive, etc.) defines the type of search algorithm to be performed over that space. Each searcher mode has searcher-specific parameters that must be defined in the experiment configuration file; these determine exactly how the searcher is configured. These searcher-specific parameters are not part of the hyperparameter search space and must be specified within the
searcher field of the experiment configuration file.
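As a conceptual sketch of this distinction (not PEDL's implementation), searcher-specific parameters such as the number of trials are fixed inputs to the search algorithm, while hyperparameters are sampled per trial from the configured space:

```python
import random

def random_search(hyperparameter_space, max_trials, seed=0):
    """Sketch of a random searcher.

    max_trials is a searcher-specific parameter: it configures the
    algorithm itself. The (min, max) ranges in hyperparameter_space
    are the hyperparameter search space the algorithm explores.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        # Sample one concrete hyperparameter configuration per trial.
        config = {
            name: rng.uniform(lo, hi)
            for name, (lo, hi) in hyperparameter_space.items()
        }
        trials.append(config)
    return trials

space = {"learning_rate": (1e-4, 1e-1), "dropout": (0.0, 0.5)}
trials = random_search(space, max_trials=4)
```

Here each of the four trials receives its own sampled learning rate and dropout value, while max_trials never varies during the search.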
The mnist_tf example directory has one
.yaml file per searcher mode; these files give examples of how to implement each searcher mode using PEDL. A side-by-side comparison of all of the
.yaml files for
mnist_tf may be helpful.