Hyperparameter Tuning With Determined¶
Hyperparameter tuning is a common machine learning workflow that involves appropriately configuring the data, model architecture, and learning algorithm to yield an effective model. Hyperparameter tuning is a challenging problem in deep learning given the potentially large number of hyperparameters to consider. Determined supports hyperparameter search as a first-class workflow that is tightly integrated with Determined’s job scheduler. This integration allows for efficient execution of state-of-the-art early-stopping based approaches as well as seamless parallelization of these methods.
Our default recommended search method is adaptive search, an early-stopping based technique that speeds up traditional techniques like random search by periodically abandoning low-performing hyperparameter configurations in a principled fashion. Adaptive search builds on the two prototypical adaptive downsampling approaches, Successive Halving (SHA) and Hyperband, while also enabling seamless parallelization and a simpler user interface.
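The downsampling idea behind SHA can be sketched as follows. This is an illustrative toy, not Determined’s implementation: `train_and_eval` and the toy objective are hypothetical placeholders standing in for real training and validation.

```python
import random

def successive_halving(configs, train_and_eval, num_rungs=3, reduction_factor=2):
    """Toy sketch of Successive Halving (SHA), one idea adaptive search
    builds on. `train_and_eval(config, budget)` returns a validation
    score (lower is better) after training `config` for `budget` units.
    """
    budget = 1
    survivors = list(configs)
    for _ in range(num_rungs):
        # Train every surviving configuration for the current budget.
        scored = [(train_and_eval(c, budget), c) for c in survivors]
        scored.sort(key=lambda pair: pair[0])
        # Keep only the top 1/reduction_factor performers, then grow the budget.
        keep = max(1, len(scored) // reduction_factor)
        survivors = [c for _, c in scored[:keep]]
        budget *= reduction_factor
    return survivors[0]

# Hypothetical objective: "loss" improves with budget, offset by config quality.
random.seed(0)
configs = [{"lr": 10 ** random.uniform(-4, -1)} for _ in range(8)]

def toy_train_and_eval(config, budget):
    return abs(config["lr"] - 0.01) + 1.0 / budget

best = successive_halving(configs, toy_train_and_eval)
```

With 8 configurations, 3 rungs, and a reduction factor of 2, the sketch trains 8, then 4, then 2 configurations at successively doubled budgets, abandoning the low performers at each rung.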
Determined offers two adaptive searchers that employ the same underlying algorithm but differ in their level of configurability.
Adaptive (Simple) is easier to configure and provides sensible default settings for most situations. This searcher requires just two intuitive settings: the number of configurations to evaluate, and the maximum allowed resource budget per configuration. We recommend starting with this searcher.
Adaptive (Advanced) allows users to control the adaptive search behavior in a more fine-grained way.
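As a rough illustration, the searcher section of an experiment configuration for the simple adaptive searcher might look like the following. The field names and values here are indicative only; consult the experiment configuration reference for the exact schema.

```yaml
searcher:
  name: adaptive_simple
  metric: validation_error
  smaller_is_better: true
  max_trials: 16       # number of configurations to evaluate
  max_length:          # maximum training budget per configuration
    batches: 6400
```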
Other Supported Methods¶
Determined also supports other common hyperparameter search algorithms:
Single is appropriate for manual hyperparameter tuning, as it trains a single hyperparameter configuration.
Grid evaluates all possible hyperparameter configurations by brute force and returns the best.
Random evaluates a set of hyperparameter configurations chosen at random and returns the best.
Population-based training (PBT) begins as random search but periodically replaces low-performing hyperparameter configurations with ones near the high-performing points in the hyperparameter space.
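For contrast with the adaptive approach, plain random search can be sketched as below. Again this is a hypothetical illustration, not Determined’s implementation: `sample_config` and the toy objective stand in for a real search space and training loop.

```python
import random

def random_search(sample_config, train_and_eval, num_trials, budget):
    """Toy sketch of random search: train every sampled configuration
    for the full budget and keep the best (lower score is better)."""
    best_score, best_config = float("inf"), None
    for _ in range(num_trials):
        config = sample_config()
        score = train_and_eval(config, budget)
        if score < best_score:
            best_score, best_config = score, config
    return best_config

random.seed(0)
# Hypothetical search space: learning rate sampled log-uniformly.
sample = lambda: {"lr": 10 ** random.uniform(-4, -1)}
# Hypothetical objective: best validation score near lr = 0.01.
toy = lambda cfg, budget: abs(cfg["lr"] - 0.01)

best = random_search(sample, toy, num_trials=20, budget=100)
```

Unlike SHA, every configuration receives the full training budget here, which is exactly the cost that early-stopping based methods such as adaptive search avoid.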