Custom Searcher Reference

determined.searcher.LocalSearchRunner

class determined.searcher.LocalSearchRunner(search_method: determined.searcher._search_method.SearchMethod, searcher_dir: Optional[pathlib.Path] = None)

LocalSearchRunner performs a search for optimal hyperparameter values, applying the provided SearchMethod. It is executed locally and interacts with a Determined cluster where it starts a multi-trial experiment. It then reacts to event notifications coming from the running experiments by forwarding them to event handler methods in your SearchMethod implementation and sending the returned operations back to the experiment.

run(exp_config: Union[Dict[str, Any], str], model_dir: Optional[str] = None) -> int

Run custom search.

Parameters
  • exp_config (dictionary, string) – experiment config filename (.yaml) or a dict.

  • model_dir (string) – directory containing model definition.
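As a minimal sketch of how the constructor and run() fit together, the helper below builds a LocalSearchRunner and starts the search. The helper name launch_local_search and its argument names are hypothetical. The import is deferred into the function so the snippet can be loaded without the determined package, but calling it requires determined to be installed and a reachable cluster. What the returned integer means is not stated above, so the sketch simply passes it through.

```python
import pathlib
from typing import Any, Dict, Optional, Union


def launch_local_search(
    search_method: Any,
    exp_config: Union[Dict[str, Any], str],
    model_dir: Optional[str] = None,
    searcher_dir: Optional[str] = None,
) -> int:
    """Hypothetical helper: run a custom search from the local machine."""
    # Deferred import: needs the determined package at call time.
    from determined.searcher import LocalSearchRunner

    # searcher_dir is optional in the documented signature; pass a Path or None.
    state_dir = pathlib.Path(searcher_dir) if searcher_dir is not None else None
    runner = LocalSearchRunner(search_method, searcher_dir=state_dir)

    # exp_config may be a .yaml filename or a dict, as documented above.
    return runner.run(exp_config, model_dir=model_dir)
```

Here search_method is an instance of your own SearchMethod subclass.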

determined.searcher.RemoteSearchRunner

class determined.searcher.RemoteSearchRunner(search_method: determined.searcher._search_method.SearchMethod, context: determined.core._context.Context)

RemoteSearchRunner performs a search for optimal hyperparameter values, applying the provided SearchMethod (you will subclass SearchMethod and provide an instance of the derived class). RemoteSearchRunner executes on-cluster: it runs a meta-experiment using Core API.

run(exp_config: Union[Dict[str, Any], str], model_dir: Optional[str] = None) -> int

Run custom search as a Core API experiment (on-cluster).

Parameters
  • exp_config (dictionary, string) – experiment config filename (.yaml) or a dict.

  • model_dir (string) – directory containing model definition.
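The on-cluster variant can be sketched the same way. This example assumes the Core API context is obtained with determined.core.init() used as a context manager; the helper name launch_remote_search is hypothetical. The code is meant to run inside a Determined task, so the imports are deferred into the function body.

```python
from typing import Any, Dict, Optional, Union


def launch_remote_search(
    search_method: Any,
    exp_config: Union[Dict[str, Any], str],
    model_dir: Optional[str] = None,
) -> int:
    """Hypothetical helper: run a custom search as a Core API experiment."""
    # Deferred imports: need the determined package and an on-cluster task.
    import determined as det
    from determined.searcher import RemoteSearchRunner

    # Assumption: the Core API context comes from det.core.init().
    with det.core.init() as core_context:
        runner = RemoteSearchRunner(search_method, context=core_context)
        return runner.run(exp_config, model_dir=model_dir)
```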

determined.searcher.SearchMethod

class determined.searcher.SearchMethod

The implementation of a custom hyperparameter tuning algorithm.

To implement your own hyperparameter tuning approach, subclass SearchMethod and override the event handler methods. Each event handler, except progress, returns a list of operations (List[Operation]) that will be submitted to the master for processing.

Note

Do not modify searcher_state passed into event handlers.

abstract initial_operations(searcher_state: determined.searcher._search_method.SearcherState) -> List[determined.searcher._search_method.Operation]

Returns the list of initial operations that the searcher will perform.

Currently, we support the following operations:

  • Create - starts a new trial with a unique trial id and a set of hyperparameter values,

  • ValidateAfter - sets number of steps (i.e., batches or epochs) after which a validation is run for a trial with a given id,

  • Close - closes a trial with a given id,

  • Shutdown - closes the experiment.
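The operations listed above can be combined as in the following sketch of a random-search starting point. So that it runs without the determined package, it defines stand-in Create and ValidateAfter classes that mirror the constructors documented in this reference; in real code, import them from determined.searcher. The function is written as a free function rather than a SearchMethod override, and the learning_rate hyperparameter and the num_trials, max_length, and seed parameters are illustrative assumptions.

```python
import random
import uuid
from dataclasses import dataclass
from typing import Any, Dict, List


# Stand-ins for determined.searcher.Create and ValidateAfter (assumption:
# they mirror the documented constructor arguments only).
@dataclass
class Create:
    request_id: uuid.UUID
    hparams: Dict[str, Any]
    checkpoint: Any


@dataclass
class ValidateAfter:
    request_id: uuid.UUID
    length: int


def initial_operations(num_trials: int, max_length: int, seed: int = 0) -> List[Any]:
    """Create num_trials trials and request one validation per trial."""
    rng = random.Random(seed)
    ops: List[Any] = []
    for _ in range(num_trials):
        request_id = uuid.uuid4()  # unique id for this trial
        # Hypothetical hyperparameter, sampled log-uniformly in [1e-4, 1e-1].
        hparams = {"learning_rate": 10 ** rng.uniform(-4, -1)}
        ops.append(Create(request_id=request_id, hparams=hparams, checkpoint=None))
        # Train until max_length units have been trained, then validate.
        ops.append(ValidateAfter(request_id=request_id, length=max_length))
    return ops
```

Pairing each Create with a ValidateAfter for the same request_id gives every new trial its first training target.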

load(path: pathlib.Path) -> Tuple[determined.searcher._search_method.SearcherState, int]

Loads searcher state and method-specific state.

load_method_state(path: pathlib.Path) -> None

Loads method-specific search state.

abstract on_trial_closed(searcher_state: determined.searcher._search_method.SearcherState, request_id: uuid.UUID) -> List[determined.searcher._search_method.Operation]

Informs the searcher that a trial has been closed as a result of a Close operation.

abstract on_trial_created(searcher_state: determined.searcher._search_method.SearcherState, request_id: uuid.UUID) -> List[determined.searcher._search_method.Operation]

Informs the searcher that a trial has been created as a result of a Create operation.

abstract on_trial_exited_early(searcher_state: determined.searcher._search_method.SearcherState, request_id: uuid.UUID, exited_reason: determined.searcher._search_method.ExitedReason) -> List[determined.searcher._search_method.Operation]

Informs the searcher that a trial has exited earlier than expected.

abstract on_validation_completed(searcher_state: determined.searcher._search_method.SearcherState, request_id: uuid.UUID, metric: float, train_length: int) -> List[determined.searcher._search_method.Operation]

Informs the searcher that the validation workload initiated by the same searcher has completed after training for train_length units. It returns any new operations as a result of this workload completing.
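A handler for this event might look like the sketch below, which records each trial's metric in the method's own state and then closes the trial. The class name BestMetricTracker and the smaller-is-better assumption are hypothetical, and Close is a stand-in mirroring the documented constructor so that the snippet runs without the determined package. Note that searcher_state is only read, never modified.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


# Stand-in for determined.searcher.Close (assumption: same constructor).
@dataclass
class Close:
    request_id: uuid.UUID


class BestMetricTracker:
    """Hypothetical fragment of a SearchMethod: one validation per trial."""

    def __init__(self) -> None:
        # Method-specific state lives on the method, not in searcher_state.
        self.metrics: Dict[uuid.UUID, float] = {}

    def on_validation_completed(
        self, searcher_state: Any, request_id: uuid.UUID, metric: float, train_length: int
    ) -> List[Any]:
        self.metrics[request_id] = metric
        # Nothing more to train for this trial, so ask for it to be closed.
        return [Close(request_id=request_id)]

    def best_trial(self) -> Optional[uuid.UUID]:
        # Assumption: a smaller metric is better (for example, validation loss).
        if not self.metrics:
            return None
        return min(self.metrics, key=lambda rid: self.metrics[rid])
```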

abstract progress(searcher_state: determined.searcher._search_method.SearcherState) -> float

Returns experiment progress as a float between 0 and 1.
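One possible way to compute this value, assuming the search plans a fixed number of trials, is to count finished trials as fully complete and add the fractional progress of the trials still running. The sketch uses a stand-in SearcherState with the attributes documented in this reference; the max_trials parameter and the free-function form are illustrative assumptions.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Set


# Stand-in for determined.searcher.SearcherState with the documented attributes.
@dataclass
class SearcherState:
    failures: Set[uuid.UUID] = field(default_factory=set)
    trial_progress: Dict[uuid.UUID, float] = field(default_factory=dict)
    trials_closed: Set[uuid.UUID] = field(default_factory=set)
    trials_created: Set[uuid.UUID] = field(default_factory=set)


def progress(searcher_state: SearcherState, max_trials: int) -> float:
    """Estimate experiment progress as a float between 0 and 1."""
    if max_trials <= 0:
        return 0.0
    # Closed and failed trials count as fully done.
    finished = searcher_state.trials_closed | searcher_state.failures
    # Running trials contribute their fractional progress.
    running = sum(
        p for rid, p in searcher_state.trial_progress.items() if rid not in finished
    )
    return min(1.0, (len(finished) + running) / max_trials)
```

The state is only read here, in line with the rule that event handlers must not modify it.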

save(searcher_state: determined.searcher._search_method.SearcherState, path: pathlib.Path, *, experiment_id: int) -> None

Saves the searcher state and the search method state. It is called by the search runner after it receives operations from the SearchMethod.

save_method_state(path: pathlib.Path) -> None

Saves method-specific state.
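A search method that keeps its own counters can persist them by overriding save_method_state and load_method_state as a pair. The sketch below assumes that path is a directory and writes a single JSON file into it; the file name method_state.json, the class name CountingMethodState, and its fields are hypothetical.

```python
import json
import pathlib
from typing import Optional


class CountingMethodState:
    """Hypothetical fragment of a SearchMethod showing state persistence."""

    STATE_FILE = "method_state.json"  # illustrative file name

    def __init__(self) -> None:
        self.trials_requested = 0
        self.best_metric: Optional[float] = None

    def save_method_state(self, path: pathlib.Path) -> None:
        # Assumption: path is an existing directory owned by the search runner.
        state = {
            "trials_requested": self.trials_requested,
            "best_metric": self.best_metric,
        }
        with path.joinpath(self.STATE_FILE).open("w") as f:
            json.dump(state, f)

    def load_method_state(self, path: pathlib.Path) -> None:
        with path.joinpath(self.STATE_FILE).open("r") as f:
            state = json.load(f)
        self.trials_requested = state["trials_requested"]
        self.best_metric = state["best_metric"]
```

Keeping both methods symmetric means a restored search method sees exactly the values that were saved.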

determined.searcher.SearcherState

class determined.searcher.SearcherState

Mutable Searcher state.

Search runners maintain this state, and a SearchMethod can read it to decide which operations to return from an event handler. Do not modify SearcherState in your SearchMethod. If your hyperparameter tuning algorithm needs additional state variables, add those variables to your SearchMethod implementation.

failures

set of failed trials

Type

Set[uuid.UUID]

trial_progress

progress of each trial as a number between 0.0 and 1.0

Type

Dict[uuid.UUID, float]

trials_closed

set of completed trials

Type

Set[uuid.UUID]

trials_created

set of created trials

Type

Set[uuid.UUID]

determined.searcher.Operation

class determined.searcher.Operation

Abstract base class for all Operations

determined.searcher.Close

class determined.searcher.Close(request_id: uuid.UUID)

Operation closing the specified trial

determined.searcher.Create

class determined.searcher.Create(request_id: uuid.UUID, hparams: Dict[str, Any], checkpoint: Optional[determined.common.experimental.checkpoint._checkpoint.Checkpoint])

Operation creating a trial with a specified combination of hyperparameter values

determined.searcher.ValidateAfter

class determined.searcher.ValidateAfter(request_id: uuid.UUID, length: int)

Operation signaling the trial to train until its total units trained equals the specified length, where the units (batches, epochs, etc.) are specified in the searcher section of the experiment configuration

determined.searcher.Shutdown

class determined.searcher.Shutdown

Operation shutting the experiment down

determined.searcher.ExitedReason

class determined.searcher.ExitedReason(value)

An enumeration.