determined.NativeContext¶

The NativeContext provides useful methods for writing tf.keras and tf.estimator experiments using the Native API. Every init() function supported by the Native API returns a subclass of NativeContext:

- determined.keras.init() returns determined.keras.TFKerasNativeContext.
- determined.estimator.init() returns determined.estimator.EstimatorNativeContext.
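For instance, a Native API script obtains its context by calling one of these init() functions. The sketch below is illustrative only: any configuration arguments that init() accepts are omitted, and the method calls shown are NativeContext accessors documented later in this page.

    import determined.keras

    # init() returns the framework-specific NativeContext subclass; any
    # configuration arguments it accepts are omitted from this sketch.
    context = determined.keras.init()

    # context is a determined.keras.TFKerasNativeContext, so it inherits the
    # NativeContext accessors documented below.
    experiment_id = context.get_experiment_id()
    batch_size = context.get_per_slot_batch_size()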
determined.NativeContext¶

class determined.NativeContext(env: determined._env_context.EnvContext, hvd_config: determined.horovod.HorovodContext)¶

A base class that all NativeContexts inherit from when using the Native API. The context returned by an init() function must inherit from this class.

NativeContext always has a DistributedContext accessible via context.distributed for information related to distributed training.
- get_data_config() → Dict[str, Any]¶
  Return the data configuration.

- get_experiment_config() → Dict[str, Any]¶
  Return the experiment configuration.

- get_experiment_id() → int¶
  Return the experiment ID of the current trial.

- get_global_batch_size() → int¶
  Return the global batch size.

- get_hparam(name: str) → Any¶
  Return the current value of the hyperparameter with the given name.

- get_hparams() → Dict[str, Any]¶
  Return a dictionary of hyperparameter names to values.

- get_per_slot_batch_size() → int¶
  Return the per-slot batch size. When a model is trained with a single GPU, this is equal to the global batch size. When multi-GPU training is used, this is equal to the global batch size divided by the number of GPUs used to train the model.

- get_trial_id() → int¶
  Return the trial ID of the current trial.
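As an illustration of these accessors, the sketch below assumes context was returned by an init() call and that the experiment defines a hyperparameter named "learning_rate" (a hypothetical name):

    # Read hyperparameters from the experiment configuration; the name
    # "learning_rate" is hypothetical and depends on your configuration.
    hparams = context.get_hparams()
    learning_rate = context.get_hparam("learning_rate")

    # The per-slot batch size is the global batch size divided by the number
    # of slots, so the following identity holds during distributed training:
    assert (context.get_per_slot_batch_size() * context.distributed.get_size()
            == context.get_global_batch_size())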
determined.TrialContext.distributed¶
class determined._train_context.DistributedContext(env: determined._env_context.EnvContext, hvd_config: determined.horovod.HorovodContext)

DistributedContext extends all TrialContexts and NativeContexts under the context.distributed namespace. It provides useful methods for effective multi-slot (parallel and distributed) training.

- get_rank() → int
  Return the rank of the process in the trial.

- get_local_rank() → int
  Return the rank of the process on the agent.

- get_size() → int
  Return the number of slots this trial is running on.

- get_num_agents() → int
  Return the number of agents this trial is running on.
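A common use of context.distributed is to restrict side effects such as logging or file I/O to a single process, so that multi-slot trials do not perform duplicate work. A minimal sketch, assuming context was returned by an init() call:

    # Only the chief process (rank 0 in the trial) emits log output.
    if context.distributed.get_rank() == 0:
        print("trial is running on %d slots across %d agents"
              % (context.distributed.get_size(),
                 context.distributed.get_num_agents()))

    # The local rank distinguishes processes on the same agent, e.g. to let
    # one process per machine perform a per-machine setup task.
    if context.distributed.get_local_rank() == 0:
        pass  # e.g., stage data on this agent's local disk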
determined.keras.TFKerasNativeContext¶

class determined.keras.TFKerasNativeContext(env: determined._env_context.EnvContext, hvd_config: determined.horovod.HorovodContext)¶

TFKerasNativeContext always has a DistributedContext accessible via context.distributed for information related to distributed training.
- get_data_config() → Dict[str, Any]¶
  Return the data configuration.

- get_experiment_config() → Dict[str, Any]¶
  Return the experiment configuration.

- get_experiment_id() → int¶
  Return the experiment ID of the current trial.

- get_global_batch_size() → int¶
  Return the global batch size.

- get_hparam(name: str) → Any¶
  Return the current value of the hyperparameter with the given name.

- get_hparams() → Dict[str, Any]¶
  Return a dictionary of hyperparameter names to values.

- get_per_slot_batch_size() → int¶
  Return the per-slot batch size. When a model is trained with a single GPU, this is equal to the global batch size. When multi-GPU training is used, this is equal to the global batch size divided by the number of GPUs used to train the model.

- get_trial_id() → int¶
  Return the trial ID of the current trial.
- wrap_dataset(dataset: Any) → Any¶
  This should be used to wrap tf.data.Dataset objects immediately after they have been created. Users should use the output of this wrapper as the new instance of their dataset. If users create multiple datasets (e.g., one for training and one for testing), users should wrap each dataset independently.

  Parameters: dataset – tf.data.Dataset
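For example, a tf.keras Native API script might create and wrap its training and validation datasets as follows. This is a sketch: the in-memory data is a placeholder, context is assumed to come from determined.keras.init(), and batching by the per-slot batch size is a common convention rather than a requirement of wrap_dataset().

    import tensorflow as tf

    # Placeholder in-memory data; a real experiment would load its own.
    (x_train, y_train), (x_val, y_val) = tf.keras.datasets.mnist.load_data()

    # Wrap each dataset independently, immediately after creation, and use
    # the wrapper's output as the new dataset instance.
    train_ds = tf.data.Dataset.from_tensor_slices((x_train, y_train))
    train_ds = context.wrap_dataset(train_ds)
    train_ds = train_ds.batch(context.get_per_slot_batch_size())

    val_ds = tf.data.Dataset.from_tensor_slices((x_val, y_val))
    val_ds = context.wrap_dataset(val_ds)
    val_ds = val_ds.batch(context.get_per_slot_batch_size())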
determined.estimator.EstimatorNativeContext¶

class determined.estimator.EstimatorNativeContext(env: determined._env_context.EnvContext, hvd_config: determined.horovod.HorovodContext)¶

EstimatorNativeContext always has a DistributedContext accessible via context.distributed for information related to distributed training.
- get_data_config() → Dict[str, Any]¶
  Return the data configuration.

- get_experiment_config() → Dict[str, Any]¶
  Return the experiment configuration.

- get_experiment_id() → int¶
  Return the experiment ID of the current trial.

- get_global_batch_size() → int¶
  Return the global batch size.

- get_hparam(name: str) → Any¶
  Return the current value of the hyperparameter with the given name.

- get_hparams() → Dict[str, Any]¶
  Return a dictionary of hyperparameter names to values.

- get_per_slot_batch_size() → int¶
  Return the per-slot batch size. When a model is trained with a single GPU, this is equal to the global batch size. When multi-GPU training is used, this is equal to the global batch size divided by the number of GPUs used to train the model.

- get_trial_id() → int¶
  Return the trial ID of the current trial.
- wrap_dataset(dataset: Any) → Any¶
  This should be used to wrap tf.data.Dataset objects immediately after they have been created. Users should use the output of this wrapper as the new instance of their dataset. If users create multiple datasets (e.g., one for training and one for testing), users should wrap each dataset independently. For example, if users instantiate their training dataset within build_train_spec(), they should call dataset = wrap_dataset(dataset) prior to passing it into tf.estimator.TrainSpec.
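A sketch of that pattern, with placeholder data and with context assumed to come from determined.estimator.init():

    import numpy as np
    import tensorflow as tf

    # Placeholder in-memory data; a real experiment would load its own.
    features = np.random.rand(100, 4).astype(np.float32)
    labels = np.random.randint(0, 2, size=(100,))

    def train_input_fn():
        # Wrap the dataset immediately after creation and use the wrapper's
        # output as the new dataset instance.
        dataset = tf.data.Dataset.from_tensor_slices((features, labels))
        dataset = context.wrap_dataset(dataset)
        return dataset.batch(context.get_per_slot_batch_size())

    train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn)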
- wrap_optimizer(optimizer: Any) → Any¶
  This should be used to wrap optimizer objects immediately after they have been created. Users should use the output of this wrapper as the new instance of their optimizer. For example, if users create their optimizer within build_estimator(), they should call optimizer = wrap_optimizer(optimizer) prior to passing the optimizer into their Estimator.
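A sketch of that pattern inside a model_fn; the model, loss, and the hyperparameter name "learning_rate" are illustrative, and context is assumed to come from determined.estimator.init():

    import tensorflow as tf

    def model_fn(features, labels, mode):
        # Placeholder model and loss.
        logits = tf.compat.v1.layers.dense(features, units=2)
        loss = tf.compat.v1.losses.sparse_softmax_cross_entropy(
            labels=labels, logits=logits)

        # Wrap the optimizer immediately after creation and use the wrapper's
        # output as the new optimizer instance.
        optimizer = tf.compat.v1.train.AdamOptimizer(
            learning_rate=context.get_hparam("learning_rate"))
        optimizer = context.wrap_optimizer(optimizer)

        train_op = optimizer.minimize(
            loss, global_step=tf.compat.v1.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)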