PyTorch Model Definition

There are three steps needed to define a PyTorch model in PEDL:

  1. If downloading data, define a download_data() function. See Data Downloading for more information.

  2. Define a make_data_loaders() function. See Data Loading for more information.

  3. Implement the PyTorch Interface.

Data Loading

To load data into a PyTorchTrial model, define a make_data_loaders() function. This function must return a pair of objects, one for training and one for validation; both must be instances of pedl.frameworks.pytorch.data.DataLoader, which is a drop-in replacement for torch.utils.data.DataLoader and behaves identically.

Function signature: make_data_loaders(experiment_config: Dict[str, Any], hparams: Dict[str, Any]) -> Tuple[pedl.frameworks.pytorch.data.DataLoader, pedl.frameworks.pytorch.data.DataLoader]
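For illustration, here is a minimal sketch of make_data_loaders(). The in-memory TensorDataset and the "batch_size" hyperparameter name are assumptions made for this example; real code would build Datasets from the experiment's actual data.

from typing import Any, Dict, Tuple

import torch
from pedl.frameworks.pytorch.data import DataLoader


def make_data_loaders(
    experiment_config: Dict[str, Any], hparams: Dict[str, Any]
) -> Tuple[DataLoader, DataLoader]:
    # Hypothetical random data standing in for a real dataset.
    train = torch.utils.data.TensorDataset(
        torch.randn(1000, 2), torch.randint(0, 2, (1000,))
    )
    validation = torch.utils.data.TensorDataset(
        torch.randn(200, 2), torch.randint(0, 2, (200,))
    )
    batch_size = hparams["batch_size"]  # assumed hyperparameter name
    return (
        DataLoader(train, batch_size=batch_size, shuffle=True),
        DataLoader(validation, batch_size=batch_size),
    )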

Each DataLoader is allowed to return batches with arbitrary structures of the following types, which will be fed directly to the train_batch and evaluate_batch functions:

  • np.ndarray

    np.array([[0, 0], [0, 0]])

  • torch.Tensor

    torch.Tensor([[0, 0], [0, 0]])

  • tuple of np.ndarrays or torch.Tensors

    (torch.Tensor([0, 0]), torch.Tensor([[0, 0], [0, 0]]))

  • list of np.ndarrays or torch.Tensors

    [torch.Tensor([0, 0]), torch.Tensor([[0, 0], [0, 0]])]

  • dictionary mapping strings to np.ndarrays or torch.Tensors

    {"data": torch.Tensor([[0, 0], [0, 0]]), "label": torch.Tensor([[1, 1], [1, 1]])}

  • combination of the above

    {
        "data": [
            {"sub_data1": torch.Tensor([[0, 0], [0, 0]])},
            {"sub_data2": torch.Tensor([0, 0])},
        ],
        "label": (torch.Tensor([0, 0]), torch.Tensor([[0, 0], [0, 0]])),
    }
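The batch structure is determined by what the underlying Dataset returns per example: the default collate function batches each field separately. As a sketch (the DictDataset name and field names are invented for illustration), a Dataset whose __getitem__ returns a dictionary produces dictionary-structured batches:

import torch
from torch.utils.data import Dataset


class DictDataset(Dataset):
    # Hypothetical dataset: each example is a dict, so each batch fed to
    # train_batch/evaluate_batch is a dict of batched tensors.
    def __len__(self) -> int:
        return 100

    def __getitem__(self, idx: int) -> dict:
        return {
            "data": torch.zeros(2),
            "label": torch.tensor(0),
        }

With a batch size of 16, train_batch would then receive {"data": tensor of shape (16, 2), "label": tensor of shape (16,)}.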

PyTorch Interface

class pedl.frameworks.pytorch.pytorch_trial.PyTorchTrial(hparams: Dict[str, Any], *_: Any, **__: Any)

PyTorch trials are created by subclassing the abstract class PyTorchTrial. Users must define all abstract methods to create the deep learning model associated with a specific trial, and to subsequently train and evaluate it.

abstract build_model() → torch.nn.modules.module.Module

Defines the deep learning architecture associated with a trial, which typically depends on the trial’s specific hyperparameter settings stored in the hparams dictionary. This method returns the model as an instance of nn.Module (typically an instance of a subclass).
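A sketch of build_model(), assuming the trial stored the hparams passed to __init__ as self.hparams and that "hidden_size" is a defined hyperparameter (both are assumptions of this example):

import torch.nn as nn

def build_model(self) -> nn.Module:
    hidden = self.hparams["hidden_size"]  # assumed hyperparameter name
    return nn.Sequential(
        nn.Linear(2, hidden),
        nn.ReLU(),
        nn.Linear(hidden, 2),
    )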

abstract optimizer(model: torch.nn.modules.module.Module) → torch.optim.optimizer.Optimizer

Returns the optimizer to be used during training of the given model, as an instance of torch.optim.Optimizer.
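For example (the "learning_rate" hyperparameter name is an assumption, and SGD is one possible choice of optimizer):

import torch

def optimizer(self, model: torch.nn.Module) -> torch.optim.Optimizer:
    # "learning_rate" is an assumed hyperparameter name.
    return torch.optim.SGD(model.parameters(), lr=self.hparams["learning_rate"])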

abstract train_batch(batch: Union[Dict[str, torch.Tensor], Sequence[torch.Tensor], torch.Tensor], model: torch.nn.modules.module.Module, epoch_idx: int, batch_idx: int) → Union[torch.Tensor, Dict[str, Any]]

Calculate the loss for a batch and return it, either directly as a tensor or in a dictionary of metric values. batch_idx represents the total number of batches processed per device (slot) since the start of training.
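A sketch of train_batch() for the dictionary-structured batches shown above; the field names and the "loss" metric name are assumptions of this example:

import torch.nn.functional as F

def train_batch(self, batch, model, epoch_idx, batch_idx):
    # Assumes batches of the form {"data": ..., "label": ...}.
    output = model(batch["data"])
    loss = F.cross_entropy(output, batch["label"])
    return {"loss": loss}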

evaluate_batch(batch: Union[Dict[str, torch.Tensor], Sequence[torch.Tensor], torch.Tensor], model: torch.nn.modules.module.Module) → Dict[str, Any]

Calculate evaluation metrics for a batch and return them as a dictionary mapping metric names to metric values.

There are two ways to specify evaluation metrics: override either evaluate_batch() or evaluate_full_dataset(). While evaluate_full_dataset() is more flexible, evaluate_batch() should be preferred, since it can be parallelized in distributed environments, whereas evaluate_full_dataset() cannot. Only one of evaluate_full_dataset() and evaluate_batch() should be overridden by a trial.
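A sketch of evaluate_batch() matching the train_batch() example above (the batch field names and metric names are assumptions):

import torch.nn.functional as F

def evaluate_batch(self, batch, model):
    output = model(batch["data"])
    loss = F.cross_entropy(output, batch["label"])
    accuracy = (output.argmax(dim=1) == batch["label"]).float().mean()
    return {"validation_loss": loss.item(), "accuracy": accuracy.item()}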

evaluation_reducer() → Union[pedl.frameworks.pytorch.reducer.Reducer, Dict[str, pedl.frameworks.pytorch.reducer.Reducer]]

Return a reducer for all evaluation metrics, or a dict mapping metric names to individual reducers. Defaults to pedl.frameworks.pytorch.reducer.Reducer.AVG.
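For example, to average the loss across batches while summing a per-batch error count ("error_count" is an assumed metric name for this sketch; it must match a name returned by evaluate_batch()):

from pedl.frameworks.pytorch.reducer import Reducer

def evaluation_reducer(self):
    # Average the loss across batches, but sum the error counts.
    return {"validation_loss": Reducer.AVG, "error_count": Reducer.SUM}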

evaluate_full_dataset(data_loader: torch.utils.data.dataloader.DataLoader, model: torch.nn.modules.module.Module) → Dict[str, Any]

Calculate validation metrics on the entire validation dataset and return them as a dictionary mapping metric names to reduced metric values (i.e., each returned metric is the average or sum of that metric across the entire validation set).

This validation cannot be distributed and is performed on a single device, even when multiple devices (slots) are used for training. Only one of evaluate_full_dataset() and evaluate_batch() should be overridden by a trial.
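A sketch of evaluate_full_dataset(), again assuming the dictionary-structured batches used in the examples above:

import torch

def evaluate_full_dataset(self, data_loader, model):
    # Iterate over the entire validation set on a single device and
    # return already-reduced metrics.
    correct, total = 0, 0
    with torch.no_grad():
        for batch in data_loader:
            output = model(batch["data"])
            correct += (output.argmax(dim=1) == batch["label"]).sum().item()
            total += batch["label"].shape[0]
    return {"accuracy": correct / total}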

create_lr_scheduler(optimizer: torch.optim.optimizer.Optimizer) → Optional[pedl.frameworks.pytorch.lr_scheduler.LRScheduler]

Create a learning rate scheduler for the trial given an instance of the optimizer.

Parameters

optimizer (torch.optim.Optimizer) – instance of the optimizer to be used for training

Returns

Wrapper around a torch.optim.lr_scheduler._LRScheduler.

Return type

pedl.frameworks.pytorch.lr_scheduler.LRScheduler
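For example, to halve the learning rate after every epoch (StepLR is one possible scheduler choice):

import torch
from pedl.frameworks.pytorch.lr_scheduler import LRScheduler

def create_lr_scheduler(self, optimizer):
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
    return LRScheduler(scheduler, LRScheduler.StepMode.STEP_EVERY_EPOCH)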

class pedl.frameworks.pytorch.lr_scheduler.LRScheduler(scheduler: torch.optim.lr_scheduler._LRScheduler, step_mode: pedl.frameworks.pytorch.lr_scheduler.LRScheduler.StepMode)
class StepMode

Specifies when and how scheduler.step() should be executed.

STEP_EVERY_EPOCH
STEP_EVERY_BATCH
MANUAL_STEP
__init__(scheduler: torch.optim.lr_scheduler._LRScheduler, step_mode: pedl.frameworks.pytorch.lr_scheduler.LRScheduler.StepMode)

Wrapper for a PyTorch LRScheduler.

Use of this wrapper is required for PEDL to properly schedule the optimizer’s learning rate.

This wrapper fulfills two main functions:
  1. Saving and restoring the learning rate state in case a trial is paused, preempted, etc.

  2. Stepping the learning rate scheduler at a predefined frequency (every batch or every epoch).

Parameters
  • scheduler (torch.optim.lr_scheduler._LRScheduler) – Learning rate scheduler to be used by PEDL.

  • step_mode (pedl.frameworks.pytorch.lr_scheduler.LRScheduler.StepMode) –

    The strategy PEDL will use to call (or not call) scheduler.step().

    1. STEP_EVERY_EPOCH: PEDL will call scheduler.step() after every training epoch. No arguments will be passed to step().

    2. STEP_EVERY_BATCH: PEDL will call scheduler.step() after every training batch. No arguments will be passed to step().

    3. MANUAL_STEP: PEDL will not call scheduler.step() at all. It is up to the user to decide when to call scheduler.step(), and whether to pass any arguments.
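With MANUAL_STEP, one possible pattern (a sketch, not a pattern mandated by PEDL; self.my_scheduler is a name invented here) is to keep a reference to the wrapper in create_lr_scheduler() and step it from train_batch():

import torch
import torch.nn.functional as F
from pedl.frameworks.pytorch.lr_scheduler import LRScheduler

def create_lr_scheduler(self, optimizer):
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.99)
    self.my_scheduler = LRScheduler(scheduler, LRScheduler.StepMode.MANUAL_STEP)
    return self.my_scheduler

def train_batch(self, batch, model, epoch_idx, batch_idx):
    output = model(batch["data"])
    loss = F.cross_entropy(output, batch["label"])
    if batch_idx % 100 == 0:  # step on a schedule of your choosing
        self.my_scheduler.step()
    return {"loss": loss}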

get_lr() → List

Compute the current learning rate of the scheduler.

This function is equivalent to calling get_lr() on the wrapped LRScheduler.

step(*args: Any, **kwargs: Any) → None

Call step() on the wrapped LRScheduler instance.

class pedl.frameworks.pytorch.reducer.Reducer

The methods available to users for reducing metrics.

AVG
SUM
MAX
MIN

Examples
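As an end-to-end sketch tying the pieces above together (the random dataset, trial name, hyperparameter names, and metric names are all assumptions of this example):

from typing import Any, Dict, Tuple

import torch
import torch.nn as nn
import torch.nn.functional as F

from pedl.frameworks.pytorch.data import DataLoader
from pedl.frameworks.pytorch.pytorch_trial import PyTorchTrial


def make_data_loaders(
    experiment_config: Dict[str, Any], hparams: Dict[str, Any]
) -> Tuple[DataLoader, DataLoader]:
    # Hypothetical random data standing in for a real dataset.
    train = torch.utils.data.TensorDataset(
        torch.randn(1000, 2), torch.randint(0, 2, (1000,))
    )
    validation = torch.utils.data.TensorDataset(
        torch.randn(200, 2), torch.randint(0, 2, (200,))
    )
    return (
        DataLoader(train, batch_size=hparams["batch_size"], shuffle=True),
        DataLoader(validation, batch_size=hparams["batch_size"]),
    )


class MyTrial(PyTorchTrial):
    def __init__(self, hparams: Dict[str, Any], *args: Any, **kwargs: Any) -> None:
        super().__init__(hparams, *args, **kwargs)
        self.hparams = hparams

    def build_model(self) -> nn.Module:
        hidden = self.hparams["hidden_size"]
        return nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, 2))

    def optimizer(self, model: nn.Module) -> torch.optim.Optimizer:
        return torch.optim.SGD(model.parameters(), lr=self.hparams["learning_rate"])

    def train_batch(self, batch, model, epoch_idx, batch_idx):
        data, label = batch  # TensorDataset yields (data, label) pairs
        return {"loss": F.cross_entropy(model(data), label)}

    def evaluate_batch(self, batch, model):
        data, label = batch
        output = model(data)
        return {
            "validation_loss": F.cross_entropy(output, label).item(),
            "accuracy": (output.argmax(dim=1) == label).float().mean().item(),
        }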