Custom Trial Environments

Trial Runners

A trial runner is a Docker container that provides an isolated environment for running deep learning workloads. A trial runner consists of a set of Python libraries and some PEDL-specific harness code that orchestrates running workloads inside the container. When the container is launched, PEDL injects the current experiment's model definition, along with the values of the hyperparameters for the current trial.

In the current version of PEDL, all experiments use one of two base container images:

  • determinedai/pedl-tr-py3.6-tf, which includes Python 3.6.8, TensorFlow 1.12.0, Keras 2.2.4, and NumPy 1.15.0. This image is selected by default if the experiment configuration does not specify base_image.
  • determinedai/pedl-tr-py3.6-pytorch, which includes Python 3.6.8, PyTorch 0.4.1, torchvision 0.2.1, and NumPy 1.15.0.

PEDL will automatically select a GPU-enabled version of the image when running on agents with GPUs. The trial environment can be further customized by running an arbitrary list of "runtime" commands and installing additional Python packages, as described below.

Custom Trial Environments

Public Packages

If the model definition has publicly accessible dependencies that are not included in the base trial runner image, the experiment configuration file provides two options for adding them to the trial runner environment: runtime_packages and runtime_commands. Before running the experiment, PEDL will generate and cache a new trial runner container with the requested packages installed and the requested commands executed.

Here is an example experiment configuration that uses the runtime_packages option:

trial_environment:
  base_image: determinedai/pedl-tr-py3.6-tf
  runtime_packages:
    - pandas
    - numpy

This instructs PEDL to install these packages (via pip) into the trial container before running any workloads.

Packages can also be constrained with version specifiers or installed from external sources using standard pip install syntax. Each entry below illustrates one form of specifier:

trial_environment:
  base_image: determinedai/pedl-tr-py3.6-tf
  runtime_packages:
    - pandas==0.20.3 # Install exactly this version
    - pandas>=0.20.0 # Install this version or newer
    - pandas!=0.20.1 # Exclude this version from installation
    - git+https://github.com/pandas-dev/pandas # Install from GitHub

runtime_packages makes it easy to customize the trial environment to include additional Python packages. For more complex changes to the trial environment (e.g., installing native libraries), users can specify the runtime_commands option:

trial_environment:
  base_image: determinedai/pedl-tr-py3.6-tf
  runtime_commands:
    - echo "Installing python3-pandas via apt-get"
    - apt-get update
    - apt-get install -y python3-pandas

The contents of runtime_commands will be executed in the order provided and before any runtime_packages are installed.

runtime_commands and runtime_packages support installing different dependencies for GPU vs. CPU environments. For example:

trial_environment:
  base_image: determinedai/pedl-tr-py3.6-tf
  runtime_packages:
    cpu:
      - tensorflow
    gpu:
      - tensorflow-gpu

Local Packages

PEDL can also install dependencies from a user's local environment. Each local package must be either a Python source distribution file or a Python wheel file that can be installed into a linux_x86_64, py3 environment. To use a local package dependency, specify the --package flag when creating the experiment with pedl experiment create:

$ pedl experiment create config.yaml model_def.py \
    --package user-package-1.tar.gz \
    --package user-package-2.tar.gz

The command above will install user-package-1.tar.gz and user-package-2.tar.gz on all trial runners in the experiment. There is no limit on the number of packages that can be specified. However, the total size of the model definition and all packages must be under 96 MB.
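
The size limit can be checked locally before creating the experiment. The following is a minimal sketch, not part of PEDL: the helper names total_bytes and fits_limit are hypothetical, the sketch handles regular files only, and it assumes that "96 MB" means 96 * 1024 * 1024 bytes.

```shell
# Print the combined size, in bytes, of the files given as arguments.
# Hypothetical helper; fails if any file cannot be read.
total_bytes() {
    local total=0 size f
    for f in "$@"; do
        size=$(wc -c < "$f") || return 1
        total=$((total + size))
    done
    echo "$total"
}

# Succeed only if the files together are under the limit.
# Assumption: 96 MB is interpreted as 96 * 1024 * 1024 bytes.
fits_limit() {
    local limit=$((96 * 1024 * 1024)) total
    total=$(total_bytes "$@") || return 1
    [ "$total" -lt "$limit" ]
}

# Example use, with the file names from the command above:
#   fits_limit model_def.py user-package-1.tar.gz user-package-2.tar.gz \
#       || echo "model definition and packages are too large" >&2
```

If the model definition is a directory rather than a single file, the file list would need to be expanded first (for example with find).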

Trial Runner Environment

PEDL allows users to configure the environment used by trial runner containers. If the model definition directory contains a file named pedl-prepare-env.sh, this file will be executed (via source in bash) before the trial runner's Python environment is started. This feature can be used to set environment variables or run commands that prepare the trial environment.
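
As an illustration, a pedl-prepare-env.sh file might look like the sketch below. The specific variables and the directory are assumptions chosen for the example: DATASET_CACHE_DIR is a hypothetical name, and nothing here is required by PEDL.

```shell
# pedl-prepare-env.sh (illustrative sketch)
# Sourced by bash before the trial runner's Python environment starts.

# Exported variables are inherited by processes started afterwards.
export OMP_NUM_THREADS=4

# Hypothetical variable that a model definition could read;
# keeps an existing value if one is already set.
export DATASET_CACHE_DIR="${DATASET_CACHE_DIR:-/tmp/dataset-cache}"

# Create the directory so the model code can rely on it existing.
mkdir -p "$DATASET_CACHE_DIR"
```

Because the file is sourced rather than run as a separate process, commands such as exit or set -e would affect the shell that sources it, so they are best avoided in this file.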

Custom Docker Base Images

Warning

This is currently an experimental feature.

PEDL supports running trial containers that are based on user-provided Docker images. The base_image should be accessible to all agent nodes via docker pull. If a private image is used, Docker Registry credentials must be specified in the registry_auth section in the experiment configuration.

When provided with a custom base image, PEDL will build a runtime image that imports the base_image with a FROM instruction, injects the PEDL Python source code with COPY instructions, and installs the required Python libraries with pip. The maintainer of the custom base image is responsible for installing the following PEDL dependencies:

  • Git >= 1.5.0
  • Python 3.6.8 installed as python3.6
  • pip installed as pip
  • CUDA 9.0
  • cuDNN 7.4
  • Framework-specific libraries, depending on the interface used:
    • TensorFlow 1.12.0 if using TensorFlow or Keras APIs
    • PyTorch 0.4.1 if using the PyTorch API

If you would like to use versions of libraries different from those specified above in a custom base image, please contact the Determined AI team for a consultation.
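
Some of the dependencies listed above can be spot-checked from a shell inside the custom base image. The sketch below is not a PEDL tool: version_ge and check_image_tools are hypothetical helpers, the version comparison relies on sort -V from GNU coreutils, and CUDA, cuDNN, and the framework libraries are not checked.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Check Git, Python, and pip as named in the dependency list.
# Hypothetical helper; intended to be run inside the custom image.
check_image_tools() {
    local status=0 found

    # `git --version` prints e.g. "git version 2.17.1".
    found=$(git --version 2>/dev/null | awk '{print $3}')
    if [ -z "$found" ] || ! version_ge "$found" "1.5.0"; then
        echo "Git >= 1.5.0 not found (got: ${found:-none})" >&2
        status=1
    fi

    # `python3.6 --version` prints e.g. "Python 3.6.8".
    found=$(python3.6 --version 2>/dev/null | awk '{print $2}')
    if [ "$found" != "3.6.8" ]; then
        echo "python3.6 is not Python 3.6.8 (got: ${found:-none})" >&2
        status=1
    fi

    if ! command -v pip >/dev/null 2>&1; then
        echo "pip not found on PATH" >&2
        status=1
    fi

    return "$status"
}

# Example use inside the image:
#   check_image_tools && echo "basic tools look OK"
```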