PEDL launches trials of experiments, called "trial runners", and PEDL commands in customizable Docker containers. The configuration of the container is referred to as the environment.
In the current version of PEDL, the trial runners and commands are executed in containers with the following default settings:
- Ubuntu 16.04
- CUDA 10.0
- Python 3.6.9
- TensorFlow 1.13.1
- PyTorch 1.0.1.post2
- Keras 2.2.4
PEDL will automatically select GPU-specific versions of each library when running on agents with GPUs. The trial environment can be further customized by running an arbitrary list of "runtime" commands and installing additional Python packages, as described below.
In addition to the above settings, all trial runner containers are launched with additional PEDL-specific harness code that orchestrates model training and evaluation in the container. Trial runner containers are also loaded with the experiment's model definition and values of the hyperparameters for the current trial.
All keys and allowable fields for customizing the environment are listed under the
environment key in the experiment configuration reference.
If a trial runner or command container has publicly accessible dependencies that are not covered by the defaults above, the experiment configuration file provides two options for adding them to the environment: runtime_packages and runtime_commands. Before launching the trial or command, PEDL will generate and cache a new container with the requested packages installed and commands executed.
Here is an example experiment configuration that uses the runtime_packages option:

    environment:
      runtime_packages:
        - pandas
        - numpy

This instructs PEDL to install these packages (via
pip) into the container before running any workloads.
Packages can additionally be restricted with version specifiers or installed from custom external sources using the standard
pip install syntax:

    environment:
      runtime_packages:
        - pandas==0.20.3    # Install this exact version
        - pandas>=0.20.0    # Install a version greater than or equal to this version
        - pandas!=0.20.1    # Exclude this version from installation
        - git+https://github.com/pandas-dev/pandas    # Installing from GitHub

runtime_packages makes it easy to customize the environment to include additional Python packages. For more complex changes to the environment (e.g., installing native libraries), users can specify the runtime_commands option:

    environment:
      runtime_commands:
        - echo "Installing python3-pandas via apt-get"
        - apt-get install python3-pandas

The contents of
runtime_commands will be executed in the order provided and before any
runtime_packages are installed.
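
To make the ordering concrete, the following is a minimal sketch that writes a configuration fragment using both options together. The file name env-example.yaml and the libsndfile1 and soundfile packages are illustrative assumptions, not taken from the PEDL documentation, and the other fields an experiment configuration needs are omitted.

```shell
# Illustrative sketch only: the file name and the package choices are
# assumptions, and the remaining experiment configuration fields are omitted.
cat > env-example.yaml <<'EOF'
environment:
  runtime_commands:            # executed first, in the order listed
    - apt-get update
    - apt-get install -y libsndfile1
  runtime_packages:            # installed afterwards via pip
    - soundfile
EOF
```

Here the native library is installed by runtime_commands before pip installs the Python package that depends on it, which matches the order described above.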
runtime_packages supports installing different dependencies for GPU vs. CPU environments. For example:

    environment:
      runtime_packages:
        cpu:
          - tensorflow
        gpu:
          - tensorflow-gpu

For trial runners, PEDL also provides the option to install dependencies from a user's local environment, as long as the local package is a Python source distribution file, or a Python wheel file that supports installation into a
py3 environment. To use a local package dependency, specify the
--package flag when creating the experiment with
pedl experiment create:

    $ pedl experiment create config.yaml model_def.py \
        --package user-package-1.tar.gz \
        --package user-package-2.tar.gz

The command above will install user-package-1.tar.gz and user-package-2.tar.gz on all trial runners in the experiment. There is no limit on the number of packages that can be specified. However, the total size of the model definition and all packages must be under 96 MB.
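
Because of the size limit, it can be handy to check the combined size before submitting an experiment. The helper below is a hypothetical sketch and is not part of PEDL: the function names are invented for illustration, it only handles regular files (not a model definition directory), and it assumes "96 MB" means 96 * 1024 * 1024 bytes.

```shell
# Hypothetical pre-flight check, not a PEDL feature.
# Assumption: "96 MB" is interpreted as 96 * 1024 * 1024 bytes.
LIMIT_BYTES=$((96 * 1024 * 1024))

# Print the combined size, in bytes, of the files given as arguments.
total_size() {
    local total=0 size f
    for f in "$@"; do
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Succeed only if the combined size of the given files is under the limit.
under_limit() {
    [ "$(total_size "$@")" -lt "$LIMIT_BYTES" ]
}
```

For example, `under_limit model_def.py user-package-1.tar.gz user-package-2.tar.gz` could be run before `pedl experiment create`, and the experiment submitted only if the check succeeds.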
For trial runners, PEDL allows users to configure the environment variables inside the container. If the model definition directory contains a file named
pedl-prepare-env.sh, this file will be executed (via
bash) before the trial runner's Python environment is started. This feature can be used to set environment variables or run commands that prepare the trial environment.
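
As a rough illustration, a pedl-prepare-env.sh file might look like the sketch below. The variable names, values, and scratch directory are assumptions chosen for the example, and the sketch relies on exported variables being visible to the trial's Python process, as the description above suggests.

```shell
#!/bin/bash
# pedl-prepare-env.sh (illustrative sketch; names and values are assumptions)

# Export variables so that processes started afterwards can read them.
export DATA_DIR="/tmp/pedl-data"
export OMP_NUM_THREADS=4

# Run a preparation command: create a scratch directory for the trial.
mkdir -p "$DATA_DIR"
```

The model definition could then read these values with the usual environment lookups, for instance `os.environ["DATA_DIR"]` in Python.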
Custom Docker Images
This is currently an experimental feature.
PEDL supports running trials and executing commands based on user-provided Docker images. The image in
custom_image should be accessible to all agent nodes via
docker pull. If a private image is used, Docker Registry credentials must be specified in the
registry_auth section in the experiment configuration.
When provided with a custom image, PEDL will build a runtime image that imports the
custom_image with a
FROM instruction, injects the Python PEDL source code with
COPY instructions, and installs Python requirement libraries with
pip. The maintainer of the custom image is responsible for installing the following PEDL dependencies:
- Git >= 1.5.0
- Python 3.6.9 installed as
- CUDA 10.0
- CuDNN 7.4
- Framework-specific libraries, depending on the interface used:
- TensorFlow 1.13.1 if using
- PyTorch 1.0.1.post2 if using the
- TensorFlow 1.13.1 if using
If you would like to use versions of libraries different from those specified above in a custom base image, please contact the Determined AI team for a consultation.