Environment Configuration¶
Determined launches workloads using Docker containers. The container configuration is referred to as the environment.
There are three methods to customize the environment that workloads execute in:
Environment variables
Specifying a startup hook (
startup-hook.sh
)Using a custom Docker image
Environment Variables¶
For both trial runners and commands, Determined allows users to configure the
environment variables inside the container through the
environment.environment_variables
configuration field of the
experiment config. The format is a list of strings in the format
NAME=VALUE
:
environment:
environment_variables:
- A=hello world
- B=$A
- C=${B}
# `A`, `B`, and `C` will each have the value `hello_world` in the container.
Variables are set sequentially, which affect variables that depend on the expansion of other variables.
Proxy variables set in this way will take precedent over those set using the agent configuration.
Startup Hooks¶
If a file named startup-hook.sh
exists at the top level of your model
definition directory, Determined will automatically execute this file during the
startup of every Docker container. This occurs before any Python interpreters
are launched or any deep learning operations are performed; this allows the
startup hook to customize the container environment, install additional
dependencies, download data sets, or do practically anything else that you can
do in a shell script.
Note
Startup hooks are not cached and run before the start of every workload. Hence, performing expensive or long-running operations in a startup hook can result in poor performance.
Here is an example of a startup hook that installs the wget
utility and the
Python package pandas
:
apt-get update && apt-get install -y wget
python3.6 -m pip install pandas
The Iris example
contains a TensorFlow
Keras model that uses a startup hook to install an additional Python dependency.
Container Images¶
Determined provides a set of officially supported Docker images. These are the default images used to launch containers for experiments, commands, and any other workflow in the Determined system.
Default Images¶
In the current version of Determined, the experiments and commands are executed in containers with the following:
Ubuntu 18.04
CUDA 10.0
Python 3.6.9
TensorFlow 1.15.0
PyTorch 1.4.0
Determined will automatically select GPU-specific versions of each library when running on agents with GPUs.
In addition to the above settings, all trial runner containers are launched with additional Determined-specific harness code that orchestrates model training and evaluation in the container. Trial runner containers are also loaded with the experiment’s model definition and values of the hyperparameters for the current trial.
Note
The default images are
determinedai/environments:cuda-10.0-pytorch-1.4-tf-1.15-gpu-0.5.0
and
determinedai/environments:py-3.6.9-pytorch-1.4-tf-1.15-cpu-0.5.0
for GPU
and CPU respectively.
TensorFlow 2 Images¶
Determined also supports TensorFlow 2.2 and has a Docker image you can use for experiments and commands containing the following:
Ubuntu 18.04
CUDA 10.1
Python 3.6.9
TensorFlow 2.2.0
PyTorch 1.4.0
This can be configured in your experiment configuration like below:
environment:
image:
gpu: "determinedai/environments:cuda-10.1-pytorch-1.4-tf-2.2-gpu-0.5.0"
cpu: "determinedai/environments:py-3.6.9-pytorch-1.4-tf-2.2-cpu-0.5.0"
Custom Images¶
While the official images contain all the dependencies needed for basic deep
learning workloads, many workloads have extra dependencies. If those extra
dependencies are quick to install, you may want to consider using a
startup hook. For situations where installing
dependencies via startup-hook.sh
would take too long, we suggest building
your own Docker image and publishing to a Docker registry like Docker Hub. We recommend that custom images use one of the
official Determined images as a base image (using the FROM
instruction).
Warning
It is important to not install the TensorFlow, PyTorch, Horovod, or Apex packages as doing so will conflict with the base packages that are installed into Determined’s official environments.
Here is an example of a Dockerfile
that installs both conda
- and
pip
-based dependencies.
FROM determinedai/environments:cuda-10.0-pytorch-1.4-tf-1.15-gpu-0.5.0
RUN apt-get update && apt-get install -y unzip python-opencv graphviz
COPY environment.yml /tmp/environment.yml
COPY pip_requirements.txt /tmp/pip_requirements.txt
RUN conda env update --name base --file /tmp/environment.yml && \
conda clean --all --force-pkgs-dirs --yes
RUN eval "$(conda shell.bash hook)" && \
conda activate base && \
pip install --requirement /tmp/pip_requirements.txt
Assuming this image has been published to a public repository on Docker Hub, you can configure an experiment, command, or notebook to use the image as follows:
environment:
image: "my-user-name/my-repo-name:my-tag"
where my-user-name
is your Docker Hub user, my-repo-name
is the name of
the Docker Hub repository, and my-tag
is the image tag to use (e.g.,
latest
).
If your image has been published to a private Docker Hub repository, you can also specify the credentials to use to access the repository:
environment:
image: "my-user-name/my-repo-name:my-tag"
registry_auth:
username: my-user-name
password: my-password
If your image has been published to a private Docker Registry, specify the registry path as part of the
image
field:
environment:
image: "myregistry.local:5000/my-user-name/my-repo-name:my-tag"
Images will be fetched via HTTPS by default. An HTTPS proxy can be configured
using the https_proxy
field as part of the agent configuration.