Custom Environment¶
Determined launches workloads using Docker containers. By default, workloads execute inside a Determined-provided container that includes common deep learning libraries and frameworks.
If your model code has additional dependencies, the easiest way to install them is to specify a startup hook. For more complex dependencies, you can also use a custom Docker image.
If you’re using Determined on Kubernetes, review the supplementary Custom Pod Specs guide.
Environment Variables¶
For both trial runners and commands, Determined allows users to configure the environment variables inside the container through the environment.environment_variables configuration field of the experiment or task config. The format is a list of strings in the format NAME=VALUE:
environment:
  environment_variables:
    - A=hello world
    - B=$A
    - C=${B}
# `A`, `B`, and `C` will each have the value `hello world` in the container.
Variables are set sequentially, so their order matters when one variable depends on the expansion of another.
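As a rough shell analogy (an illustration of the ordering rule, not Determined's actual implementation), assigning the variables one at a time shows why order matters. The variables X and Y in the second half are hypothetical and only demonstrate what happens when a dependency is listed too late.

```shell
# Assign in the same order as the configuration above.
A="hello world"
B=$A
C=${B}
printf '%s\n' "$A" "$B" "$C"

# Reversed order: X is expanded before Y has a value, so X ends up empty.
# (${Y-} expands to the empty string when Y is unset.)
unset Y
X=${Y-}
Y="hello world"
printf 'X=[%s]\n' "$X"
```

The first three lines printed are all hello world, while X stays empty because Y was assigned only after X had been expanded.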
Proxy variables set in this way will take precedence over those set using the agent configuration.
It is also possible to set these variables for each accelerator type separately:
environment:
  environment_variables:
    cpu:
      - A=hello x86
    gpu:
      - A=hello nvidia
    rocm:
      - A=hello amd
Startup Hooks¶
If a file named startup-hook.sh exists at the top level of your model definition directory, Determined will automatically execute this file during the startup of every Docker container. This occurs before any Python interpreters are launched or any deep learning operations are performed; this allows the startup hook to customize the container environment, install additional dependencies, download data sets, or do practically anything else that you can do in a shell script.
Note
Startup hooks are not cached and run before the start of every workload. Hence, performing expensive or long-running operations in a startup hook can result in poor performance.
Here is an example of a startup hook that installs the wget utility and the Python package pandas:
apt-get update && apt-get install -y wget
python3 -m pip install pandas
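Because startup hooks run before every workload, it can help to skip installation work when a tool is already present in the image. The following is a minimal sketch of that idea, not part of Determined itself: the helper names missing_commands and install_missing are hypothetical, and the sketch assumes that each apt package name matches the command it provides (which holds for wget).

```shell
# Print each named command that is not found on PATH, one per line.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Install apt packages only for the commands that are absent.
# Assumption: package name == command name.
install_missing() {
    missing="$(missing_commands "$@")"
    if [ -z "$missing" ]; then
        return 0
    fi
    # Intentionally unquoted so each missing name becomes its own argument.
    apt-get update && apt-get install -y $missing
}
```

In a startup-hook.sh, calling install_missing wget after these definitions would run apt-get only when wget is not already available in the image.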
The Iris example contains a TensorFlow Keras model that uses a startup hook to install an additional Python dependency.
Container Images¶
Determined provides a set of officially supported Docker images. These are the default images used to launch containers for experiments, commands, and any other workflow in the Determined system.
Default Images¶
In the current version of Determined, experiments and tasks are executed in containers with the following:
Ubuntu 18.04
CUDA 11.3
Python 3.8.x
TensorFlow 2.8.x
PyTorch 1.10.x
Determined will automatically select GPU-specific versions of each library when running on agents with GPUs.
In addition to the above settings, all trial runner containers are launched with additional Determined-specific harness code that orchestrates model training and evaluation in the container. Trial runner containers are also loaded with the experiment’s model definition and values of the hyperparameters for the current trial.
Note
The default images are determinedai/environments:cuda-11.3-pytorch-1.10-lightning-1.5-tf-2.8-gpu-0.17.15 and determinedai/environments:py-3.8-pytorch-1.10-lightning-1.5-tf-2.8-cpu-0.17.15 for GPU and CPU, respectively.
Older Images¶
Images that provide older versions of the frameworks are still available and supported. Note that the performance of some models can vary with different CUDA versions.
determinedai/environments:py-3.6.9-pytorch-1.4-tf-1.15-cpu-067db2b
determinedai/environments:py-3.6.9-pytorch-1.4-tf-2.2-cpu-067db2b
determinedai/environments:cuda-10.0-pytorch-1.4-tf-1.15-gpu-067db2b
determinedai/environments:cuda-10.1-pytorch-1.4-tf-2.2-gpu-067db2b
Custom Images¶
While the official images contain all the dependencies needed for basic deep learning workloads, many workloads have extra dependencies. If those extra dependencies are quick to install, you may want to consider using a startup hook. For situations where installing dependencies via startup-hook.sh would take too long, we suggest building your own Docker image and publishing to a Docker registry like Docker Hub.
Warning
Do not install the TensorFlow, PyTorch, Horovod, or Apex packages in a custom image; doing so will conflict with the base packages that are installed in Determined’s official environments.
We recommend that custom images use one of the official Determined images as a base image (using the FROM instruction). Here is an example of a Dockerfile that installs custom conda-, pip-, and apt-based dependencies.
# Determined Image
FROM determinedai/environments:cuda-11.3-pytorch-1.10-lightning-1.5-tf-2.8-gpu-0.17.15
# Custom Configuration
RUN apt-get update && \
    DEBIAN_FRONTEND="noninteractive" apt-get -y install tzdata && \
    apt-get install -y unzip python-opencv graphviz
COPY environment.yml /tmp/environment.yml
COPY pip_requirements.txt /tmp/pip_requirements.txt
RUN conda env update --name base --file /tmp/environment.yml
RUN conda clean --all --force-pkgs-dirs --yes
RUN eval "$(conda shell.bash hook)" && \
    conda activate base && \
    pip install --requirement /tmp/pip_requirements.txt
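One way to build and publish such an image is with the standard docker build and docker push commands. The sketch below assumes that the Dockerfile sits in the current directory and that docker login has already been run for the target registry. The helper names image_ref and build_and_push are hypothetical; the first argument is the registry host, left empty for Docker Hub.

```shell
# Compose an image reference from registry, user, repository, and tag.
# An empty registry yields a Docker Hub style reference.
image_ref() {
    registry="$1"; user="$2"; repo="$3"; tag="$4"
    if [ -n "$registry" ]; then
        printf '%s/%s/%s:%s\n' "$registry" "$user" "$repo" "$tag"
    else
        printf '%s/%s:%s\n' "$user" "$repo" "$tag"
    fi
}

# Build the image from ./Dockerfile and push it under the composed reference.
build_and_push() {
    ref="$(image_ref "$@")"
    docker build -t "$ref" . && docker push "$ref"
}
```

For example, build_and_push "" my-user-name my-repo-name my-tag would build and push my-user-name/my-repo-name:my-tag, using the placeholder names from this guide.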
Assuming this image has been published to a public repository on Docker Hub, you can configure an experiment, command, or notebook to use the image as follows:
environment:
  image: "my-user-name/my-repo-name:my-tag"
where my-user-name is your Docker Hub user, my-repo-name is the name of the Docker Hub repository, and my-tag is the image tag to use (e.g., latest).
If your image has been published to a private Docker Hub repository, you can also specify the credentials to use to access the repository:
environment:
  image: "my-user-name/my-repo-name:my-tag"
  registry_auth:
    username: my-user-name
    password: my-password
If your image has been published to a private Docker Registry, specify the registry path as part of the image field:
environment:
  image: "myregistry.local:5000/my-user-name/my-repo-name:my-tag"
Images will be fetched via HTTPS by default. An HTTPS proxy can be configured using the https_proxy field as part of the agent configuration.
Your custom image and credentials can also be set as the defaults for all tasks launched in Determined. This can be done under image and registry_auth in the Master Configuration. You must restart the master for this change to take effect.
Virtual Environments¶
Model developers commonly use virtual environments.
The following example configures a virtual environment in a custom image:
# Determined Image
FROM determinedai/environments:py-3.8-pytorch-1.10-lightning-1.5-tf-2.8-cpu-0.17.15
# Create a virtual environment
RUN conda create -n myenv python=3.6
RUN eval "$(conda shell.bash hook)" && \
    conda activate myenv && \
    pip install scikit-learn
# Set the default virtual environment
RUN echo 'eval "$(conda shell.bash hook)" && conda activate myenv' >> ~/.bashrc
Note
To ensure that the desired virtual environment is activated in every new interactive terminal session in JupyterLab or Determined Shell, update ~/.bashrc with the commands that activate that environment.
To switch to a virtual environment using a startup hook, see the example below:
# Switch to the desired virtual environment
eval "$(conda shell.bash hook)"
conda activate myenv
# Do that for every new interactive terminal session
echo 'eval "$(conda shell.bash hook)" && conda activate myenv' >> ~/.bashrc
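The hook above appends to ~/.bashrc each time it runs. If that file can persist between runs (an assumption; a fresh container may start with a clean file), a small guard avoids accumulating duplicate lines. The helper name append_once is hypothetical and not part of Determined.

```shell
# Append a line to a file only if that exact line is not already present.
append_once() {
    line="$1"; file="$2"
    # Make sure the file exists so grep does not fail on a missing path.
    touch "$file"
    # -F: match as a fixed string, -x: match the whole line, -q: no output.
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

In startup-hook.sh, the final echo could then be replaced with append_once 'eval "$(conda shell.bash hook)" && conda activate myenv' ~/.bashrc.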