Custom Environment

Determined launches workloads using Docker containers. By default, workloads execute inside a Determined-provided container that includes common deep learning libraries and frameworks, so no configuration is needed unless your workload requires a custom environment.

If your model code has additional dependencies, the easiest way to install them is to provide a startup-hook.sh script. For more complex dependencies, you can also use a custom Docker image.

Environment Variables

For both trial runners and commands, Determined allows users to configure the environment variables inside the container through the environment.environment_variables configuration field of the experiment or task config. The field takes a list of strings, each in the form NAME=VALUE:

environment:
  environment_variables:
    - A=hello world
    - B=$A
    - C=${B}
    # `A`, `B`, and `C` will each have the value `hello world` in the container.

Variables are set sequentially, in the order listed, so the order matters for any variable whose value depends on the expansion of another variable.
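The ordering rule can be illustrated with plain shell assignments. This is only an analogy: the variable names X and Y are made up for the sketch, and how Determined itself treats a reference to a variable that has not been set yet is not stated here.

```shell
# Assign in the same order as the YAML list: each later variable can
# refer to the ones assigned before it.
A="hello world"
B="$A"
C="${B}"
echo "A=$A B=$B C=$C"

# Reversed order: Y is assigned while X is still unset, so in a shell
# Y ends up empty even though X is assigned afterwards.
unset X Y
Y="${X:-}"
X="hello world"
echo "X=$X Y=[$Y]"
```

In the first group all three variables end up with the same value; in the second, the later assignment to X does not change Y.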

Proxy variables set in this way will take precedence over those set using the agent configuration.

Startup Hooks

If a file named startup-hook.sh exists at the top level of your model definition directory, Determined will automatically execute this file during the startup of every Docker container. This occurs before any Python interpreters are launched or any deep learning operations are performed; this allows the startup hook to customize the container environment, install additional dependencies, download data sets, or do practically anything else that you can do in a shell script.

Note

Startup hooks are not cached and run before the start of every workload. Hence, performing expensive or long-running operations in a startup hook can result in poor performance.

Here is an example of a startup hook that installs the wget utility and the Python package pandas:

apt-get update && apt-get install -y wget
python3 -m pip install pandas

The Iris example contains a TensorFlow Keras model that uses a startup hook to install an additional Python dependency.
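Because the hook runs before every workload, it can help to skip installs when a dependency is already present, for example because the image provides it. The following is a minimal sketch of such a hook, not an official template. The helper names (have_cmd, have_py_module, run) and the DRY_RUN variable are hypothetical and exist only in this sketch.

```shell
#!/bin/sh
# Sketch of a startup-hook.sh that installs only what is missing.

# Succeeds if the named executable is already on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds if the named Python module can already be imported.
have_py_module() {
    python3 -c "import $1" >/dev/null 2>&1
}

# DRY_RUN is a hypothetical switch for this sketch. It defaults to 1,
# which only prints each command; set DRY_RUN=0 to actually run them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Install wget only if it is not already available.
if ! have_cmd wget; then
    run apt-get update && run apt-get install -y wget
fi

# Install pandas only if it cannot be imported yet.
if ! have_py_module pandas; then
    run python3 -m pip install pandas
fi
```

The dry-run default lets the sketch be tried outside a container without changing the system; a real hook would run with DRY_RUN=0 or call the install commands directly.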

Container Images

Determined provides a set of officially supported Docker images. These are the default images used to launch containers for experiments, commands, and any other workflow in the Determined system.

Default Images

In the current version of Determined, experiments and tasks are executed in containers with the following:

  • Ubuntu 18.04

  • CUDA 11.1

  • Python 3.8

  • TensorFlow 2.4.2

  • PyTorch 1.9.0

Determined will automatically select GPU-specific versions of each library when running on agents with GPUs.

In addition to the above settings, all trial runner containers are launched with additional Determined-specific harness code that orchestrates model training and evaluation in the container. Trial runner containers are also loaded with the experiment’s model definition and values of the hyperparameters for the current trial.

Note

The default images are determinedai/environments:cuda-11.1-pytorch-1.9-lightning-1.3-tf-2.4-gpu-0.17.2 and determinedai/environments:py-3.8-pytorch-1.9-lightning-1.3-tf-2.4-cpu-0.17.2 for GPU and CPU respectively.

Older Images

Images that provide older versions of the frameworks are still available and supported. Note that the performance of some models can vary with different CUDA versions.

  • determinedai/environments:py-3.6.9-pytorch-1.4-tf-1.15-cpu-067db2b

  • determinedai/environments:py-3.6.9-pytorch-1.4-tf-2.2-cpu-067db2b

  • determinedai/environments:cuda-10.0-pytorch-1.4-tf-1.15-gpu-067db2b

  • determinedai/environments:cuda-10.1-pytorch-1.4-tf-2.2-gpu-067db2b

Custom Images

While the official images contain all the dependencies needed for basic deep learning workloads, many workloads have extra dependencies. If those extra dependencies are quick to install, you may want to consider using a startup hook. For situations where installing dependencies via startup-hook.sh would take too long, we suggest building your own Docker image and publishing it to a Docker registry such as Docker Hub.

Warning

Do not install the TensorFlow, PyTorch, Horovod, or Apex packages in a custom image. Doing so will conflict with the base packages that are installed into Determined’s official environments.

We recommend that custom images use one of the official Determined images as a base image (using the FROM instruction). Here is an example of a Dockerfile that installs custom conda-, pip- and apt-based dependencies.

# Determined Image
FROM determinedai/environments:cuda-11.1-pytorch-1.9-lightning-1.3-tf-2.4-gpu-0.17.2

# Custom Configuration
RUN apt-get update && \
   DEBIAN_FRONTEND="noninteractive" apt-get -y install tzdata && \
   apt-get install -y unzip python-opencv graphviz
COPY environment.yml /tmp/environment.yml
COPY pip_requirements.txt /tmp/pip_requirements.txt
RUN conda env update --name base --file /tmp/environment.yml
RUN conda clean --all --force-pkgs-dirs --yes
RUN eval "$(conda shell.bash hook)" && \
   conda activate base && \
   pip install --requirement /tmp/pip_requirements.txt

Assuming this image has been published to a public repository on Docker Hub, you can configure an experiment, command, or notebook to use the image as follows:

environment:
  image: "my-user-name/my-repo-name:my-tag"

where my-user-name is your Docker Hub user, my-repo-name is the name of the Docker Hub repository, and my-tag is the image tag to use (e.g., latest).

If your image has been published to a private Docker Hub repository, you can also specify the credentials to use to access the repository:

environment:
  image: "my-user-name/my-repo-name:my-tag"
  registry_auth:
    username: my-user-name
    password: my-password

If your image has been published to a private Docker Registry, specify the registry path as part of the image field:

environment:
  image: "myregistry.local:5000/my-user-name/my-repo-name:my-tag"
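The image field follows the usual Docker reference layout: an optional registry host and port, then user, repository, and tag. As a small illustration, the sketch below assembles the reference from its parts and prints the matching configuration fragment. The image_ref function is hypothetical, and the values are the placeholders used in the examples above.

```shell
# image_ref REGISTRY USER REPO TAG
# Print an image reference; an empty REGISTRY means Docker Hub.
image_ref() {
    if [ -n "$1" ]; then
        printf '%s/%s/%s:%s\n' "$1" "$2" "$3" "$4"
    else
        printf '%s/%s:%s\n' "$2" "$3" "$4"
    fi
}

# Private registry: the host and port are prefixed to the path.
PRIVATE_IMAGE=$(image_ref "myregistry.local:5000" "my-user-name" "my-repo-name" "my-tag")

# Docker Hub: no registry prefix.
HUB_IMAGE=$(image_ref "" "my-user-name" "my-repo-name" "my-tag")

# Print the configuration fragment for the private registry case.
printf 'environment:\n  image: "%s"\n' "$PRIVATE_IMAGE"
```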

Images will be fetched via HTTPS by default. An HTTPS proxy can be configured using the https_proxy field as part of the agent configuration.

Your custom image and credentials can also be set as the defaults for all tasks launched in Determined, using the image and registry_auth fields in the Master Configuration. The master must be restarted for this change to take effect.

Virtual Environments

Model developers commonly use virtual environments. The following Dockerfile is an example of configuring a virtual environment in a custom image:

# Determined Image
FROM determinedai/environments:py-3.8-pytorch-1.9-lightning-1.3-tf-2.4-cpu-0.17.2

# Create a virtual environment (--yes skips the confirmation prompt)
RUN conda create --yes -n myenv python=3.6
RUN eval "$(conda shell.bash hook)" && \
   conda activate myenv && \
   pip install scikit-learn

# Set the default virtual environment
RUN echo 'eval "$(conda shell.bash hook)" && conda activate myenv' >> ~/.bashrc

Note

To make sure the desired virtual environment is activated in every new interactive terminal session, whether in JupyterLab or through Determined Shell, add the activation commands to ~/.bashrc.

To switch to a virtual environment using a startup hook, see the example below:

# Switch to the desired virtual environment
eval "$(conda shell.bash hook)"
conda activate myenv

# Do that for every new interactive terminal session
echo 'eval "$(conda shell.bash hook)" && conda activate myenv' >> ~/.bashrc
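To check that the expected environment is actually active, one option is to inspect CONDA_DEFAULT_ENV, which conda sets when an environment is activated. The sketch below is illustrative only; the function name conda_env_is is hypothetical, and myenv is the environment name from the examples above.

```shell
# Succeeds if the active conda environment matches the expected name.
conda_env_is() {
    [ "${CONDA_DEFAULT_ENV:-}" = "$1" ]
}

# Report whether myenv is the active environment.
if conda_env_is myenv; then
    echo "myenv is active"
else
    echo "myenv is not active (current: ${CONDA_DEFAULT_ENV:-none})"
fi
```

Running this in a new terminal session shows whether the ~/.bashrc activation took effect.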