Configuration Templates

At a typical organization, many PEDL configuration files will contain similar settings. For example, all of the training workloads run at a given organization might use the same checkpoint storage or security configuration. One way to reduce this redundancy is to use configuration templates. With this feature, users can move settings that are shared by many experiments into a single YAML file that can then be referenced by configurations that require those settings.

Each configuration template has a unique name and is stored by the PEDL master. If a configuration specifies a template, the effective configuration of the task will be the result of merging the two YAML files (the configuration file and the template). The semantics of this merge operation are described below. PEDL stores the expanded configuration so that future changes to a template do not affect the reproducibility of experiments that used a previous version of the template.

A single configuration file can use at most one configuration template. A configuration template cannot itself use another configuration template.

Working with Templates in the CLI

The PEDL command-line interface can be used to list, create, update, and delete configuration templates. This functionality is accessed through the pedl template sub-command, which can be abbreviated as pedl tpl.

To list all the templates stored in PEDL, use pedl template list. You can also use the -d or --detail option to show additional details.

$ pedl tpl list
Name
-------------------------
template-s3-tf-gpu
template-s3-pytorch-gpu
template-s3-keras-gpu

To create or update a template, use pedl tpl set template_name template_file.

$ cat > template-s3-keras-gpu.yaml << EOL
description: template-s3-keras-gpu
environment:
  python: 3.6.9
  tensorflow: 1.14.0
  keras: 2.2.4
checkpoint_storage:
  type: s3
  access_key: my-access-key
  secret_key: my-secret-key
  bucket: determined-ai-examples
EOL
$ pedl tpl set template-s3-keras-gpu template-s3-keras-gpu.yaml
Set template template-s3-keras-gpu
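
Templates can also be deleted from the master. The exact sub-command name for deletion may vary between PEDL versions (check pedl tpl --help); assuming it is named remove, deleting the template created above would look like this:

$ pedl tpl remove template-s3-keras-gpu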

Using Templates to Simplify Experiment Configurations

An experiment can use a configuration template by passing the --template command-line option, with the name of the desired template, to pedl experiment create.

Here is an example demonstrating how an experiment configuration can be split into a reusable template and a simplified configuration.

Consider the experiment configuration below:

description: mnist_tf_const
environment:
  runtime_packages:
    - pandas
    - scipy
  os: ubuntu16.04
  cuda: 10.0
  tensorflow: 1.14.0
checkpoint_storage:
  type: s3
  access_key: AKIAJGK24F32WWJ25AEA
  secret_key: 08OM18juqj9p2ivz0kBkxsFVuY/yoC9A0fDHmSeA
  bucket: determined-ai-examples
data:
  base_url: https://s3-us-west-2.amazonaws.com/determined-ai-datasets/mnist/
  training_data: train-images-idx3-ubyte.gz
  training_labels: train-labels-idx1-ubyte.gz
  validation_set_size: 10000
hyperparameters:
  base_learning_rate: 0.001
  weight_cost: 0.0001
  batch_size: 64
  n_filters1: 40
  n_filters2: 40
searcher:
  name: single
  metric: error
  max_steps: 5
  smaller_is_better: true

You may find that the values of the environment and checkpoint_storage fields are the same for many experiments, and you want to use a configuration template to reduce the redundancy. In this example, such a template can be written as follows:

description: template-tf-gpu
environment:
  runtime_packages:
    - pandas
    - scipy
  os: ubuntu16.04
  cuda: 10.0
  tensorflow: 1.14.0
checkpoint_storage:
  type: s3
  access_key: AKIAJGK24F32WWJ25AEA
  secret_key: 08OM18juqj9p2ivz0kBkxsFVuY/yoC9A0fDHmSeA
  bucket: determined-ai-examples

The configuration for this specific experiment can then be reduced to the following:

description: mnist_tf_const
data:
  base_url: https://s3-us-west-2.amazonaws.com/determined-ai-datasets/mnist/
  training_data: train-images-idx3-ubyte.gz
  training_labels: train-labels-idx1-ubyte.gz
  validation_set_size: 10000
hyperparameters:
  base_learning_rate: 0.001
  weight_cost: 0.0001
  batch_size: 64
  n_filters1: 40
  n_filters2: 40
searcher:
  name: single
  metric: error
  max_steps: 5
  smaller_is_better: true

To launch the experiment with the template:

$ pedl experiment create --template template-tf-gpu mnist_tf_const.yaml <model_code>
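
For reference, the effective configuration that PEDL stores for this experiment is the result of merging template-tf-gpu with mnist_tf_const.yaml; following the merge rules described in the next section, it is equivalent to the original, unsplit configuration shown at the start of this example. In outline (the comments indicate where each section comes from):

description: mnist_tf_const     # set in both files; the configuration overrides the template
environment:                    # from the template
  runtime_packages:
    - pandas
    - scipy
  os: ubuntu16.04
  cuda: 10.0
  tensorflow: 1.14.0
checkpoint_storage:             # from the template
  type: s3
  access_key: AKIAJGK24F32WWJ25AEA
  secret_key: 08OM18juqj9p2ivz0kBkxsFVuY/yoC9A0fDHmSeA
  bucket: determined-ai-examples
data: ...                       # from the experiment configuration
hyperparameters: ...            # from the experiment configuration
searcher: ...                   # from the experiment configuration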

Merge Behavior

Suppose we have a template that specifies top-level fields a and b and a configuration that specifies fields b and c. The merged configuration will have fields a, b, and c. The value of field a is simply the value set in the template; likewise, the value of field c is whatever was specified in the configuration. The final value of field b, however, depends on the value's type (a worked example follows the list below):

  • If the field holds a string, integer, or float value, the merged value is the one specified by the configuration (the configuration overrides the template).
  • If the field holds a list value, the merged value is the concatenation of the list specified in the template and the list specified in the configuration.
  • If the field holds an object value, the merged value is the object produced by recursively applying this merge algorithm to the two objects.
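
As an illustration of these rules, consider the following hypothetical template and configuration (the field values are made up for this example). Suppose the template contains:

description: template-example
environment:
  runtime_packages:
    - pandas
checkpoint_storage:
  type: s3
  bucket: determined-ai-examples

and the experiment configuration contains:

description: my-experiment
environment:
  runtime_packages:
    - scipy
  tensorflow: 1.14.0

The merged configuration that PEDL stores would then be:

description: my-experiment        # string: the configuration overrides the template
environment:                      # object: merged recursively
  runtime_packages:               # list: template entries followed by configuration entries
    - pandas
    - scipy
  tensorflow: 1.14.0              # present only in the configuration
checkpoint_storage:               # present only in the template
  type: s3
  bucket: determined-ai-examples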