YAML is a data serialization language often used for configuration. Determined uses YAML for configuring tasks such as experiments and notebooks, as well as configuring the Determined cluster as a whole. This guide describes a subset of YAML that we recommend for use with Determined. This is not a full description of YAML; see the specification or other online guides for more details.
A value in YAML can be a scalar (null or a number, string, or Boolean) or a collection (an array or map). Collections can contain other collections nested to any depth (though Determined’s YAML files generally have a fairly fixed structure).
A comment in a YAML file starts with a # character and extends to the end of the line.
If you are familiar with JSON, you can think of YAML as an alternative way of expressing JSON objects that is meant to be easier for humans to read and write, since it allows comments and has fewer markup characters around the content.
Maps represent unordered mappings from strings to YAML values. A map is written as a sequence of key-value pairs. Each key is followed by a colon and the corresponding value. The value can be on the same line as the key if it is a scalar (in which case it must be preceded by a space) or on subsequent lines (in which case it must be indented, conventionally by two spaces).
We use a map in the experiment configuration to configure hyperparameters:
hyperparameters:
  base_learning_rate: 0.001
  weight_cost: 0.0001
  global_batch_size: 64
  n_filters1: 40
  n_filters2: 40
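The snippet above describes a map with one key, hyperparameters. The corresponding value is itself a map whose keys are base_learning_rate, weight_cost, global_batch_size, n_filters1, and n_filters2.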
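To make the nesting concrete, here is a minimal sketch of the same data written as JSON and loaded with Python's standard json module. The variable names are ours, and only two of the five hyperparameters are shown to keep it short; this is an illustration of the data shape, not of how Determined reads its configuration.

```python
import json

# JSON text expressing part of the hyperparameters map shown above.
# The outer object has one key; its value is a nested object.
json_text = """
{
  "hyperparameters": {
    "base_learning_rate": 0.001,
    "global_batch_size": 64
  }
}
"""

config = json.loads(json_text)

# A YAML map corresponds to a Python dict, so nested values are
# reached by indexing one key at a time.
hyperparameters = config["hyperparameters"]
batch_size = hyperparameters["global_batch_size"]
print(sorted(hyperparameters))
print(batch_size)
```

The JSON form needs braces, quotes, and commas that the YAML form leaves out, which is the readability difference described earlier.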
An array contains multiple other YAML values in some order. An array is written as a sequence of values, each one preceded by a hyphen and a space. The hyphens for one list must all be indented by the same amount.
We use an array in the experiment configuration to configure environment variables:
environment:
  environment_variables:
    - A=A
    - B=B
    - C=C
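As a rough sketch of what this array looks like once parsed, the Python below builds the equivalent value from JSON and splits each entry into a name and a value. The helper split_env is hypothetical, and splitting on the first "=" is an assumption based on the usual NAME=VALUE convention, not a description of Determined's own handling.

```python
import json

# The environment snippet above, expressed as JSON. A YAML array
# corresponds to a Python list, and its order is preserved.
config = json.loads(
    '{"environment": {"environment_variables": ["A=A", "B=B", "C=C"]}}'
)
variables = config["environment"]["environment_variables"]


def split_env(entries):
    """Hypothetical helper: split NAME=VALUE strings on the first '='."""
    return dict(entry.split("=", 1) for entry in entries)


env = split_env(variables)
print(variables)
print(env)
```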
Scalars generally behave naturally: null, true, 2.718, and "foo" all have the same meanings that they would in JSON (and many programming languages). However, YAML allows strings to be unquoted: foo is the same as "foo". This behavior is often convenient, but it can lead to unexpected behavior when small edits to a value change its type. For example, the following YAML block represents a list containing several values whose types are listed in the comments:
- true     # Boolean
- grue     # string
- 0.0      # number
- 0.0.     # string
- foo: bar # map
- foo:bar  # string
- foo bar  # string
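The typing of the entries above can be approximated with a few rules. The function below, guess_type, is a hypothetical and deliberately simplified sketch that covers only the cases in this list; it is not a YAML parser, and a real one (for example PyYAML's yaml.safe_load) applies many more rules.

```python
import re


def guess_type(text):
    """Roughly classify one unindented YAML value (simplified sketch)."""
    text = text.strip()
    # Quoting always forces a string, whatever the content looks like.
    if len(text) >= 2 and text[0] == text[-1] and text[0] in "\"'":
        return "string"
    if text == "null":
        return "null"
    if text in ("true", "false"):
        return "Boolean"
    # Plain decimal numbers only; a trailing dot as in "0.0." fails here.
    if re.fullmatch(r"-?\d+(\.\d+)?", text):
        return "number"
    # A colon followed by a space separates a key from its value.
    if ": " in text:
        return "map"
    return "string"


for value in ["true", "grue", "0.0", "0.0.", "foo: bar", "foo:bar", "foo bar"]:
    print(f"{value!r:12} {guess_type(value)}")
```

Under these rules, quoting is the reliable way to keep a value a string: "true" in quotes is a string, while the bare word true is a Boolean.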
Putting It All Together
A Determined configuration file consists of a YAML object with a particular structure: a map at the top level that is expected to have certain keys, with the value for each key expected to have a certain structure in turn.
In this example experiment configuration, we use numbers, strings, maps, and an array:
description: mnist_tf_const
data:
  base_url: https://s3-us-west-2.amazonaws.com/determined-ai-datasets/mnist/
  training_data: train-images-idx3-ubyte.gz
  training_labels: train-labels-idx1-ubyte.gz
  validation_set_size: 10000
hyperparameters:
  base_learning_rate: 0.001
  weight_cost: 0.0001
  global_batch_size: 64
  n_filters1: 40
  n_filters2: 40
searcher:
  name: single
  metric: error
  max_length:
    batches: 500
  smaller_is_better: true
environment:
  environment_variables:
    - A=A
    - B=B
    - C=C
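The idea of a top-level map with expected keys, each holding a value of an expected shape, can be sketched as a small check in Python. The dict below is an abridged version of the example as a parser would produce it. EXPECTED_TYPES and find_problems are hypothetical names chosen for illustration; they do not reflect Determined's actual schema or validation.

```python
# Abridged example configuration as nested Python values
# (maps become dicts, arrays become lists).
config = {
    "description": "mnist_tf_const",
    "hyperparameters": {"base_learning_rate": 0.001, "global_batch_size": 64},
    "searcher": {
        "name": "single",
        "metric": "error",
        "max_length": {"batches": 500},
        "smaller_is_better": True,
    },
    "environment": {"environment_variables": ["A=A", "B=B", "C=C"]},
}

# Assumed expectations for this sketch: top-level key -> required type.
EXPECTED_TYPES = {
    "description": str,
    "hyperparameters": dict,
    "searcher": dict,
    "environment": dict,
}


def find_problems(cfg, expected):
    """Return a list of messages for missing or mistyped top-level keys."""
    problems = []
    for key, kind in expected.items():
        if key not in cfg:
            problems.append(f"missing key: {key}")
        elif not isinstance(cfg[key], kind):
            actual = type(cfg[key]).__name__
            problems.append(f"{key}: expected {kind.__name__}, got {actual}")
    return problems


print(find_problems(config, EXPECTED_TYPES))
```

A check like this catches the type slips described earlier, such as a description that was accidentally written as a number.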