General Tips for the Trial Definition
Use framework abstractions to implement learning rate scheduling instead of directly changing the learning rate. See tf.keras.optimizers.schedules.LearningRateSchedule and
For code that needs to download artifacts (e.g., data, configurations, pretrained weights), download to a tempfile.TemporaryDirectory unique to the Python process. This will avoid race conditions when using distributed training, in which Determined executes multiple Python processes in the same task container.
Do not use instance attributes on a trial class to save any state over time (e.g., storing metric history in a self attribute). The Trial instance will only save and restore model weights and optimizer state over time; self attributes may be reset to their initial state at any time if the Determined cluster reschedules the trial to another task container.
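The tip above about downloading artifacts to a per-process temporary directory can be sketched as follows. This is a minimal illustration, not Determined's own code: the names download_artifacts and fake_fetch are hypothetical, and fake_fetch stands in for whatever download logic you actually use.

```python
import os
import tempfile
from pathlib import Path


def download_artifacts(fetch, filenames):
    """Fetch each file into a temporary directory owned by this process.

    `fetch(name, dest)` is a hypothetical callable that writes the artifact
    called `name` to the path `dest`; swap in your real download logic.
    """
    # TemporaryDirectory creates a uniquely named directory, so concurrent
    # processes in the same container never write to the same path.
    tmp = tempfile.TemporaryDirectory(prefix=f"artifacts-{os.getpid()}-")
    root = Path(tmp.name)
    for name in filenames:
        fetch(name, root / name)
    # Return the handle as well: the directory is removed when the handle is
    # cleaned up or garbage collected, so the caller must keep a reference.
    return tmp, root


def fake_fetch(name, dest):
    # Stand-in for a real download; writes a small placeholder file.
    dest.write_text(f"contents of {name}")


if __name__ == "__main__":
    handle, root = download_artifacts(fake_fetch, ["weights.bin", "vocab.txt"])
    print(sorted(p.name for p in root.iterdir()))
    handle.cleanup()
```

Because every process gets its own directory, two workers that download the same file never overwrite each other's partial output. The cost is that each process downloads its own copy.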
Separate Configuration from Code
We encourage a clean separation of code from configuration via the experiment configuration. Specifically, use the pre-defined fields in the experiment configuration, such as resources. This not only lets you reuse the trial definition when you tune different configuration fields but also improves visibility, because those fields can be browsed in our
Move any hardcoded filesystem paths (e.g., /data/train.csv) to the data field of the experiment configuration. Use context.get_data_config() to reference them in code.
Do not use global variables in your model definition; consider moving them to the experiment configuration.
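Reading a path from the data field rather than hardcoding it might look like the sketch below. It assumes context.get_data_config() returns a dictionary mirroring the data field. FakeContext, resolve_train_path, and the train_path key are hypothetical names used only for illustration.

```python
from typing import Any, Dict


class FakeContext:
    """Hypothetical stand-in for the trial context, for illustration only."""

    def __init__(self, data_config: Dict[str, Any]) -> None:
        self._data_config = data_config

    def get_data_config(self) -> Dict[str, Any]:
        return self._data_config


def resolve_train_path(context) -> str:
    # Read the path from configuration instead of hardcoding it in the trial.
    data_config = context.get_data_config()
    try:
        return data_config["train_path"]  # hypothetical key name
    except KeyError:
        raise KeyError(
            "expected 'train_path' in the data field of the experiment configuration"
        ) from None


if __name__ == "__main__":
    ctx = FakeContext({"train_path": "/data/train.csv"})
    print(resolve_train_path(ctx))
```

Failing loudly on a missing key makes a misconfigured experiment easy to diagnose, whereas a default path baked into the code would hide the problem.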
We encourage tracking the dependencies associated with every workflow via the environment field. Understanding and standardizing the environment you use to execute Python in your development environment pays dividends in portability, allowing you to move flexibly between local, cloud, and on-premise cluster environments.
Ramp up quickly by using our default environment Docker image, optionally specifying additional PyPI dependencies by using
As your dependencies increase in complexity, invest in building and using a custom Docker image that meets your needs.
Pin Python package dependencies to specific versions (e.g., <package>==<version>) in build tools.
Do not modify the PATH environment variables to import libraries by circumventing the Python packaging system.
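One way to enforce the pinning advice above is a small check that flags requirement lines not written as <package>==<version>. This is a rough sketch, assuming a pip-style requirements list. The function name unpinned_requirements and the regular expression are illustrative and do not cover the full requirement specifier grammar.

```python
import re

# Matches "<package>==<version>", optionally with extras such as pkg[extra].
# Wildcard versions like "1.*" are deliberately not treated as pinned.
PINNED = re.compile(
    r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[A-Za-z0-9,._-]+\])?==[A-Za-z0-9.!+_-]+$"
)


def unpinned_requirements(lines):
    """Return requirement lines that are not pinned with `==`."""
    problems = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            problems.append(line)
    return problems


if __name__ == "__main__":
    sample = ["numpy==1.26.4", "pandas>=2.0", "# a comment", "requests"]
    print(unpinned_requirements(sample))
```

Running a check like this while building a custom image catches loosely specified dependencies before they cause environments to drift apart.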