Best Practices

General Tips for the Trial Definition

Do:

Note

To learn more about distributed training with Determined, visit the conceptual overview or the intro to implementing distributed training.

Do not use instance attributes on a trial class to keep state that must persist over time (e.g., storing metric history in a self attribute). The Trial instance saves and restores only model weights and optimizer state; self attributes may be reset to their initial values at any time if the Determined cluster reschedules the trial to another task container.
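The following is a minimal sketch of why state kept on self is fragile. It is plain Python, not the Determined API: the class names LeakyTrial and StatelessTrial and the train_step method are hypothetical, and rescheduling is modeled, as an assumption, by simply constructing a new instance.

```python
from typing import Dict, List


class LeakyTrial:
    """Anti-pattern (hypothetical class): metric history is kept on self."""

    def __init__(self) -> None:
        self.loss_history: List[float] = []

    def train_step(self, loss: float) -> Dict[str, float]:
        # This list exists only in the memory of the current process.
        self.loss_history.append(loss)
        return {"loss": loss}


class StatelessTrial:
    """Preferred (hypothetical class): report metrics and keep nothing on self."""

    def train_step(self, loss: float) -> Dict[str, float]:
        # The caller receives the metric; no history accumulates on the instance.
        return {"loss": loss}


# Train for two steps in the "first container".
trial = LeakyTrial()
trial.train_step(0.9)
trial.train_step(0.7)

# Assumption: a rescheduled trial starts from a freshly constructed object,
# so everything that was stored on self is back at its initial value.
rescheduled = LeakyTrial()
print(len(trial.loss_history), len(rescheduled.loss_history))
```

In this sketch the history collected before the simulated reschedule is simply gone, which is the failure mode described above. Returning metrics from each step, rather than accumulating them on the instance, avoids depending on state that is not saved.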

Separate Configuration from Code

We encourage a clean separation of code from configuration via the experiment configuration. Specifically, use the predefined fields in the experiment configuration, such as searcher, hyperparameters, optimizations, and resources. This not only lets you reuse the trial definition when you tune different configuration fields but also improves visibility, because those fields can be browsed in our WebUI.

Do:

  • Move any hardcoded scalar values to the hyperparameters or data fields in the experiment configuration. Use context.get_hparam() or context.get_data_config() to reference them in code.

  • Move any hardcoded filesystem paths (e.g., /data/train.csv) to the data field of the experiment configuration. Use context.get_data_config() to reference them in code.

Do not use global variables in your model definition; consider moving them to the experiment configuration.
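As a rough illustration of the tips in this section, the sketch below reads every tunable value through the context instead of hardcoding it. StubContext is a hypothetical stand-in written for this example so that it runs without a cluster; it only mimics the get_hparam() and get_data_config() calls mentioned above. The configuration contents (learning_rate, hidden_size, train_path) are assumed values, written as a Python dictionary purely for illustration.

```python
from typing import Any, Dict


class StubContext:
    """Hypothetical stand-in for the trial context; not Determined's class."""

    def __init__(self, experiment_config: Dict[str, Any]) -> None:
        self._config = experiment_config

    def get_hparam(self, name: str) -> Any:
        # Look up a single value from the hyperparameters field.
        return self._config["hyperparameters"][name]

    def get_data_config(self) -> Dict[str, Any]:
        # Return the whole data field as a dictionary.
        return self._config.get("data", {})


def example_config() -> Dict[str, Any]:
    """Assumed experiment configuration contents, for illustration only."""
    return {
        "hyperparameters": {"learning_rate": 0.001, "hidden_size": 128},
        "data": {"train_path": "/data/train.csv"},
    }


def read_settings(context: StubContext) -> Dict[str, Any]:
    """Collect scalars and paths from the context rather than from constants."""
    return {
        "learning_rate": context.get_hparam("learning_rate"),
        "hidden_size": context.get_hparam("hidden_size"),
        "train_path": context.get_data_config()["train_path"],
    }


settings = read_settings(StubContext(example_config()))
print(settings)
```

Because the model code only asks the context for values, changing the learning rate or pointing at a different dataset becomes a configuration change rather than a code change.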

Understand Dependencies

We encourage tracking the dependencies associated with every workflow via the environment field. Understanding and standardizing the environment you use to execute Python during development pays dividends in portability, allowing you to move flexibly between local, cloud, and on-premises cluster environments.

Do:

  • Ramp up quickly by using our default environment Docker image, optionally specifying additional PyPI dependencies by using pip install in startup-hook.sh.

  • As your dependencies increase in complexity, invest in building and using a custom Docker image that meets your needs.

  • Pin Python package dependencies to specific versions (e.g., <package>==<version>) in build tools.
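One way to keep the pinning tip honest is a small check over your requirement lines. The sketch below is an assumption-laden helper, not part of Determined or pip: find_unpinned is a hypothetical function that flags any line not written as <package>==<version>, and the package names and versions passed to it are examples only. It deliberately treats pip options (such as lines starting with -r) and URL requirements as unpinned.

```python
import re
from typing import List

# Matches "name==version", with an optional extras part such as name[extra]==1.0.
_PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def find_unpinned(requirement_lines: List[str]) -> List[str]:
    """Return the requirement lines that are not pinned to an exact version."""
    unpinned = []
    for raw in requirement_lines:
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank lines and comment-only lines
        if not _PINNED.match(line):
            unpinned.append(line)
    return unpinned


example_lines = [
    "numpy==1.26.4",
    "pandas>=2.0",      # a version range, not an exact pin
    "# a comment line",
    "",
    "requests",         # no version at all
]
print(find_unpinned(example_lines))
```

A check like this could run before you build a custom image or before startup-hook.sh installs packages, so that unpinned dependencies are caught early.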

Do not modify the PYTHONPATH or PATH environment variables to import libraries in a way that circumvents the Python packaging system.