Best Practices for Model Definitions¶
The deep learning software ecosystem is still in its early stages and changing rapidly. Due to the bleeding-edge nature of this field, many code-bases initially optimize for quick iteration and exploration over good software engineering practices. The Determined platform is designed to bring workflows from research to production and our APIs are designed with this goal in mind. This topic guide discusses some of the highly recommended best practices to follow when writing model definitions.
To learn more about the basics of model definitions, see Training: Implement Training APIs.
Separate Configuration from Code¶
Determined encourages a clean separation of code from configuration via the experiment configuration. Structured properties such as resources are stored in the database and are browsable via our WebUI, improving the tracking and collaboration capabilities of the platform.
Move any hardcoded filesystem paths (e.g., /data/train.csv) to the data field of the experiment configuration. Use context.get_data_config() to reference them in code.
Do not use global variables in your model definition; consider moving them to the experiment configuration.
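As a rough illustration of the separation described above, the sketch below reads a dataset path from the data configuration instead of hardcoding it. StubContext is a hypothetical stand-in for the trial context so that the snippet runs without a cluster; only get_data_config() comes from the text above, and the train_path key is an assumed name.

```python
from typing import Any, Dict


class StubContext:
    """Hypothetical stand-in for a trial context (not the Determined class)."""

    def __init__(self, data_config: Dict[str, Any]) -> None:
        self._data_config = data_config

    def get_data_config(self) -> Dict[str, Any]:
        # Mirrors the accessor named in the text; returns the `data` field.
        return self._data_config


def resolve_train_path(context: StubContext) -> str:
    # Look the path up in configuration instead of hardcoding it in code.
    # "train_path" is an assumed key name chosen for this example.
    return context.get_data_config()["train_path"]


# Assumed contents of the `data` field of an experiment configuration.
context = StubContext({"train_path": "/data/train.csv"})
train_path = resolve_train_path(context)
```

With this shape, pointing the model at a different dataset means editing the experiment configuration, not the model definition.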
Determined encourages tracking the dependencies associated with every workflow via the environment field. Understanding and standardizing the environment you use to execute Python will pay dividends in portability, allowing you to move flexibly between local, cloud, and on-premise cluster environments.
Ramp up quickly by using our default environment Docker image, optionally specifying additional PyPI dependencies.
As your dependencies increase in complexity, invest in building and using a custom Docker image that meets your needs.
Pin Python package dependencies to specific versions (e.g., <package>==<version>) in build tools.
Do not use PATH environment variables to import libraries by circumventing the Python packaging system.
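One way to keep the pinning advice honest is a small check over a list of requirement lines. The following is a minimal sketch, not part of Determined: the unpinned_requirements helper and the sample requirements are hypothetical, and the pattern only recognizes the simple <package>==<version> form.

```python
import re
from typing import List

# Matches lines of the form <package>==<version>, optionally with extras.
PINNED = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(lines: List[str]) -> List[str]:
    """Return requirement lines that are not pinned with `==`."""
    result = []
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            result.append(line)
    return result


# Hypothetical requirements used only for illustration.
example = ["numpy==1.26.4", "pandas>=2.0", "requests", "# a comment"]
loose = unpinned_requirements(example)
```

Here loose contains the two entries that would float to whatever version is current at build time, which is the situation the pinning advice is meant to prevent.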
Master the Framework¶
Determined APIs are designed to conform to the best practices of the frameworks that we support. Standardizing on the best practices of these frameworks will result in a smoother Determined user experience.
Stay up to date with the latest framework version.
Follow the best practices and examples laid out by the framework documentation.
Do not use deprecated or unsupported interfaces in application frameworks.
Do not use a custom forked version of the application framework. Determined will only support the official releases of frameworks.
Tips for Determined APIs¶
Use the Training: Debug guide if you run into issues with your Trial class.
Use framework abstractions to implement learning rate scheduling instead of directly changing the learning rate. See tf.keras.optimizers.schedules.LearningRateSchedule for an example.
For code that needs to download artifacts (e.g., data, configurations, pretrained weights), download to a tempfile.TemporaryDirectory unique to the Python process. This will avoid race conditions when using distributed training, in which Determined executes multiple Python processes in the same task container.
Do not use instance attributes on a trial class to save any state over time (e.g., storing metric history in a self attribute). A Trial instance will only save and restore model weights and optimizer state over time; self attributes may be reset to their initial state at any time if the Determined cluster reschedules the trial to another task container.
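The download tip above can be sketched with the standard library alone. In this minimal example, fetch_artifact is a hypothetical placeholder that writes a small file where a real implementation would download data or pretrained weights; the tempfile.TemporaryDirectory usage is the part that matters.

```python
import os
import tempfile


def fetch_artifact(dest_dir: str) -> str:
    """Hypothetical downloader: writes a placeholder file and returns its path.

    A real implementation would fetch data or pretrained weights here.
    """
    path = os.path.join(dest_dir, "artifact.bin")
    with open(path, "wb") as f:
        f.write(b"placeholder")
    return path


def load_artifact() -> bytes:
    # Each call creates a fresh, uniquely named directory, so concurrent
    # processes in the same container never write to the same path.
    # The directory and its contents are removed when the block exits.
    with tempfile.TemporaryDirectory() as download_dir:
        path = fetch_artifact(download_dir)
        with open(path, "rb") as f:
            return f.read()


payload = load_artifact()
```

Because the directory name is generated per call, two training processes running this code side by side each work in their own location, which avoids the race condition the tip describes.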