Best Practices for Model Definitions¶
The deep learning software ecosystem is still in its early stages and changing rapidly. Due to the bleeding-edge nature of this field, many code-bases initially optimize for quick iteration and exploration over good software engineering practices. The Determined platform is designed to bring workflows from research to production and our APIs are designed with this goal in mind. This topic guide discusses some of the highly recommended best practices to follow when writing model definitions.
To learn more about the basics of model definitions, see Model Definitions.
Separate Configuration from Code¶
Determined encourages a clean separation of code from configuration via
the experiment configuration.
Structured properties like the searcher
, hyperparameters
,
optimizations
, and resources
will be stored in the database and
browsable via our WebUI, improving the tracking and
collaboration capabilities of the platform.
Do:
Move any hardcoded scalar values to the
hyperparameters
ordata
field in the experiment configuration. Usecontext.get_hyperparameters()
orcontext.get_data_config()
to reference them in code.Move any hardcoded filesystem paths (e.g.
/data/train.csv
) to thedata
field of the experiment configuration. Usecontext.get_data_config()
to reference them in code.
Do not:
Use global variables in your model definition; consider moving them to the experiment configuration.
Understand Dependencies¶
Determined encourages tracking the dependencies associated with every workflow via the environment field. Understanding and standardizing the environment you use to execute Python in your development environment will pay off dividends in portability, allowing you to flexibly move between local, cloud, and on-premise cluster environments.
Do:
Ramp up quickly by using our default environment Docker image, optionally specifying additional PyPI dependencies by using
pip install
instartup-hook.sh
.As your dependencies increase in complexity, invest in building and using a custom Docker image that meets your needs
Pin Python package dependencies to specific versions (e.g.,
<package>==<version>
) in build tools.
Do not:
Modify the
PYTHONPATH
orPATH
environment variables to import libraries by circumventing the Python packaging system.
Master the Framework¶
Determined APIs are designed to conform to the best practices of the frameworks that we support. Standardizing on the best practices of these frameworks will result in a smoother Determined user experience.
Do:
Stay up to date with the latest framework version.
Follow the best practices and examples laid out by the framework documentation.
Do not:
Use deprecated or unsupported interfaces in application frameworks.
Use a custom forked version of the application framework. Determined will only support the official releases of frameworks.
Tips for Determined APIs¶
Do:
Use framework abstractions to implement learning rate scheduling instead of directly changing the learning rate. See tf.keras.optimizers.schedules.LearningRateSchedule and determined.pytorch.LRScheduler as examples.
For code that needs to download artifacts (e.g. data, configurations, pretrained weights), download to a tempfile.TemporaryDirectory unique to the Python process. This will avoid race conditions when parallel training, in which Determined executes multiple Python processes in the same task container.
Do not:
Use instance attributes on trial class to save any state over time (e.g. storing metric history in a
self
attribute). TheTrial
instance will only save and restore model weights and optimizer state over time;self
attributes may be reset to their initial state at any time if the Determined cluster reschedules the trial to another task container.Use “early-stopping” framework callbacks, such as tf.keras.callbacks.EarlyStopping. The Determined platform is designed to control the lifetime of the trial at the experiment level, not from model definition code.