Perform Simple Metrics Reporting#
In this tutorial, we’ll walk through how to perform simple metrics reporting using detached mode.
For the full script, visit the GitHub repository.
Objectives#
These step-by-step instructions walk you through the following tasks:
Setting up your training environment
Importing and initializing the core context
Setting the master address and executing your training script
Upon completing this user guide, you will:
Grasp the concept and application of detached mode
Successfully report metrics in detached mode
Navigate and visualize trials using the Determined WebUI
Set Up Your Training Environment#
To begin, you’ll need a Determined cluster. If you are new to Determined, you can install the Determined library and start a cluster locally.
Ensure you have Docker running and then run the following command:
pip install determined # If your machine has GPUs: det deploy local cluster-up # If your machine does not have GPUs: det deploy local cluster-up --no-gpu
Note
When deploying locally, the system prompts you to set a strong password.
The command,
pip install determined
, installs thedetermined
library which includes the Determined command-line interface (CLI).
Step 1: Import and Initialize the Core API#
Start by importing the necessary modules for your training code:
import random
from determined.experimental import core_v2
Initialize the core context to recognize the trial with some identifying metadata in the main function:
def main():
core_v2.init(
config=core_v2.Config(
name="detached_mode_example",
),
)
Report your trial and validation metrics:
for i in range(100):
core_v2.train.report_training_metrics(steps_completed=i, metrics={"loss": random.random()})
if (i + 1) % 10 == 0:
loss = random.random()
print(f"validation loss is: {loss}")
core_v2.train.report_validation_metrics(
steps_completed=i, metrics={"loss": loss}
)
core_v2.close()
if __name__ == "__main__":
main()
Step 2: Set Master Address and Execute Your Training Script#
Define the Determined master address:
export DET_MASTER=<DET_MASTER_IP:PORT>
Run your training script:
python3 <my_training_script.py>
Visualize the metrics! Navigate to <DET_MASTER_IP:PORT>
in your web browser to see the
experiment.
Next Steps#
Now that you’ve grasped the essence of simple metrics reporting in detached mode, try more examples using detached mode or learn more about Determined by visiting the tutorials.