Install Determined Using det-deploy¶
This document shows how to deploy Determined locally or in a production
cluster using the
det-deploy command-line tool, which automates the
process of starting Determined as a collection of Docker containers.
In a typical production setup, the master and agents will run on separate machines. They can also run on a single machine, which is especially useful for local development. This guide provides instructions for both scenarios.
Install det-deploy by running:
pip install determined-deploy
Configuring and Starting the Cluster¶
A configuration file is needed to set important values in the master, such as where to save model checkpoints. For information about how to create a configuration file, see Cluster Configuration. There are also sample configuration files available.
det-deploy will use a default configuration file if you don’t provide one.
It also transparently manages PostgreSQL along with
the master, so the configuration options related to PostgreSQL do
not need to be set.
Deploying a Single-Node Cluster¶
For local development or small clusters (such as a GPU workstation), you may wish to install both a master and an agent on the same node. To do this, run one of the following commands:
# If the machine has GPUs:
det-deploy local cluster-up
# If the machine doesn't have GPUs:
det-deploy local cluster-up --no-gpu
This will start a master and an agent on that machine. To verify that the
master is running, navigate to
http://<master-hostname>:8080 in a
browser, which should bring up the Determined WebUI. If you’re using
your local machine, for example, navigate to http://localhost:8080.
In the WebUI, navigate to the
Cluster page, where you should now see
slots available (either CPU or GPU, depending on what hardware is
available on the machine).
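The browser check can also be scripted. The sketch below polls the WebUI address until it answers. The function name `wait_for_master`, the retry defaults, and the use of `curl` are assumptions for illustration and are not part of det-deploy. It also assumes that any successful HTTP response on port 8080 means the master is up.

```shell
#!/bin/sh
# Hypothetical helper: poll the master's WebUI address until it responds.
# Assumptions: curl is installed, and a successful HTTP response from the
# WebUI port is a good enough signal that the master is running.

wait_for_master() {
    url="$1"             # e.g. http://localhost:8080
    attempts="${2:-30}"  # how many times to try
    delay="${3:-2}"      # seconds to wait between tries

    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: treat HTTP errors as failure, -s: silent, -o: discard the body
        if curl -fs -o /dev/null "$url"; then
            echo "master is reachable at $url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done

    echo "master did not respond at $url after $attempts attempts" >&2
    return 1
}

# Example (not run here): wait_for_master "http://localhost:8080" 30 2
```

Calling the function right after `det-deploy local cluster-up` gives a script a clear success or failure exit status instead of relying on a manual look at the WebUI.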
For production deployments, you’ll want to use a cluster
configuration file. To provide this
configuration file to det-deploy, use:
det-deploy local cluster-up --master-config-path <path to master.yaml>
If you want to create more than one agent locally, you can use:
det-deploy local cluster-up --agents <number of agents>
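When the agent count comes from a variable or user input, it may help to validate it before calling det-deploy. The following is a minimal sketch. `cluster_up_with_agents` is a hypothetical wrapper that only prints the command it would run, so it can be tried without starting any containers.

```shell
#!/bin/sh
# Hypothetical wrapper: validate the agent count, then print the command.
# It does not invoke det-deploy itself.

cluster_up_with_agents() {
    count="$1"

    # Reject empty input and anything containing a non-digit character.
    case "$count" in
        ''|*[!0-9]*)
            echo "agent count must be a positive integer, got: '$count'" >&2
            return 1
            ;;
    esac

    # Reject zero.
    if [ "$count" -lt 1 ]; then
        echo "agent count must be at least 1" >&2
        return 1
    fi

    echo "det-deploy local cluster-up --agents $count"
}

# Prints the command for a two-agent local cluster.
cluster_up_with_agents 2
```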
Stopping a Single-Node Cluster¶
To stop a Determined cluster, run the following on the machine where the cluster is currently running:
det-deploy local cluster-down
det-deploy local cluster-down will not remove any agents created with
det-deploy local agent-up. To remove these agents, use
det-deploy local agent-down.
Deploying a Standalone Master¶
In many cases, your Determined cluster will be split across multiple nodes. In this case you will need to start a master and agents separately. In order to start a standalone master, run:
det-deploy local master-up
For production deployments, you’ll want to use a cluster configuration file.
To provide this configuration file to
det-deploy, use the flag
--master-config-path <path to master.yaml>.
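A missing or unreadable configuration file is easy to catch before the master starts. The sketch below checks the path first and then prints the resulting command. `master_up_with_config` is a hypothetical helper name, and the function only echoes the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: confirm the master config is readable, then print
# the master-up command that uses it. It does not invoke det-deploy.

master_up_with_config() {
    config="$1"

    if [ ! -r "$config" ]; then
        echo "cannot read master config: $config" >&2
        return 1
    fi

    # Note: a path containing spaces would need quoting when the printed
    # command is pasted into a shell.
    printf 'det-deploy local master-up --master-config-path %s\n' "$config"
}

# Example (not run here): master_up_with_config ./master.yaml
```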
To stop a running master, run:
det-deploy local master-down
To deploy a standalone agent on a machine, run one of the following commands:
# If the machine has GPUs:
det-deploy local agent-up <master_hostname>
# If the machine doesn't have GPUs:
det-deploy local agent-up --no-gpu <master_hostname>
This will create an agent on that machine. To verify whether it has
successfully connected to the master, navigate to the WebUI and check
whether slots have appeared on the Cluster page.
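When the same script provisions both GPU and CPU-only machines, the choice between the two agent commands can be automated. This is a rough sketch under stated assumptions: `has_gpu` and `agent_up_command` are hypothetical helpers, a working `nvidia-smi` is taken as a proxy for usable GPUs, and `master.example.internal` is a placeholder hostname. The script prints the command instead of running it.

```shell
#!/bin/sh
# Hypothetical helpers for choosing the right agent-up command.

# Assumption: GPUs are usable if nvidia-smi exists and can list devices.
has_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1
}

# Print the agent-up command for a given master and GPU setting.
#   $1: master hostname (required)
#   $2: "yes" if the machine has GPUs, anything else otherwise
agent_up_command() {
    master="$1"
    gpu="$2"

    if [ -z "$master" ]; then
        echo "usage: agent_up_command <master_hostname> <yes|no>" >&2
        return 1
    fi

    if [ "$gpu" = "yes" ]; then
        echo "det-deploy local agent-up $master"
    else
        echo "det-deploy local agent-up --no-gpu $master"
    fi
}

# Detect the hardware on this machine and show the matching command.
if has_gpu; then gpu=yes; else gpu=no; fi
agent_up_command "master.example.internal" "$gpu"
```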
To stop a running agent, run:
det-deploy local agent-down