Install Determined Using det-deploy

This document shows how to deploy Determined locally or in a production cluster using the det-deploy command-line tool, which automates the process of starting Determined as a collection of Docker containers.

In a typical production setup, the master and agents will run on separate machines. They can also run on a single machine, which is especially useful for local development. This guide provides instructions for both scenarios.

Preliminary Setup

Install det-deploy by running:

pip install determined-deploy

Configuring and Starting the Cluster

A configuration file is needed to set important values in the master, such as where to save model checkpoints. For information about how to create a configuration file, see Cluster Configuration. There are also sample configuration files available. The configuration file you create must be named master.yaml (but it can be in any directory).

Note

det-deploy will use a default configuration file if you don’t provide one. det-deploy also transparently manages PostgreSQL and Hasura along with the master, so the configuration options related to those services do not need to be set.

Deploying a Single Node Cluster

For local development or small clusters (such as a GPU workstation), you may wish to install both a master and an agent on the same node. To do this, run:

det-deploy local fixture-up

This will start a master and an agent on that machine. To verify that the master is running, navigate to http://<master-hostname>:8080 in a browser, which should bring up the Determined WebUI. If you’re using your local machine, for example, navigate to http://localhost:8080.

In the WebUI, navigate to the Cluster page, where you should now see slots available (either CPU or GPU, depending on what hardware is available on the machine).
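
When the setup is scripted, the browser check can be approximated with a small polling loop. The sketch below assumes curl is installed and that the master answers HTTP requests on port 8080, as described above. The function names master_url and wait_for_master and the retry defaults are hypothetical and are not part of det-deploy.

```shell
# Build the master URL; host and port default to the values used in this guide.
master_url() {
  printf 'http://%s:%s\n' "${1:-localhost}" "${2:-8080}"
}

# Poll the master until it answers over HTTP or the attempts run out.
# Arguments: host (default localhost), attempts (default 30),
# delay between attempts in seconds (default 2).
wait_for_master() {
  host="${1:-localhost}"
  attempts="${2:-30}"
  delay="${3:-2}"
  url="$(master_url "$host")"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl return non-zero on HTTP errors; -s keeps it quiet.
    if curl -fs -o /dev/null "$url"; then
      echo "master is reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "master did not respond at $url" >&2
  return 1
}
```

For example, calling wait_for_master localhost after det-deploy local fixture-up returns will block until the WebUI port responds. It only confirms that the port answers; slot availability still has to be checked on the Cluster page.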

For production deployments, you’ll want to use a cluster configuration file. To provide this configuration file to det-deploy, use:

det-deploy local fixture-up --etc-root <directory containing master.yaml>
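
As a minimal sketch of this step, the following creates a directory holding a master.yaml and prints the matching command. The directory name and the checkpoint path are hypothetical, and the checkpoint_storage keys shown are an assumption about the schema described in Cluster Configuration; check them against that reference before relying on them.

```shell
# Hypothetical location for the configuration directory; any directory works.
ETC_ROOT="${ETC_ROOT:-$PWD/determined-etc}"
mkdir -p "$ETC_ROOT"

# The file must be named master.yaml. The keys below are an assumed example;
# see Cluster Configuration for the authoritative list of options.
cat > "$ETC_ROOT/master.yaml" <<'EOF'
checkpoint_storage:
  type: shared_fs
  host_path: /tmp/determined-checkpoints
EOF

# Print the command that starts the cluster with this configuration.
echo "det-deploy local fixture-up --etc-root $ETC_ROOT"
```

The script only prepares the file and prints the command, so it can be reviewed before anything is started.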

If you want to create more than one agent locally, you can use:

det-deploy local fixture-up --agents <number of agents>

Stopping a Single Node Cluster

To stop a Determined cluster, run the following on the machine where the cluster is running:

det-deploy local fixture-down

Note

det-deploy local fixture-down will not remove any agents created with det-deploy local agent-up. To remove these agents, use det-deploy local agent-down.
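
Because the two commands cover different containers, a full teardown on one machine may need both. The following is a minimal sketch, assuming det-deploy is on the PATH. The function name stop_local_cluster and the DET_DEPLOY override variable are hypothetical conveniences, and how agent-down behaves when no standalone agent exists is not covered here.

```shell
# Stop the cluster started with fixture-up, then any agent started with agent-up.
# DET_DEPLOY can point at a different executable, for example for a dry run.
stop_local_cluster() {
  det="${DET_DEPLOY:-det-deploy}"
  "$det" local fixture-down
  "$det" local agent-down
}
```

Setting DET_DEPLOY=echo before calling stop_local_cluster prints the two subcommands instead of running them.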

Deploying a Standalone Master

In many cases, your Determined cluster will be split across multiple nodes. In this case you will need to start a master and agents separately. In order to start a standalone master, run:

det-deploy local master-up

Note

For production deployments, you’ll want to use a cluster configuration file. To provide this configuration file to det-deploy, use the flag --etc-root <directory containing master.yaml>.

To stop a running master, run:

det-deploy local master-down

Deploying Agents

To deploy a standalone agent, run the following on the machine where the agent should run:

det-deploy local agent-up <master_hostname>

This will create an agent on that machine. To verify whether it has successfully connected to the master, navigate to the WebUI and check whether slots have appeared on the Cluster page.

To stop a running agent, run:

det-deploy local agent-down