Transport Layer Security#

Transport Layer Security (TLS) is a protocol for secure network communication. TLS prevents the data being transmitted from being modified or read while it is in transit and allows clients to verify the identity of the server (in this case, the Determined master). Determined can be configured to use TLS for all connections made to the master. That means that all CLI and WebUI connections will be secured by TLS, as well as connections from agents and tasks to the master. By default, proxied connections between the master and notebooks use TLS, while those with shells use SSH.

Note

Communication between agents during distributed training does not use TLS, and neither do proxied connections from the master to TensorBoards.

After the master and agent are configured to use TLS, no additional configuration is needed for tasks run in the cluster. In shells and notebooks, the Determined Python libraries automatically make connections to the master using TLS with the appropriate certificate.

Note

This guide shows you one way to configure TLS.

Master Configuration#

To configure the master to use TLS, set the security.tls.cert and security.tls.key options to paths to a TLS certificate file and key file.

When TLS is in use, the master will listen on TCP port 8443 by default, rather than 8080.

Note

If the master’s certificate is not signed by a well-known CA, then the configured certificate file must contain a full certificate chain that goes all the way to a root certificate.
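As a rough sanity check on the note above, you can count how many certificates a PEM file contains; a full chain normally holds more than one. The sketch below is illustrative only: the helper name `count_certs` and the throwaway sample file are assumptions, not part of Determined.

```shell
# Hypothetical helper: count the PEM certificates in a file.
count_certs() {
  grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Demonstration on a throwaway file with two placeholder PEM blocks
# (not real certificates; only the BEGIN markers are counted).
sample="$(mktemp)"
printf '%s\n' \
  '-----BEGIN CERTIFICATE-----' 'placeholder-leaf' '-----END CERTIFICATE-----' \
  '-----BEGIN CERTIFICATE-----' 'placeholder-root' '-----END CERTIFICATE-----' \
  > "$sample"
chain_len="$(count_certs "$sample")"
echo "certificates in file: $chain_len"
rm -f "$sample"
```

Against a real file, call `count_certs` with the path you set in security.tls.cert. The count only shows how many certificates are present; it does not prove that they form a valid chain.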

Agents Configuration#

When the Determined master is using TLS, set the security.tls.enabled agent configuration option to true. If the master’s certificate is signed by a well-known CA, then no other TLS-specific configuration is necessary. Otherwise, for the best security, place the master’s certificate file somewhere accessible to the agent and set the agent’s security.tls.master_cert option to the path to that file. For a more convenient but less secure setup, instead set the security.tls.skip_verify option to true. With the latter configuration, the agent will be unable to verify the identity of the master, but the data sent over the connection will still be protected by TLS.

If the master’s certificate does not contain the address that the agent is using to connect to the master (but is otherwise valid), set the security.tls.master_cert_name option to one of the addresses in the certificate. For example, the master’s certificate may contain a DNS hostname corresponding to the public IP address of the master, while the agent connects to the master using its private IP address to prevent traffic from being routed over the public Internet. In that case, the option should be set to the DNS name contained in the certificate.
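The agent options described above could be combined roughly as follows. This is a sketch under assumptions: the nested YAML layout mirrors the master's security.tls section, and the certificate path and hostname are placeholders. The fragment is written to a scratch file so that it can be reviewed before being merged into the real agent configuration.

```shell
# Write an illustrative agent TLS fragment to a scratch file.
agent_fragment="$(mktemp)"
cat > "$agent_fragment" <<'EOF'
security:
  tls:
    enabled: true
    # Assumed path where the master's certificate was copied on the agent host.
    master_cert: /etc/determined/master-cert.pem
    # Only needed when the agent connects by an address not listed in the certificate.
    master_cert_name: determined.example.com
    # Less secure alternative to master_cert; keep false when master_cert is set.
    skip_verify: false
EOF
cat "$agent_fragment"
```

Omit master_cert and master_cert_name entirely when the master's certificate is signed by a well-known CA and the agent connects using a name listed in the certificate.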

When dynamic agents and TLS are both in use, the dynamic agents that the master creates will automatically be configured to connect securely to the master over TLS.

CLI Configuration#

To use TLS, the CLI must be configured with a master address starting with https:// using either the -m flag or DET_MASTER environment variable.
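For example, the master address might be set like this. The hostname is a placeholder, and port 8443 assumes the master is listening on its default TLS port.

```shell
# Hypothetical master hostname; replace it with your own.
MASTER_HOST="determined.example.com"

# Option 1: set the address once for the current shell session.
export DET_MASTER="https://${MASTER_HOST}:8443"
echo "$DET_MASTER"

# Option 2: pass the address per command (requires the Determined CLI):
# det -m "https://${MASTER_HOST}:8443" experiment list
```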

If the master’s certificate is signed by a well-known CA, the connection proceeds immediately. Otherwise, on the first connection the CLI reports that the master is presenting an untrusted certificate and displays a hash of the certificate. You may wish to confirm the hash with your system administrator. Once you confirm the connection, the certificate is stored on the computer where the CLI is running, and future connections to the master are made without confirmation.

Let’s Encrypt TLS Certificate Setup#

This section describes how to set up a TLS certificate from Let’s Encrypt using Certbot and either the HTTP-01 or the DNS-01 challenge type.

Note

For more information about the challenge types, visit the Let’s Encrypt documentation.

Installing snapd and Certbot#

This section provides information about installing snapd and Certbot and adding EPEL to RHEL 8 or CentOS 8.

For more information about installing snapd and Certbot, see their respective installation documentation.

Adding EPEL to RHEL 8#

To add the EPEL repository to a RHEL 8 system, run the following commands:

sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf upgrade

Adding EPEL to CentOS 8#

To add the EPEL repository to a CentOS Stream 8/9 system, run the following commands:

sudo dnf install epel-release
sudo dnf upgrade

Installing snapd#

To install snapd, run the following commands:

sudo yum install snapd
sudo systemctl enable --now snapd.socket
sudo ln -s /var/lib/snapd/snap /snap

Installing Certbot#

To install Certbot on RHEL or CentOS, run the following command:

sudo snap install --classic certbot

To install Certbot on Debian/Ubuntu, run the following command:

sudo apt-get install certbot

Certbot Certificate Request#

To complete the Certbot certificate request, execute the following steps as the root user:

  • Register a Let’s Encrypt account

  • Perform a certificate request

  • Update the Determined master configuration to use the certificate

The steps are described in detail in the following sections.

Register a Let’s Encrypt Account#

To register an account on Let’s Encrypt, run the following command:

certbot register

Certbot confirms that the account is registered.

To check the account status, run the following command:

certbot show_account

Certbot responds with the account details including the account URL, thumbprint, and email contact.
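For unattended setups, registration can be scripted instead of answering the interactive prompts. The following is a minimal sketch that uses Certbot's --non-interactive, --agree-tos, --no-eff-email, and --email options. The wrapper name and the email address are placeholders.

```shell
# Hypothetical wrapper around "certbot register" for non-interactive use.
register_le_account() {
  # $1: contact email for expiry and account notices
  certbot register --non-interactive --agree-tos --no-eff-email --email "$1"
}

# Example call (run as root, like the other steps in this section):
# register_le_account "admin@example.com"
```

Passing --agree-tos accepts the Let’s Encrypt Subscriber Agreement on your behalf, so read the agreement before scripting this step.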

Perform a Certificate Request#

Certificate Creation#

If port 80 of the Determined master is reachable from Let’s Encrypt, you can use the simpler HTTP-01 challenge type.
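One way to run an HTTP-01 request is Certbot's standalone mode, which starts a temporary web server on port 80 for the duration of the challenge. This is a sketch rather than the only option: the wrapper name and domain are placeholders, and it assumes nothing else is listening on port 80.

```shell
# Hypothetical wrapper: request a certificate with the HTTP-01 challenge
# using Certbot's built-in standalone web server.
request_http01_cert() {
  # $1: fully qualified domain name of the Determined master
  certbot certonly --standalone --preferred-challenges http -d "$1"
}

# Example call (run as root):
# request_http01_cert "determined.example.com"
```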

Certificate Creation When the Determined Master is Behind a VPN#

This section provides information about requesting the Let’s Encrypt certificate in environments that do not provide inbound access from Let’s Encrypt to port 80 of the Determined master (e.g., when the Determined master is behind a VPN).

Request a Certificate Using the DNS-01 Challenge#

Run the following command to request a Let’s Encrypt certificate using the DNS-01 challenge type:

certbot certonly --manual --preferred-challenges dns -d <domain>

Certbot responds with a domain token and lets you know that before continuing you should verify that the TXT record has been deployed:

Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for <domain>

Please deploy a DNS TXT record under the name:

_acme-challenge.<domain>.

with the following value:

<XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX domain token>

Before continuing, verify the TXT record has been deployed. Depending on the DNS
provider, this may take some time, from a few seconds to multiple minutes. You can
check if it has finished deploying with the aid of online tools, such as the Google
Admin Toolbox: https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.<domain>.
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the
value(s) you've just added.

Press Enter to Continue

Caution

Do not press Enter before setting up the DNS record.

Set Up the DNS Record#

In the DNS configuration for the domain the Determined master is using, create a record with the following values:

FQDN                        RECORD TYPE   TTL   Value
_acme-challenge.<domain>.   TXT           900   <XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX domain token>

Verify that the _acme-challenge.<domain>. DNS record has propagated using one of the following:

  • https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.<domain>., or

  • nslookup -type=TXT _acme-challenge.<domain>.
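The propagation check can also be scripted. The sketch below polls with dig, which typically ships in the same packages as nslookup, until the expected token appears. The function names, retry count, and sleep interval are illustrative assumptions.

```shell
# Build the challenge record name for a domain (trailing dot included).
acme_record_name() {
  printf '_acme-challenge.%s.\n' "$1"
}

# Poll until the TXT record carries the expected token, or give up.
# $1: domain, $2: token shown by Certbot, $3: max attempts (default 30)
wait_for_txt() {
  name="$(acme_record_name "$1")"
  attempts="${3:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # dig +short prints TXT values wrapped in double quotes.
    if dig +short TXT "$name" | grep -q -F -- "\"$2\""; then
      echo "TXT record for $name is visible"
      return 0
    fi
    i=$((i + 1))
    sleep 10
  done
  echo "TXT record for $name not visible after $attempts attempts" >&2
  return 1
}

# Example call:
# wait_for_txt "determined.example.com" "<domain token>"
```

The check goes through your local resolver, so a positive result is a good sign but not a guarantee that every DNS server already serves the record.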

Note

You may need to install the nslookup utility.

On CentOS:

yum install bind-utils

On Debian/Ubuntu:

apt install dnsutils

Complete the Certificate Request#

Once you have set up the DNS record, press Enter.

Certbot lets you know it has received the certificate and provides the certificate location, key location, and certificate expiration date.

Update the Determined Master TLS Configuration#

This section describes how to update the Determined master configuration to use the TLS certificate provided by the Let’s Encrypt service.

First, stop the Determined master using the appropriate command. For example, if you installed Determined using Linux packages, run the following command:

systemctl stop determined-master

Then, change the security section of the master configuration file by adding the following lines:

security:
   tls:
      cert: /etc/letsencrypt/live/<domain>/fullchain.pem
      key: /etc/letsencrypt/live/<domain>/privkey.pem

If appropriate, change the master port:

port: 443

Important

If you change the master port, you’ll need to configure the agents to connect to the new port.

Finally, start the Determined master using the appropriate command. For example, if you installed Determined using Linux packages, run the following command:

systemctl start determined-master
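After the master restarts, you may want to confirm which certificate it actually serves. A minimal sketch using openssl follows; the helper name and the example host are placeholders, and the port should be whichever one the master is listening on.

```shell
# Hypothetical helper: print the subject, issuer, and validity dates of the
# certificate served at host:port.
show_served_cert() {
  # $1: hostname, $2: port (443 if changed as above, otherwise 8443)
  openssl s_client -connect "$1:$2" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -subject -issuer -dates
}

# Example call:
# show_served_cert "determined.example.com" 443
```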

Certbot Certificate Renewal#

To renew the certificate, repeat the certificate creation steps, and restart the Determined master using the appropriate command. For example, if you installed Determined using Linux packages, run the following command:

systemctl restart determined-master

Note

Most Certbot installations come with automatic renewal. See “Setting up automated renewals” in the Certbot documentation to find out more. To learn how to test automatic renewal, see the Certbot instructions for your platform (CentOS or Debian/Ubuntu).
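The helpers below sketch how renewal might be checked and combined with a master restart. They assume the default /etc/letsencrypt/live layout and a master installed from Linux packages; the function names are hypothetical.

```shell
# Print the expiry date of the installed certificate.
cert_expiry() {
  # $1: domain used in the certificate request
  openssl x509 -in "/etc/letsencrypt/live/$1/fullchain.pem" -noout -enddate
}

# Simulate a renewal without replacing any certificates.
test_renewal() {
  certbot renew --dry-run
}

# Renew certificates that are due; the deploy hook runs only when a
# certificate was actually renewed, and restarts the master.
renew_and_restart() {
  certbot renew --deploy-hook "systemctl restart determined-master"
}

# Example calls (run as root):
# cert_expiry "determined.example.com"
# test_renewal
```

Certificates requested with --manual and no authentication hook generally cannot be renewed unattended, so for the DNS-01 flow described earlier you may still need to repeat the manual steps.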