Transport Layer Security#
Transport Layer Security (TLS) is a protocol for secure network communication. TLS prevents the data being transmitted from being modified or read while it is in transit and allows clients to verify the identity of the server (in this case, the Determined master). Determined can be configured to use TLS for all connections made to the master. That means that all CLI and WebUI connections will be secured by TLS, as well as connections from agents and tasks to the master. Communication between agents that occurs as part of distributed training will not use TLS, nor will proxied connections from the master to a TensorBoard or notebook instance.
After the master and agent are configured to use TLS, no additional configuration is needed for tasks run in the cluster. In shells and notebooks, the Determined Python libraries automatically make connections to the master using TLS with the appropriate certificate.
To configure the master to use TLS, set the
security.tls.cert and security.tls.key options to the paths of a TLS certificate file and key file, respectively.
When TLS is in use, the master will listen on TCP port 8443 by default, rather than 8080.
If the master’s certificate is not signed by a well-known CA, then the configured certificate file must contain a full certificate chain that goes all the way to a root certificate.
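As a quick sanity check on the certificate file, the following Python sketch counts the PEM certificate blocks it contains. It is only an illustration: the function names are hypothetical and not part of Determined, and counting blocks does not prove that the chain is valid or that it ends at a root certificate. A count of one typically means that only the master's own certificate is present.

```python
import re

# Matches one PEM-encoded certificate block, including its BEGIN/END markers.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_pem_certificates(pem_text: str) -> int:
    """Return the number of certificate blocks found in PEM text."""
    return len(PEM_CERT.findall(pem_text))


def looks_like_chain(pem_text: str) -> bool:
    """Return True if the text holds more than one certificate block."""
    return count_pem_certificates(pem_text) > 1
```

To use it, read the file configured as the master's certificate and pass its contents to count_pem_certificates.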
When the Determined master is using TLS, set the
security.tls.enabled agent configuration option to
true. If the master’s certificate is signed by a well-known
CA, then no other TLS-specific configuration is necessary. Otherwise, for the best security, place
the master’s certificate file somewhere accessible to the agent and set the agent’s
security.tls.master_cert option to the path to that file. For a more convenient but less secure
setup, instead set the
security.tls.skip_verify option to
true. With the latter
configuration, the agent will be unable to verify the identity of the master, but the data sent over
the connection will still be protected by TLS.
If the master’s certificate does not contain the address that the agent is using to connect to the
master (but is otherwise valid), set the
security.tls.master_cert_name option to one of the
addresses in the certificate. For example, the master’s certificate may contain a DNS hostname
corresponding to the public IP address of the master, while the agent connects to the master using
its private IP address to prevent traffic from being routed over the public Internet. In that case,
the option should be set to the DNS name contained in the certificate.
Due to a limitation of Fluent Bit, which Determined uses internally,
the certificate must be valid for at least one hostname that is not an IP address and the
security.tls.master_cert_name option must be set to that hostname if the agent is configured
to connect to the master using an IP address. The hostname does not need to be an actual DNS name
for the master—it is only used for certificate verification.
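The rules above for security.tls.master_cert_name can be summarized in a small Python sketch. The function names are hypothetical and not part of Determined, the list of names in the certificate must be supplied by the caller, and names are compared as plain strings, so wildcard certificates are not handled.

```python
import ipaddress
from typing import Optional, Sequence


def is_ip_address(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def suggest_master_cert_name(connect_address: str,
                             cert_names: Sequence[str]) -> Optional[str]:
    """Suggest a value for security.tls.master_cert_name, or None if unneeded.

    connect_address: the host name or IP address the agent connects to.
    cert_names: the names (DNS names or IP addresses) listed in the certificate.
    """
    hostnames = [name for name in cert_names if not is_ip_address(name)]
    if is_ip_address(connect_address):
        # Connecting by IP: the option must name a non-IP hostname
        # from the certificate (the Fluent Bit limitation described above).
        if not hostnames:
            raise ValueError("certificate has no non-IP hostname")
        return hostnames[0]
    if connect_address.lower() in (name.lower() for name in cert_names):
        # The certificate already covers the address the agent uses.
        return None
    if not cert_names:
        raise ValueError("certificate lists no names")
    # Address not in the certificate: use one of the names it does contain.
    return hostnames[0] if hostnames else cert_names[0]
```

For example, an agent that connects to 10.0.0.5 while the certificate lists only master.example.com would get master.example.com as the suggested value.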
When dynamic agents and TLS are both in use, the dynamic agents that the master creates will automatically be configured to connect securely to the master over TLS.
To use TLS, the CLI must be configured with a master address starting with
https:// using either the
-m flag or the
DET_MASTER environment variable.
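As an illustration, a script that prepares the master address for the CLI could build and check it as follows. This is a minimal Python sketch with hypothetical helper names. It assumes the default TLS port of 8443 mentioned earlier and that the address is written as a URL of the form https://host:port.

```python
# Default master port when TLS is in use, as stated earlier in this document.
DEFAULT_TLS_PORT = 8443


def tls_master_address(host: str, port: int = DEFAULT_TLS_PORT) -> str:
    """Build an https:// master address for the given host and port."""
    return f"https://{host}:{port}"


def uses_tls(master_address: str) -> bool:
    """Return True if the address starts with https:// (case-insensitive)."""
    return master_address.lower().startswith("https://")
```

The resulting string can then be passed to the -m flag or exported as DET_MASTER.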
If the master’s certificate is signed by a well-known CA, then the connection should proceed immediately. If not, the CLI will indicate on the first connection that the master is presenting an untrusted certificate and display a hash of the certificate. You may wish to confirm the hash with your system administrator; in any case, if you confirm the connection to the master, the certificate will be stored on the computer where the CLI is being run and future connections to the master will be made without confirmation.
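An administrator who wants to give users a hash to compare against could compute one from the certificate with a sketch like the one below. This document does not state which hash algorithm or output format the CLI displays, so SHA-256 over the DER-encoded certificate, printed as colon-separated hex, is an assumption here, and the function names are hypothetical.

```python
import hashlib
import ssl


def pem_fingerprint_sha256(pem_cert: str) -> str:
    """Return the SHA-256 digest of a PEM certificate as colon-separated hex."""
    # Strip the PEM armor and decode the base64 body to DER bytes.
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def server_fingerprint_sha256(host: str, port: int = 8443) -> str:
    """Fetch the certificate a server presents, without verifying it, and hash it."""
    pem = ssl.get_server_certificate((host, port))
    return pem_fingerprint_sha256(pem)
```

Because server_fingerprint_sha256 does not verify the certificate, its result is only trustworthy when it is run over a network path that is already trusted, for example on the master itself.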
Let’s Encrypt TLS Certificate Setup#
For more information about the challenge types, visit the Let’s Encrypt documentation.
Installing snapd and Certbot#
This section provides information about installing snapd and Certbot and adding EPEL to RHEL 8 or CentOS 8.
The following websites provide more information about installing snapd and Certbot:
Adding EPEL to RHEL 8#
To add the EPEL repository to a RHEL 8 system, run the following commands:
sudo dnf install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm
sudo dnf upgrade
Adding EPEL to CentOS 8#
To add the EPEL repository to a CentOS Stream 8/9 system, run the following commands:
sudo dnf install epel-release
sudo dnf upgrade
To install snapd, run the following commands:
sudo yum install snapd
sudo systemctl enable --now snapd.socket
sudo ln -s /var/lib/snapd/snap /snap
To install Certbot on RHEL or CentOS, run the following command:
sudo snap install --classic certbot
To install Certbot on Debian/Ubuntu, run the following command:
sudo apt-get install certbot
Certbot Certificate Request#
To complete the Certbot certificate request, execute the following steps as the root user:
Register a Let’s Encrypt account
Perform a certificate request
Update the Determined master configuration to use the certificate
The steps are described in detail in the following sections.
Register a Let’s Encrypt Account#
To register an account on Let’s Encrypt, run the following command:
Certbot responds letting you know the account is registered.
To check the account status, run the following command:
Certbot responds with the account details including the account URL, thumbprint, and email contact.
Perform a Certificate Request#
If Let’s Encrypt can reach port 80 of the Determined master, you can use the simple HTTP-01 challenge type.
Certificate Creation When the Determined Master is Behind a VPN#
This section provides information about requesting the Let’s Encrypt certificate in environments that do not provide inbound access from Let’s Encrypt to port 80 of the Determined master (e.g., when the Determined master is behind a VPN).
Request a Certificate Using the DNS-01 Challenge#
Run the following command to request a Let’s Encrypt certificate using the DNS-01 challenge type:
certbot certonly --manual --preferred-challenges dns -d <domain>
Certbot responds with a domain token and lets you know that before continuing you should verify that the TXT record has been deployed:
Saving debug log to /var/log/letsencrypt/letsencrypt.log
Requesting a certificate for <domain>
Please deploy a DNS TXT record under the name:
_acme-challenge.<domain>.
with the following value:
<XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX domain token>
Before continuing, verify the TXT record has been deployed. Depending on the DNS provider, this may take some time, from a few seconds to multiple minutes. You can check if it has finished deploying with the aid of online tools, such as the Google Admin Toolbox: https://toolbox.googleapps.com/apps/dig/#TXT/_acme-challenge.<domain>.
Look for one or more bolded line(s) below the line ';ANSWER'. It should show the value(s) you've just added.
Press Enter to Continue
Do not press Enter before setting up the DNS record.
Set Up the DNS Record#
In the DNS configuration for the domain the Determined master is using, create a record with the following values:
Type: TXT
Name: _acme-challenge.<domain>.
Value: <XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX domain token>
Verify that the _acme-challenge.<domain>. DNS record has been propagated, for example with the following command:
nslookup -type=TXT _acme-challenge.<domain>.
You may need to install the package that provides nslookup. On RHEL or CentOS, run:
yum install bind-utils
On Debian/Ubuntu, run:
apt install dnsutils
Complete the Certificate Request#
Once you have set up the DNS record, press Enter.
Certbot lets you know it has received the certificate and provides the certificate location, key location, and certificate expiration date.
Update the Determined Master TLS Configuration#
This section describes how to update the Determined master configuration to use the TLS certificate provided by the Let’s Encrypt service.
First, stop the Determined master using the appropriate command. For example, if you installed Determined using Linux packages, run the following command:
systemctl stop determined-master
Then, change the security section of the master configuration file by adding the following lines:
security:
  tls:
    cert: /etc/letsencrypt/live/<domain>/fullchain.pem
    key: /etc/letsencrypt/live/<domain>/privkey.pem
If appropriate, change the master port:
You’ll need to configure the agents to reach this port.
Finally, start the Determined master using the appropriate command. For example, if you installed Determined using Linux packages, run the following command:
systemctl start determined-master
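Before starting the master with a new certificate, it can be useful to confirm that the certificate and key files are readable and belong together. The following Python sketch, with a hypothetical function name, relies on the standard library's ssl module, which refuses to load a certificate chain whose private key does not match. It says nothing about whether the Determined master itself will accept the files, and it must be run as a user that can read both files.

```python
import ssl


def cert_and_key_load(cert_path: str, key_path: str) -> bool:
    """Return True if the certificate chain and private key load as a pair."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    try:
        context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    except OSError:
        # Covers missing or unreadable files as well as ssl.SSLError,
        # which is raised for malformed files or a mismatched key.
        return False
    return True
```

For the configuration above, the two arguments would be the fullchain.pem and privkey.pem paths set under security.tls.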
Certbot Certificate Renewal#
To renew the certificate, repeat the certificate creation steps, and restart the Determined master using the appropriate command. For example, if you installed Determined using Linux packages, run the following command:
systemctl restart determined-master
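To decide when a renewal is due, the certificate's expiration date can be checked programmatically. The sketch below is one possible approach in Python, using hypothetical function names. It assumes that the master listens on port 8443 and presents a certificate that the system trust store accepts, which should be the case for a Let’s Encrypt certificate.

```python
import socket
import ssl


def days_until_expiry(not_after: str, now: float) -> float:
    """Return the days between now (epoch seconds) and a notAfter timestamp.

    not_after uses the format returned by getpeercert(),
    for example 'Jan  5 09:34:43 2018 GMT'.
    """
    return (ssl.cert_time_to_seconds(not_after) - now) / 86400


def fetch_not_after(host: str, port: int = 8443) -> str:
    """Connect to host:port over TLS and return the certificate's notAfter field."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=10) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

Calling days_until_expiry(fetch_not_after("<domain>"), time.time()) after importing time gives the remaining validity in days. A negative result means the certificate has already expired.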