Transport Layer Security
Transport Layer Security (TLS) is a protocol for secure network communication. TLS prevents data from being read or modified while it is in transit and allows clients to verify the identity of the server (in this case, the Determined master). Determined can be configured to use TLS for all connections made to the master: CLI and WebUI connections are secured by TLS, as are connections from agents and tasks to the master. Communication between agents that occurs as part of distributed training does not use TLS, nor do proxied connections from the master to a TensorBoard or notebook instance.
Configuring TLS
Master
To configure the master to use TLS, set the security.tls.cert and security.tls.key options to the paths of a TLS certificate file and key file, respectively.
When TLS is in use, the master will listen on TCP port 8443 by default, rather than 8080.
Note
If the master’s certificate is not signed by a well-known CA, then the configured certificate file must contain a full certificate chain that goes all the way to a root certificate.
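As a quick sanity check on the note above, the following minimal sketch counts the certificates in a certificate file before it is handed to the master. It assumes the file is PEM-encoded, which the text above does not state, and the helper name split_pem_chain is hypothetical. It only splits the bundle into individual certificates; it does not validate signatures or confirm that the last entry is actually a root.

```python
import re
from typing import List

# Matches one PEM-encoded certificate, including its BEGIN/END markers.
PEM_BLOCK = re.compile(
    r"-----BEGIN CERTIFICATE-----.*?-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_chain(pem_text: str) -> List[str]:
    """Return each certificate found in a PEM bundle, in file order."""
    return PEM_BLOCK.findall(pem_text)
```

A file holding only the master's own certificate yields a single entry; a file holding a chain yields one entry per certificate in it.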
Agents
When the Determined master is using TLS, set the security.tls.enabled agent configuration option to true. If the master’s certificate is signed by a well-known CA, then no other TLS-specific configuration is necessary. Otherwise, for the best security, place the master’s certificate file somewhere accessible to the agent and set the agent’s security.tls.master_cert option to the path to that file. For a more convenient but less secure setup, instead set the security.tls.skip_verify option to true. With the latter configuration, the agent will be unable to verify the identity of the master, but the data sent over the connection will still be protected by TLS.
If the master’s certificate does not contain the address that the agent is using to connect to the master (but is otherwise valid), set the security.tls.master_cert_name option to one of the addresses in the certificate. For example, the master’s certificate may contain a DNS hostname corresponding to the public IP address of the master, while the agent connects to the master using its private IP address to prevent traffic from being routed over the public Internet. In that case, the option should be set to the DNS name contained in the certificate.
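The trust modes described above correspond to standard TLS client behavior. As an illustration only, here is a minimal sketch using Python's standard ssl module; it is not how the Determined agent is implemented. The function names build_context and connect are hypothetical, and their parameters simply mirror the agent options master_cert, skip_verify, and master_cert_name.

```python
import socket
import ssl
from typing import Optional


def build_context(master_cert: Optional[str] = None,
                  skip_verify: bool = False) -> ssl.SSLContext:
    """Build a client TLS context for one of the three trust modes."""
    if master_cert is not None and not skip_verify:
        # Trust only the certificate(s) in the given PEM file.
        return ssl.create_default_context(cafile=master_cert)
    # Default: trust the system's well-known CAs and check the hostname.
    ctx = ssl.create_default_context()
    if skip_verify:
        # Still encrypted, but the server's identity is not checked.
        # check_hostname must be disabled before verify_mode is relaxed.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


def connect(address: str, port: int, ctx: ssl.SSLContext,
            cert_name: Optional[str] = None) -> ssl.SSLSocket:
    """Connect to address, verifying the certificate against cert_name."""
    sock = socket.create_connection((address, port))
    # The name checked against the certificate may differ from the address
    # actually dialed, e.g. a private IP versus the DNS name in the cert.
    return ctx.wrap_socket(sock, server_hostname=cert_name or address)
```

In this sketch, dialing a private IP address while passing the certificate's DNS name as cert_name plays the same role as setting security.tls.master_cert_name.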
Note
Due to a limitation of Fluent Bit (https://fluentbit.io), which Determined uses internally, if the agent is configured to connect to the master using an IP address, the certificate must be valid for at least one hostname that is not an IP address, and the security.tls.master_cert_name option must be set to that hostname. The hostname does not need to be an actual DNS name for the master; it is only used for certificate verification.
When dynamic agents and TLS are both in use, the dynamic agents that the master creates will automatically be configured to connect securely to the master over TLS.
CLI
To use TLS, the CLI must be configured with a master address starting with https://, using either the -m flag or the DET_MASTER environment variable.
If the master’s certificate is signed by a well-known CA, then the connection should proceed immediately. If not, the CLI will indicate on the first connection that the master is presenting an untrusted certificate and display a hash of the certificate. You may wish to confirm the hash with your system administrator; in any case, if you confirm the connection to the master, the certificate will be stored on the computer where the CLI is being run and future connections to the master will be made without confirmation.
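To confirm a displayed hash independently, an administrator could compute a fingerprint of the master's certificate directly. The sketch below assumes a SHA-256 digest over the DER-encoded certificate, rendered as hexadecimal. The text above does not say which hash or display format the CLI uses, so the output may not match it character for character. The function name cert_fingerprint is hypothetical.

```python
import hashlib
import ssl


def cert_fingerprint(pem_cert: str) -> str:
    """Return the hex SHA-256 digest of a PEM certificate's DER bytes."""
    # Strip the PEM armor and base64-decode to the raw DER encoding.
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    return hashlib.sha256(der).hexdigest()
```

The PEM text can be read from the certificate file on the master, or fetched over the network with ssl.get_server_certificate((host, port)), which retrieves the certificate the server presents without validating it.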
Tasks
Once the master and agent are configured to use TLS, no further configuration is required for tasks that are run in the cluster. In shells and notebooks, the Determined Python libraries will automatically make connections to the master using TLS with the appropriate certificate.