PEDL makes it easy to launch a TensorBoard server for an experiment or a list of trials. TensorBoard provides a UI for visualizing and comparing the metrics of PEDL trials.
Under the hood, PEDL will schedule a TensorBoard server in a containerized environment on the cluster and proxy HTTP requests to and from the TensorBoard container through the PEDL master. The lifecycle management of TensorBoard in PEDL is left up to the user -- once a new TensorBoard has been scheduled onto the cluster, it will remain scheduled indefinitely until the user explicitly shuts down the TensorBoard container (see the end of this page).
A maximum of 100 trials may be loaded per TensorBoard instance. When TensorBoard is started for an experiment that contains more than 100 trials, only the 100 best-performing trials from that experiment are loaded; all others are ignored.
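The 100-trial cap can be pictured as a top-N selection. The sketch below is illustrative only and is not PEDL's implementation: the Trial record, its validation_metric field, and the default assumption that a smaller metric value is better are all hypothetical.

```python
import heapq
from dataclasses import dataclass
from typing import List

MAX_TRIALS = 100  # per-instance limit stated in the text above


@dataclass
class Trial:
    trial_id: int
    validation_metric: float  # hypothetical field used to rank trials


def select_trials(trials: List[Trial],
                  limit: int = MAX_TRIALS,
                  smaller_is_better: bool = True) -> List[Trial]:
    """Return at most `limit` trials, keeping the best-ranked ones."""
    if len(trials) <= limit:
        # Under the cap: every trial is loaded, in its original order.
        return list(trials)
    # Over the cap: keep only the `limit` best trials, best first.
    pick = heapq.nsmallest if smaller_is_better else heapq.nlargest
    return pick(limit, trials, key=lambda t: t.validation_metric)
```

If the experiment's metric is one where larger is better (accuracy, for example), pass smaller_is_better=False.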
To launch a TensorBoard server, you'll first need to install the PEDL command line interface on a development machine.
Once the PEDL CLI is installed, try launching your first TensorBoard with the pedl tensorboard start command, specifying either an experiment ID or a list of trial IDs:
$ pedl tensorboard start <experiment_id>  # loads all trial metrics for an experiment
# or
$ pedl tensorboard start --trial-ids <list of trial IDs>
$ pedl tensorboard start 3
Scheduling tensorboard tensorboard (id: c68c9fc9-7eed-475b-a50f-fd78406d7c83)...
[PEDL] 2019-05-16T21:49:27.865741800Z Writing trial metrics to TensorBoard
[PEDL] 2019-05-16T21:49:27.890968200Z [== ] 4%
[PEDL] 2019-05-16T21:49:27.945083700Z [======= ] 13%
[PEDL] 2019-05-16T21:49:27.998334900Z [=========== ] 22%
[PEDL] 2019-05-16T21:49:28.058950400Z [=============== ] 30%
[PEDL] 2019-05-16T21:49:28.134723500Z [==================== ] 39%
[PEDL] 2019-05-16T21:49:28.213112800Z [======================== ] 48%
[PEDL] 2019-05-16T21:49:28.299976800Z [============================ ] 57%
[PEDL] 2019-05-16T21:49:28.355130900Z [================================= ] 65%
[PEDL] 2019-05-16T21:49:28.431002900Z [===================================== ] 74%
[PEDL] 2019-05-16T21:49:28.477746400Z [========================================= ] 83%
[PEDL] 2019-05-16T21:49:28.522585700Z [============================================== ] 91%
[PEDL] 2019-05-16T21:49:28.596533500Z [==================================================] 100%
[PEDL] 2019-05-16T21:49:28.596585200Z TensorBoard log writing complete
disconnecting websocket
Tensorboard is running at: http://localhost:8080/proxy/c68c9fc9-7eed-475b-a50f-fd78406d7c83-tensorboard-0/
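When the start command is driven from a script, the TensorBoard ID and proxied URL can be recovered from the captured output. The following is a minimal sketch that assumes the output keeps the wording shown above ("id: <uuid>" and "running at: <url>"); parse_start_output is a hypothetical helper, not part of the PEDL CLI, and the message format may differ between PEDL versions.

```python
import re
from typing import Optional, Tuple

# Patterns modelled on the sample output above; the exact wording is an assumption.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)
URL_RE = re.compile(r"running at:\s*(\S+)", re.IGNORECASE)


def parse_start_output(output: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (tensorboard_id, url) found in the CLI output, or None for each part not found."""
    id_match = UUID_RE.search(output)    # first UUID is the scheduled TensorBoard's ID
    url_match = URL_RE.search(output)    # URL proxied through the PEDL master
    return (
        id_match.group(0) if id_match else None,
        url_match.group(1) if url_match else None,
    )
```

The returned ID is the same value that pedl tensorboard kill expects.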
After TensorBoard has been scheduled onto the cluster, the PEDL CLI will open a web browser window pointed at that TensorBoard URL. Back in the terminal, you can use the pedl tensorboard list command to confirm that this TensorBoard server is among those currently in the RUNNING state on the PEDL cluster:
$ pedl tensorboard list
 Id                                   | Entrypoint                | Registered Time              | State
--------------------------------------+---------------------------+------------------------------+---------
 c68c9fc9-7eed-475b-a50f-fd78406d7c83 | ['/tensorboard-entry.sh'] | 2019-05-07T21:18:15.3710522Z | RUNNING
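Because TensorBoard servers stay scheduled until they are killed, it can be handy to pick out the running ones programmatically. The sketch below assumes the list output is the pipe-delimited table shown above, one row per line; parse_tensorboard_list and running_ids are hypothetical helpers, and the column names are taken from the sample output rather than from a documented format.

```python
from typing import Dict, List


def parse_tensorboard_list(output: str) -> List[Dict[str, str]]:
    """Turn the pipe-delimited table into a list of {column name: value} rows."""
    header: List[str] = []
    rows: List[Dict[str, str]] = []
    for line in output.splitlines():
        if "|" not in line:
            # Skips the echoed command and the ----+---- separator line.
            continue
        cells = [cell.strip() for cell in line.split("|")]
        if not header:
            header = cells  # first table line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def running_ids(output: str) -> List[str]:
    """IDs of all TensorBoard servers whose State column reads RUNNING."""
    return [row["Id"] for row in parse_tensorboard_list(output)
            if row.get("State") == "RUNNING"]
```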
Since the lifecycle management of the TensorBoard server in PEDL is left up to the user, this server will remain running until it is explicitly shut down. To shut down the server, use the pedl tensorboard kill command:
$ pedl tensorboard kill c68c9fc9-7eed-475b-a50f-fd78406d7c83
Killed tensorboard c68c9fc9-7eed-475b-a50f-fd78406d7c83
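Cleanup of several servers can be scripted around the same command. This is a rough sketch, not an official PEDL utility: kill_command and kill_tensorboards are hypothetical helpers, the pedl executable is assumed to be on the PATH, and check=True assumes the CLI exits with a non-zero status when a kill fails.

```python
import subprocess
from typing import Iterable, List


def kill_command(tensorboard_id: str) -> List[str]:
    """Argument list for shutting down one TensorBoard server."""
    return ["pedl", "tensorboard", "kill", tensorboard_id]


def kill_tensorboards(tensorboard_ids: Iterable[str],
                      dry_run: bool = True) -> List[List[str]]:
    """Build (and, unless dry_run is set, execute) one kill command per ID."""
    commands = [kill_command(tb_id) for tb_id in tensorboard_ids]
    if not dry_run:
        for command in commands:
            # Assumption: a failed kill yields a non-zero exit code,
            # which check=True turns into CalledProcessError.
            subprocess.run(command, check=True)
    return commands
```

The dry-run default returns the commands without running them, so the list can be reviewed before any server is shut down.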