The master allocates cluster resources (slots) among the active experiments using a fair-share scheduling policy. In other words, slots are divided among the active experiments according to the demand (number of desired concurrent tasks) of each experiment. For instance, in an eight-GPU cluster running two experiments with demands of three and six, the scheduler assigns three slots and five slots respectively. As new experiments become active or the resource demand of an active experiment changes, the scheduler will adjust how slots are allocated to experiments as appropriate.
Scheduling behavior can be configured via the
resources section of
the experiment config file; see Experiment Configuration for