TensorFlow Model Definition

TensorFlowTrial Interface

The TensorFlowTrial interface provides fine-grained control over data loading, model construction, and computation flow; it is the interface that most closely supports low-level TensorFlow models. If you are looking for a higher-level TensorFlow interface, see the TFKerasTrial and EstimatorTrial interfaces.

There are two steps needed to define a TensorFlow model in PEDL using TensorFlowTrial.

  1. Define a make_data_loaders() function to specify data access and any preprocessing in the data pipeline.

  2. Subclass the abstract class TensorFlowTrial. This part of the interface defines the deep learning model, including the graph, loss, and optimizers.

Data Loading via make_data_loaders()

A PEDL user specifies data access in TensorFlowTrial by writing a make_data_loaders() function. This function should return a pair of batched tf.data.Dataset objects: the first for the training set and the second for the validation set.

def make_data_loaders(experiment_config, hparams):
    ...
    return trainDataset, valDataset

In TensorFlowTrial, users configure input pipeline optimizations for their training and validation Datasets within the make_data_loaders function, before returning these Datasets. See the TensorFlow Data Input Pipeline Performance guide for more details on the capabilities of Datasets. For example, to specify batching and prefetching for Datasets generated by passing TFRecords through a map function foo:

def make_data_loaders(experiment_config, hparams):
    trainDataset = tf.data.Dataset.list_files("/path/train-*.tfrecord")
    trainDataset = trainDataset.map(map_func=foo)
    # PEDL requires that make_data_loaders() return batched Datasets.
    trainDataset = trainDataset.batch(batch_size=BATCH_SIZE)
    trainDataset = trainDataset.prefetch(tf.data.experimental.AUTOTUNE)

    valDataset = tf.data.Dataset.list_files("/path/val-*.tfrecord")
    valDataset = valDataset.map(map_func=foo)
    # PEDL requires that make_data_loaders() return batched Datasets.
    valDataset = valDataset.batch(batch_size=BATCH_SIZE)
    valDataset = valDataset.prefetch(tf.data.experimental.AUTOTUNE)

    return trainDataset, valDataset

Record format: PEDL does not restrict the format of the records in the training and validation Datasets returned by make_data_loaders. However, the two Datasets should have the same output_classes, output_shapes, and output_types properties so that they can feed the same graph. When defining the TensorFlow graph in their model code (via the build_graph(record, is_training) method explained below), the parameter record represents the output of the Dataset.

Passing fields from the experiment configuration file: The make_data_loaders function takes two arguments, experiment_config and hparams. We recommend users pass in metadata regarding the data pipeline through the data field in the experiment configuration file; its subfields can be accessed as experiment_config["data"].get("field_name"). Subfield names are up to the user to define. The second argument hparams gives the user access to this trial’s sample of the hyperparameters in the experiment configuration file. As an example, if we add fields in the experiment configuration file as follows:

data:
    path: /my_data_path/
hyperparameters:
    batch_size:
      type: categorical
      vals: [8, 16]

then we could update the above example to be:

def make_data_loaders(experiment_config, hparams):
    # data_path will evaluate to "/my_data_path/"
    data_path = experiment_config["data"]["path"]
    # batch_size will evaluate to either 8 or 16, depending on this trial's
    # sample of hyperparameters.
    batch_size = hparams["batch_size"]

    trainDataset = tf.data.Dataset.list_files(data_path + "train-*.tfrecord")
    trainDataset = trainDataset.map(map_func=foo)
    # PEDL requires that make_data_loaders() return batched Datasets.
    trainDataset = trainDataset.batch(batch_size)
    trainDataset = trainDataset.prefetch(tf.data.experimental.AUTOTUNE)

    valDataset = tf.data.Dataset.list_files(data_path + "val-*.tfrecord")
    valDataset = valDataset.map(map_func=foo)
    # PEDL requires that make_data_loaders() return batched Datasets.
    valDataset = valDataset.batch(batch_size)
    valDataset = valDataset.prefetch(tf.data.experimental.AUTOTUNE)

    return trainDataset, valDataset

Subclassing TensorFlowTrial

class pedl.frameworks.tensorflow.tensorflow_trial.TensorFlowTrial(hparams: Dict[str, Any], *_: Any, **__: Any)

TensorFlow trials are created by subclassing the abstract class TensorFlowTrial. Users must define all abstract methods to create the deep learning model associated with a specific trial, and to subsequently train and evaluate it.

abstract build_graph(record: Any, is_training: tensorflow.python.framework.ops.Tensor) → Dict[str, tensorflow.python.framework.ops.Tensor]

Builds the TensorFlow graph to be used for training and validation.

Parameters
  • record – nested structure representing the symbolic output of the appropriate Dataset (the training Dataset during training or the validation Dataset during validation). Typically, record is a list or dictionary of tf.Tensors. Users should use the tf.Tensors in record as the inputs to their computational graph (see the CIFAR10 and MNIST examples).

  • is_training – boolean Tensor that is True during training and False during validation at graph runtime.

Returns

Dictionary mapping string names to tf.Tensor output nodes. These output tensors will be referenced by name in validation_metrics() and training_metrics().

abstract optimizer() → tensorflow.python.training.optimizer.Optimizer

Builds the optimizer to use for training.

Returns

tf.train.Optimizer to be used for training.

abstract validation_metrics() → List[str]

Specifies names of the validation metrics.

Returns

List of metric string names that will be evaluated on the validation data. Each of these names must correspond to a tf.Tensor value returned by build_graph().

training_metrics() → List[str]

Specifies names of the training metrics.

Returns

List of metric names that will be evaluated on the training data, in addition to the training loss (the tf.Tensor named loss returned by build_graph()). Each of these names must correspond to a tf.Tensor value returned by build_graph().

session_config() → tensorflow.core.protobuf.config_pb2.ConfigProto

Specifies the tf.ConfigProto to be used by the TensorFlow session. By default, tf.ConfigProto(allow_soft_placement=True) is used.

PEDL: Graphs, Sessions, and Control Flow

In part to support its scheduling capabilities, PEDL handles creating sessions and sess.run() calls for users. PEDL will also call the user’s implementation of build_graph() to build the TensorFlow graph for training. The user’s implementation of build_graph() might not return the same graph in each trial of the experiment; build_graph() is responsible for incorporating each trial’s sample of hyperparameters into the graph. PEDL initializes the session, calls build_graph(), and proceeds with training and validation of the graph.

  • Initialization: The session is initialized using session_config().

  • Training steps: Records of the training Dataset specified by the user in make_data_loaders() are fed into build_graph(), with is_training set to True. For each training batch, PEDL will make a sess.run(t_metrics) call, where t_metrics is the union of the loss (the Tensor named "loss" in the output of build_graph()) and training metrics (the Tensors in the output of build_graph() named by training_metrics()). Metrics for each step are computed by averaging over batches. In one training step, batches_per_step batches are fed through the graph. (batches_per_step defaults to 100; a custom value may be set via the experiment configuration file).

  • Validation steps: Records of the validation Dataset in make_data_loaders() are fed into build_graph(), with is_training set to False. For each validation batch, PEDL will make a sess.run(v_metrics) call, where v_metrics contains the Tensors in the output of build_graph() named by validation_metrics(). Metrics for each step are computed by averaging over batches. (Support for tf.metrics and other metric aggregation methods for validation is forthcoming.) In one validation step, the entire validation set is fed through the graph once.

Note: The same graph output by one call to build_graph() is used for both training and validation. To distinguish between these cases (e.g., to use dropout at training time but not at validation time), build_graph takes a parameter is_training, which is a Boolean tf.Tensor.

Examples for TensorFlowTrial