A model contains the information necessary to recreate the prediction function. A model's state can additionally capture information about the point in time at which the model was saved, such as the epoch, batch number, and learning rate.
import tensorflow as tf

checkpoint_path = '{}/checkpoints/taxi'.format(OUTDIR)

# Save the full model (architecture, weights, and optimizer state),
# not just the weights, at the end of every epoch.
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path,
    save_weights_only=False,
    verbose=1)

history = model.fit(x_train, y_train,
                    batch_size=64,
                    epochs=3,
                    validation_data=(x_val, y_val),
                    verbose=2,
                    callbacks=[cp_callback])
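The callback above saves the whole model once per epoch. To track training state more explicitly, such as a step counter and the optimizer's variables, tf.train.Checkpoint can be used. A minimal sketch; the model, optimizer, and directory here are illustrative assumptions, not part of the original example:

import tensorflow as tf

# Illustrative model and optimizer; substitute your own.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.Adam()

# Track a step counter and the optimizer's state (e.g. momenta,
# learning-rate schedule position) alongside the model weights.
ckpt = tf.train.Checkpoint(step=tf.Variable(0, dtype=tf.int64),
                           optimizer=optimizer,
                           model=model)
manager = tf.train.CheckpointManager(ckpt, './tf_ckpts', max_to_keep=3)

manager.save()                           # write a checkpoint
ckpt.restore(manager.latest_checkpoint)  # resume from the latest one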
Early stopping at the end of every batch, rather than only at the end of every epoch, can be beneficial too.
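Keras's built-in EarlyStopping callback evaluates its criterion only at epoch end, so batch-level stopping needs a small custom callback. A minimal sketch, assuming a recent TF 2.x runtime (where fit() checks stop_training after each batch) and a hypothetical loss threshold:

import tensorflow as tf

class BatchEarlyStopping(tf.keras.callbacks.Callback):
    """Stops training as soon as the batch loss drops below a threshold."""

    def __init__(self, loss_threshold=0.05):  # threshold is illustrative
        super().__init__()
        self.loss_threshold = loss_threshold

    def on_train_batch_end(self, batch, logs=None):
        logs = logs or {}
        if logs.get('loss', float('inf')) < self.loss_threshold:
            # fit() inspects this flag after every batch and halts training.
            self.model.stop_training = True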
Epoch: A full training pass over the entire dataset such that each example has been seen once.
1 epoch = (N / batch_size) iterations or steps
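For example, with N = 150,000,000 examples and a batch size of 100 (the values used below), one epoch is 150,000,000 / 100 = 1,500,000 steps.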
steps_per_epoch defaults to N / batch_size. However, if we want more granular control, we can set steps_per_epoch explicitly to place checkpoints where we want them. The idea is to hold the batch size and the number of checkpoints constant, and derive steps_per_epoch from the dataset size and a stop point. Here, 14.3 is considered to be a stop point (analogous to an epoch count): it says how many passes over the data training should make in total. With the values below, steps_per_epoch works out to (150,000,000 / 100) * (14.3 / 15) = 1,430,000 steps between checkpoints. As the dataset grows, we update the number of training examples and the stop point, keeping the batch size and number of checkpoints constant, so that steps_per_epoch remains the same.

NUM_TRAINING_EXAMPLES = 150_000_000  # digit separators: 150,000,000 would be a tuple in Python
STOP_POINT = 14.3                    # total passes over the data (fractional "epochs")
BATCH_SIZE = 100
NUM_CHECKPOINTS = 15

# Steps between checkpoints; fit() expects an integer.
steps_per_epoch = int((NUM_TRAINING_EXAMPLES / BATCH_SIZE) *
                      (STOP_POINT / NUM_CHECKPOINTS))  # = 1,430,000
# Save a full-model checkpoint at the end of each checkpoint interval.
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path,
    save_weights_only=False,
    verbose=1)

# trainds and evalds are batched tf.data datasets, so batch_size
# must not also be passed to fit().
history = model.fit(trainds,
                    epochs=NUM_CHECKPOINTS,
                    validation_data=evalds,
                    steps_per_epoch=steps_per_epoch,
                    verbose=2,
                    callbacks=[cp_callback])
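Because save_weights_only=False, each checkpoint is a full Keras model, so training can be resumed from the latest one. A minimal sketch, assuming the checkpoint was written in a format tf.keras.models.load_model can read:

import tensorflow as tf

# Reload architecture, weights, and (for full saves) optimizer state.
model = tf.keras.models.load_model(checkpoint_path)

# Continue training from where the checkpoint left off.
history = model.fit(trainds,
                    epochs=NUM_CHECKPOINTS,
                    validation_data=evalds,
                    steps_per_epoch=steps_per_epoch,
                    callbacks=[cp_callback])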