I'm training a model and using TensorFlow callbacks to save my training logs, plus a ModelCheckpoint callback to save my model's weights. During training, every epoch prints "WARNING:tensorflow: Can save best model only with val_acc available, skipping". This is issue 1.
Here is the code I pass in callbacks=[] during model.fit:
import datetime
import tensorflow as tf

def create_tensorboard_callback(dir_name, experiment_name):
    """
    Creates a TensorBoard callback instance to store log files.
    Stores log files with the filepath:
      "dir_name/experiment_name/current_datetime/"
    Args:
      dir_name: target directory to store TensorBoard log files
      experiment_name: name of experiment directory (e.g. efficientnet_model_1)
    """
    log_dir = dir_name + "/" + experiment_name + "/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    tensorboard_callback = tf.keras.callbacks.TensorBoard(
        log_dir=log_dir
    )
    print(f"Saving TensorBoard log files to: {log_dir}")
    return tensorboard_callback

# Create ModelCheckpoint callback to save model's progress
checkpoint_path = "model_checkpoints/cp.ckpt"
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
                                                      monitor="val_acc",
                                                      save_best_only=True,  # SAVING BEST ONLY
                                                      save_weights_only=True,
                                                      verbose=0)
Code for fitting the model with callbacks:
history_101_food_classes_feature_extract = model.fit(train_data,
                                                     epochs=3,
                                                     steps_per_epoch=len(train_data),
                                                     validation_data=test_data,
                                                     validation_steps=int(0.15 * len(test_data)),
                                                     callbacks=[create_tensorboard_callback("training_logs",
                                                                                            "efficientnetb0_101_classes_all_data_feature_extract"),
                                                                model_checkpoint])
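To make issue 1 concrete: as far as I understand, the warning is printed when the name passed to monitor is not among the metrics Keras logs at the end of an epoch. Below is a minimal pure-Python sketch of that lookup. It is a simplified illustration, not the Keras source. The helper check_monitor and the epoch_logs dictionary are hypothetical, and the key names assume the model was compiled with metrics=["accuracy"] (the compile call is not shown above).

```python
# Simplified illustration of the "save best only" lookup; NOT the Keras source.
def check_monitor(logs, monitor):
    """Return the monitored value, or None (with a warning) if the key is absent."""
    current = logs.get(monitor)
    if current is None:
        print(f"WARNING: Can save best model only with {monitor} available, skipping.")
    return current


# Hypothetical end-of-epoch logs, ASSUMING the model was compiled with
# metrics=["accuracy"]. The numbers are made up for illustration only.
epoch_logs = {"loss": 1.2, "accuracy": 0.65, "val_loss": 1.1, "val_accuracy": 0.70}

missing = check_monitor(epoch_logs, "val_acc")       # key absent: warns, returns None
found = check_monitor(epoch_logs, "val_accuracy")    # key present: returns the value

print("Logged metric names:", sorted(epoch_logs))
```

On the real model, the keys of history_101_food_classes_feature_extract.history should show which names are actually logged, and monitor would need to match one of them exactly.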
I also cloned my model and called cloned_model.load_weights(checkpoint_path), then evaluated both the original and the cloned model with model.evaluate(test_data).
The original model scores 70+% accuracy, while cloned_model always returns the exact same accuracy (0.54). This is issue 2.
My guess was that I had previously trained and saved a very high accuracy model, hence issue 1, where it refuses to save at every epoch. But my checkpoint path looks clean to me.
And if I had previously saved a high-accuracy model to my checkpoint_path, why would a cloned model that loads its weights from that path give 0.54 accuracy every time and not something higher? (Issue 2)
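One way to check what is actually stored under checkpoint_path is to list the files that share its prefix, together with their last-modified times. This is a rough sketch using only the standard library. The helper list_checkpoint_files is hypothetical, and it assumes weights are written in the TensorFlow checkpoint format, where the path acts as a filename prefix (for example cp.ckpt.index) rather than a single file.

```python
import glob
import os
import time


def list_checkpoint_files(checkpoint_path):
    """Return (filename, last-modified time) pairs for files starting with the prefix."""
    # glob.escape guards against special characters in the directory name.
    matches = sorted(glob.glob(glob.escape(checkpoint_path) + "*"))
    return [(os.path.basename(p), time.ctime(os.path.getmtime(p))) for p in matches]


checkpoint_path = "model_checkpoints/cp.ckpt"
files = list_checkpoint_files(checkpoint_path)

if not files:
    print(f"No files found for prefix {checkpoint_path!r}")
for name, modified in files:
    print(f"{name}  (last modified {modified})")
```

If the timestamps are older than the current training run, the weights being loaded would come from an earlier run, not from this one.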
I need help. Let me know if you need more info from my side to solve this issue, happy to answer. Thanks. If you want to see the full code, here's the link to it.