I am feeding cnn features into gpflow model. I am writing the chunks of code from my program here. I am using tape.gradient with Adam optimizer (scheduled lr). My accuracy gets stuck on 47% and surprisingly , my loss still gets reducing. Its very weird. I have debugged the program. CNN features are ok but gp model is not learning .Please can you check the training loop and let me know where am I wrong.
def optimization_step(gp_model: gpflow.models.SVGP, image_data,labels):
with tf.GradientTape(watch_accessed_variables=False)as tape:
tape.watch(gp_model.trainable_variables)
cnn_feat = cnn_model(image_data,training=False)
cnn_feat=tf.cast(cnn_feat,dtype=default_float())
labels=tf.cast(labels,dtype=np.int64)
data=(cnn_feat, labels)
loss = gp_model.training_loss(data)
gp_grads=tape.gradient(loss, gp_model.trainable_variables)
gp_optimizer.apply_gradients(zip(gp_grads, gp_model.trainable_variables))
return loss, cnn_feat
the loop for training is
def simple_training_loop(gp_model: gpflow.models.SVGP, epochs: int = 3, logging_epoch_freq: int = 10):
total_loss = []
features=[]
tf_optimization_step = tf.function(optimization_step, autograph=False)
for epoch in range(epochs):
lr.assign(max(args.learning_rate_clip, args.learning_rate * (args.decay_rate ** epoch)))
data_loader.shuffle_data(args.is_training)
for b in range(data_loader.n_batches):
batch_x, batch_y= data_loader.next_batch(b)
batch_x=tf.convert_to_tensor(batch_x)
batch_y=tf.convert_to_tensor(batch_y)
loss,features_CNN=tf_optimization_step(gp_model, batch_x,batch_y)
I am restoring weights for CNN from checkpoints saved during transfer learning.
With more epochs , loss continue to decrease but accuracy starts decreasing as well.
The gp model declaration is as follows
kernel = gpflow.kernels.Matern32() + gpflow.kernels.White(variance=0.01)
invlink = gpflow.likelihoods.RobustMax(C)
likelihood = gpflow.likelihoods.MultiClass(C, invlink=invlink)
the test Function
cnn_feat=cnn_model(test_x,training=False)
cnn_feat = tf.cast(cnn_feat, dtype=default_float())
mean, var = gp_model.predict_f(cnn_feat)
preds = np.argmax(mean, 1).reshape(test_labels.shape)
correct = (preds == test_labels.numpy().astype(int))
acc = np.average(correct.astype(float)) * 100