I have trained a sequential model in Keras, with sparse vectors as inputs (padded_inputs_multil for training and padded_inputs_tr for testing) and dense vectors as outputs (target_multil_array for training and target_tr_r_array for testing):
model_mul=keras.models.Sequential()
model_mul.add(keras.layers.LSTM(units=172, batch_input_shape=(None, 29, 22), dropout=0.2, recurrent_dropout=0.2, return_sequences=False))
model_mul.add(keras.layers.Dense(300, activation='tanh'))
model_mul.compile(loss='cosine_similarity', optimizer='adam', metrics=[tf.keras.metrics.CosineSimilarity(axis=1)])
model_mul.summary()
history_mul=model_mul.fit(padded_inputs_multil, target_multil_array, epochs=1, validation_data=(padded_inputs_tr, target_tr_r_array))
I get a validation cosine similarity of 0.4607, as shown in the following output:
Train on 794870 samples, validate on 199108 samples
Epoch 1/1
794870/794870 [==============================] - 2694s 3ms/step - loss: -0.4678 - cosine_similarity: 0.4522 - val_loss: -0.4152 - val_cosine_similarity: 0.4607
However, when I evaluate the model on the same test data, I get a lower cosine similarity (0.4468), even though the loss matches val_loss:
results_mul = model_mul.evaluate(padded_inputs_tr, target_tr_r_array)
print(results_mul)
[-0.4152175833690755, 0.44675499200820923]
The bigger problem: if I compute the predicted vectors and compare them with the target vectors myself, I get a mean cosine similarity that is lower still (slightly above 0.40). I can't understand why, since the TensorFlow documentation says that CosineSimilarity keeps the average cosine similarity between predictions and labels.
import statistics
import numpy as np
import pandas as pd

prediction_mul = model_mul.predict(padded_inputs_tr)
column_names = ['prediction_multil', 'target_multil', 'cos_pred_target']
df = pd.DataFrame(columns=column_names)
df['prediction_multil'] = [vec for vec in prediction_mul]
df['target_multil'] = [vec for vec in target_tr_r_array]

def cos_sim(a, b):
    # Cosine similarity between two 1-D vectors
    dot_product = np.dot(a, b)
    norm_a = np.linalg.norm(a)
    norm_b = np.linalg.norm(b)
    return dot_product / (norm_a * norm_b)

# Cosine similarity of each prediction with its target
cos = []
for index, row in df.iterrows():
    # print(cos_sim(row['prediction_multil'], row['target_multil']))
    cos.append(cos_sim(row['prediction_multil'], row['target_multil']))
df['cos_pred_target'] = [value for value in cos]
statistics.mean(df['cos_pred_target'])
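As a cross-check on the loop above, the same per-row comparison can be sketched in vectorized NumPy. The sketch assumes the Keras metric L2-normalizes both vectors and then sums their product along axis 1, so that an all-zero vector yields a similarity of 0 rather than NaN. The function name `rowwise_cosine`, the `eps` clamp, and the toy arrays are hypothetical and only for illustration.

```python
import numpy as np

def rowwise_cosine(pred, target, eps=1e-12):
    """Cosine similarity of each row of `pred` with the same row of `target`.

    Assumption: this mirrors an L2-normalize-then-multiply formulation,
    with the squared norm clamped at `eps` so zero vectors give 0.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Normalize every row to unit length (zero rows stay zero)
    pred_sq = np.sum(pred ** 2, axis=1, keepdims=True)
    target_sq = np.sum(target ** 2, axis=1, keepdims=True)
    pred_n = pred / np.sqrt(np.maximum(pred_sq, eps))
    target_n = target / np.sqrt(np.maximum(target_sq, eps))
    # Dot product of the normalized rows
    return np.sum(pred_n * target_n, axis=1)

# Toy data (hypothetical): per-row similarities are 1, 1/sqrt(2), -1
pred = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])
target = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, -1.0]])
sims = rowwise_cosine(pred, target)
mean_sim = float(sims.mean())
print(sims, mean_sim)
```

With the arrays from the question, the equivalent call would be `rowwise_cosine(prediction_mul, target_tr_r_array).mean()`. If that value agrees with the loop-based mean but not with the reported metric, the discrepancy would seem to lie in how the metric is accumulated rather than in the manual calculation.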
Do you know what I might be doing wrong? Thanks in advance :)