I'm trying to train a convolutional neural network with triplet loss (more about triplet loss here) to generate face embeddings (vectors of 128 values that describe a face).
To select only semi-hard triplets (distance(anchor, positive) < distance(anchor, negative)), I first feed every triplet in a mini-batch through the network and compute the distances:
distance1, distance2 = sess.run([d_pos, d_neg], feed_dict={x_anchor:input1, x_positive:input2, x_negative:input3})
Then I select the indices of the triplets whose distances satisfy the condition above:
valids_batch = compute_valids(distance1, distance2, batch_size)
The function compute_valids:
def compute_valids(distance1, distance2, batch_size):
    valids = []
    for q in range(len(distance1)):
        if distance1[q] < distance2[q]:
            valids.append(q)
    return valids
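For reference, the same filter can be written without the Python loop. This is only a sketch in NumPy: `compute_valids_np` and its `margin` argument are hypothetical names that do not appear in my code. The optional `margin` branch shows the stricter condition usually meant by "semi-hard", where the negative is farther than the positive but still within the margin, in case that is relevant to the problem.

```python
import numpy as np

def compute_valids_np(d_pos, d_neg, margin=None):
    """Return the indices i where d_pos[i] < d_neg[i].

    If margin is given (hypothetical parameter), also require
    d_neg[i] < d_pos[i] + margin, the usual semi-hard condition.
    """
    d_pos = np.asarray(d_pos, dtype=float)
    d_neg = np.asarray(d_neg, dtype=float)
    mask = d_pos < d_neg
    if margin is not None:
        mask &= d_neg < d_pos + margin
    return np.flatnonzero(mask)

# Toy distances for four triplets (made-up numbers, not real data).
d_pos = np.array([0.1, 0.9, 0.3, 0.2])
d_neg = np.array([0.5, 0.2, 0.4, 1.5])

valid = compute_valids_np(d_pos, d_neg)                   # same rule as compute_valids
semi_hard = compute_valids_np(d_pos, d_neg, margin=0.2)   # stricter rule

# An index array can select rows of a NumPy batch directly.
batch = np.zeros((4, 2))
selected = batch[valid]
```

With the plain condition, every triplet that is already ordered correctly passes the filter, including ones where the negative is very far away.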
Then I train only on the examples whose indices this filter function returns:
input1_valid = [input1[q] for q in valids_batch]
input2_valid = [input2[q] for q in valids_batch]
input3_valid = [input3[q] for q in valids_batch]
_, loss_value, summary = sess.run([optimizer, cost, summary_op], feed_dict={x_anchor:input1_valid, x_positive:input2_valid, x_negative:input3_valid})
Where optimizer is defined as:
model1 = siamese_convnet(x_anchor)
model2 = siamese_convnet(x_positive)
model3 = siamese_convnet(x_negative)
d_pos = tf.reduce_sum(tf.square(model1 - model2), 1)
d_neg = tf.reduce_sum(tf.square(model1 - model3), 1)
cost = triplet_loss(d_pos, d_neg)
optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(cost)
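The body of triplet_loss is not shown above. To make the setup concrete, here is a NumPy sketch of one common form, a hinge on d_pos - d_neg plus a margin. The names `squared_distance` and `triplet_loss_np` and the margin value 0.2 are assumptions for illustration, not necessarily what my triplet_loss does.

```python
import numpy as np

def squared_distance(a, b):
    # Row-wise squared Euclidean distance, mirroring
    # tf.reduce_sum(tf.square(a - b), 1).
    return np.sum(np.square(a - b), axis=1)

def triplet_loss_np(d_pos, d_neg, margin=0.2):
    # Hinge form (assumed): a triplet contributes only while the negative
    # is not yet farther than the positive by at least `margin`.
    return np.mean(np.maximum(d_pos - d_neg + margin, 0.0))

# Two toy triplets with 2-dimensional embeddings (made-up values).
anchor = np.array([[0.0, 0.0], [1.0, 1.0]])
positive = np.array([[0.0, 1.0], [1.0, 1.0]])
negative = np.array([[1.0, 0.0], [3.0, 1.0]])

d_pos = squared_distance(anchor, positive)
d_neg = squared_distance(anchor, negative)
loss = triplet_loss_np(d_pos, d_neg)
```

Under this assumed form, any triplet with d_pos + margin < d_neg contributes zero loss, so it produces no gradient even if it passes the filter.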
But something is wrong: the accuracy is very low (50%).
What am I doing wrong?