Here is a short overview of my triplet-learning setup. I'm using three convolutional neural networks with shared weights to generate face embeddings (anchor, positive, negative), with the loss shown below.
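Concretely, the weight sharing just means the same embedding network is applied to all three inputs. A minimal sketch of what I mean (the embed_net definition and the input shapes here are only illustrative, not my actual model):

import tensorflow as tf

def embed_net(x):
    # toy embedding network, just for illustration; the real model is a deeper CNN
    h = tf.layers.conv2d(x, 32, 3, activation=tf.nn.relu, name="conv1")
    h = tf.layers.flatten(h)
    return tf.layers.dense(h, 128, name="fc")

# placeholder shapes are examples only
anchor_images   = tf.placeholder(tf.float32, [None, 96, 96, 3])
positive_images = tf.placeholder(tf.float32, [None, 96, 96, 3])
negative_images = tf.placeholder(tf.float32, [None, 96, 96, 3])

# apply the same network three times; reuse_variables() makes the weights shared
with tf.variable_scope("embedding") as scope:
    anchor_output = embed_net(anchor_images)        # shape [None, 128]
    scope.reuse_variables()
    positive_output = embed_net(positive_images)    # shape [None, 128]
    negative_output = embed_net(negative_images)    # shape [None, 128]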
Triplet loss:
import tensorflow as tf

anchor_output = ...    # shape [None, 128]
positive_output = ...  # shape [None, 128]
negative_output = ...  # shape [None, 128]
margin = ...           # scalar margin hyperparameter

# squared L2 distances between anchor/positive and anchor/negative embeddings
d_pos = tf.reduce_sum(tf.square(anchor_output - positive_output), 1)
d_neg = tf.reduce_sum(tf.square(anchor_output - negative_output), 1)

# hinge on the margin, then average over the batch
loss = tf.maximum(0., margin + d_pos - d_neg)
loss = tf.reduce_mean(loss)
When I select only the hard triplets (those where distance(anchor, positive) < distance(anchor, negative)), the loss is very small: 0.08.
When I select all triplets, the loss becomes larger: 0.17855. These are just test values for 10,000 triplets, but I get similar results on the actual set (600,000 triplets).
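To be clear, by "selecting" I mean something along these lines before the final averaging (just a sketch of the filtering, not necessarily my exact code):

mask = d_pos < d_neg                                    # triplets that pass the selection
selected = tf.boolean_mask(margin + d_pos - d_neg, mask)
loss = tf.reduce_mean(tf.maximum(0., selected))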
Why does this happen? Is it correct?
I'm using SGD with momentum, starting with a learning rate of 0.001.
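The optimizer setup looks roughly like this (the momentum value shown is only an example, not my exact hyperparameter):

optimizer = tf.train.MomentumOptimizer(learning_rate=0.001, momentum=0.9)
train_op = optimizer.minimize(loss)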