I'm trying to implement the binary cross-entropy loss myself instead of using the Keras function. Here is my code:

import tensorflow as tf

def softmax_fn(val):
  # normalizes across all elements so the outputs sum to 1
  return tf.math.exp(val) / tf.math.reduce_sum(tf.math.exp(val))

def bce_fn(y_true, y_pred):
  y_pred_softmax = softmax_fn(y_pred)
  bce_loss = tf.cast(y_true, tf.float32) * tf.math.log(y_pred_softmax) + (1.0 - tf.cast(y_true, tf.float32)) * tf.math.log(1.0 - y_pred_softmax)
  return -tf.math.reduce_mean(bce_loss)

My problem is that the output of my loss does not match the Keras one:

# keras loss
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

y_true = [1.0, 1.0, 0.0, 0.0]
y_pred = [1.0, 0.0, 1.0, 0.0]

print(cross_entropy(y_true,y_pred))  # 0.75320446
print(bce_fn(y_true,y_pred))  # 0.903049

Could someone explain why this is happening?

Edit

I found the error: with from_logits=True, the built-in loss treats each prediction as an independent logit and converts it to a probability with a sigmoid, not with a softmax over all predictions. This discussion helped me.

def bce_fn(y_true, y_pred):
  y_pred_sigmoid = tf.math.sigmoid(y_pred)  # sigmoid activation
  bce_loss = tf.math.reduce_mean(tf.cast(y_true, tf.float32) * -tf.math.log(y_pred_sigmoid) + (1 - tf.cast(y_true, tf.float32) ) * -tf.math.log(1 - y_pred_sigmoid))
  return bce_loss

Now the built-in function and my custom function give the same output.
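
To see where the two numbers come from without TensorFlow, here is a minimal sketch in plain Python that reproduces both values for the same inputs. The helper names (sigmoid, softmax, mean_bce) are made up for this illustration and are not part of any library. The softmax couples the four predictions so that they sum to 1, while the sigmoid maps each logit to a probability on its own, which is what binary cross-entropy expects.

```python
import math

def sigmoid(x):
    # maps a single logit to a probability, independently of other logits
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    # normalizes across ALL logits, so the resulting probabilities sum to 1
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mean_bce(y_true, probs):
    # mean binary cross-entropy over already-computed probabilities
    terms = [-(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
             for y, p in zip(y_true, probs)]
    return sum(terms) / len(terms)

y_true = [1.0, 1.0, 0.0, 0.0]
logits = [1.0, 0.0, 1.0, 0.0]

loss_softmax = mean_bce(y_true, softmax(logits))
loss_sigmoid = mean_bce(y_true, [sigmoid(x) for x in logits])

print(f"{loss_softmax:.4f}")  # 0.9030, the value from the softmax version
print(f"{loss_sigmoid:.4f}")  # 0.7532, the value Keras reports
```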

1 Answer

Thank you, Alexandru Ropotica, for the update. For the benefit of the community, I am posting the solution in the answer section.

# Custom implementation matching BinaryCrossentropy(from_logits=True):

import tensorflow as tf

def bce_fn(y_true, y_pred):
  y_pred_sigmoid = tf.math.sigmoid(y_pred)  # convert each logit to a probability
  bce_loss = tf.math.reduce_mean(tf.cast(y_true, tf.float32) * -tf.math.log(y_pred_sigmoid) + (1 - tf.cast(y_true, tf.float32) ) * -tf.math.log(1 - y_pred_sigmoid))
  return bce_loss

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

y_true = [1.0, 1.0, 0.0, 0.0]
y_pred = [1.0, 0.0, 1.0, 0.0]

print(cross_entropy(y_true,y_pred))  
print(bce_fn(y_true,y_pred))

Output:

tf.Tensor(0.75320446, shape=(), dtype=float32)
tf.Tensor(0.75320446, shape=(), dtype=float32)
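
One caveat with taking the log of a sigmoid directly: for very large or very small logits the sigmoid saturates to exactly 1.0 or 0.0 in floating point, and the log then fails or returns infinity. The TensorFlow documentation for tf.nn.sigmoid_cross_entropy_with_logits describes an algebraically equivalent form, max(x, 0) - x * z + log(1 + exp(-|x|)), that avoids this. The following is a rough plain-Python sketch of that formula, not the library's actual code; the function name stable_bce_with_logits is hypothetical.

```python
import math

def stable_bce_with_logits(y_true, logits):
    # per-element loss: max(x, 0) - x * z + log(1 + exp(-|x|))
    # x is the logit, z is the 0/1 label; exp() only ever sees a non-positive argument
    terms = [max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))
             for z, x in zip(y_true, logits)]
    return sum(terms) / len(terms)

loss = stable_bce_with_logits([1.0, 1.0, 0.0, 0.0], [1.0, 0.0, 1.0, 0.0])
print(f"{loss:.4f}")  # 0.7532, same as the sigmoid-based version

# a logit of 1000 with label 0: log(1 - sigmoid(1000)) would be log(0),
# but the rearranged form stays finite
extreme = stable_bce_with_logits([0.0], [1000.0])
print(extreme)  # 1000.0
```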