Weighted categorical cross entropy semantic segmentation

Question

I want to use an FCN (a kind of U-Net) for semantic segmentation. I implemented it in Python with Keras on the TensorFlow backend. Now that I have good results, I'm trying to improve them, and I think one way to do so is to improve my loss computation. I know that the classes in my output are imbalanced, so using the default categorical_crossentropy function can be a problem. My model inputs and outputs are both float32; the input is channels_first and the output is channels_last (the permutation is done at the end of the model). In the binary case, when I only want to segment one class, I changed the loss function as follows so that it computes the weights case by case, depending on the content of the output:

import tensorflow as tf

def weighted_loss(th):
    # th: activation-threshold offset (see the note below); use as loss=weighted_loss(th)
    def weighted_binary_cross_entropy(y_true, y_pred):
        # w = fraction of positive pixels in this batch
        w = tf.reduce_sum(y_true) / tf.cast(tf.size(y_true), tf.float32)
        real_th = 0.5 - th
        tf_th = tf.fill(tf.shape(y_pred), real_th)
        tf_zeros = tf.fill(tf.shape(y_pred), 0.)
        # shifted sigmoid activation, clipped below at 0
        p = tf.maximum(tf_zeros, tf.sigmoid(y_pred) + tf_th)
        # tf.log is tf.math.log in TensorFlow 2.x
        return ((1.0 - w) * y_true * -tf.log(p) +
                (1 - y_true) * w * -tf.log(1 - p))
    return weighted_binary_cross_entropy

Note that th is the activation threshold, which by default is 1/nClasses and which I have changed in order to see which value gives me the best results. What do you think about it? How could I change it so that it computes the weighted categorical cross entropy (in the multi-class case)?

thefifthjack005 · Answer 1 · 2018-07-17T04:46:22.643

Your implementation will work for the binary case. For the multi-class case it will just be

  -y_true * tf.log(tf.sigmoid(y_pred))

Use the built-in TensorFlow method for calculating the categorical cross entropy, as it avoids overflow for y_pred < 0.
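To illustrate why a hand-written log(sigmoid(x)) is fragile for strongly negative logits, here is a minimal framework-free sketch in plain Python. The function names are hypothetical and this is not TensorFlow's implementation; it only shows one common rewriting that keeps the exponent non-positive.

```python
import math

def naive_log_sigmoid(x):
    # Direct translation of log(sigmoid(x)); exp(-x) blows up for very negative x.
    return math.log(1.0 / (1.0 + math.exp(-x)))

def stable_log_sigmoid(x):
    # Equivalent form: log(sigmoid(x)) = min(x, 0) - log(1 + exp(-|x|)).
    # The argument of exp is never positive, so it cannot overflow.
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))

# Both forms agree for moderate inputs.
print(naive_log_sigmoid(2.0), stable_log_sigmoid(2.0))

# For a strongly negative logit only the stable form returns a finite value.
print(stable_log_sigmoid(-1000.0))
```

In plain Python the naive version raises OverflowError at x = -1000; in float32 tensor code the same expression would typically end up as log(0) instead.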

You can also look at the answer "Unbalanced data and weighted cross entropy", which explains a weighted categorical cross entropy implementation.

The only change for categorical_crossentropy would be

def weighted_loss():
    # factory: use as loss=weighted_loss()
    def weighted_categorical_cross_entropy(y_true, y_pred):
        # y_true: one-hot labels, y_pred: raw logits (no softmax applied yet)
        w = tf.reduce_sum(y_true) / tf.cast(tf.size(y_true), tf.float32)
        loss = w * tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=y_pred)
        return loss
    return weighted_categorical_cross_entropy
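As a rough illustration of weights computed per sample rather than fixed in advance, here is a framework-free sketch in plain Python. It assumes one particular generalisation of the binary weighting from the question: each class c is weighted by 1 - freq_c, where freq_c is the fraction of pixels labelled c in that sample. That choice, and all function names, are assumptions for illustration and not part of the code above.

```python
import math

def softmax(logits):
    # Subtract the maximum first so exp() never sees a large positive value.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_weights(labels, num_classes):
    # labels: flat list of integer class ids for one sample.
    # Rare classes get a weight close to 1, frequent classes close to 0.
    n = len(labels)
    return [1.0 - labels.count(c) / n for c in range(num_classes)]

def weighted_categorical_cross_entropy(labels, logits):
    # logits: one list of raw class scores per pixel, aligned with labels.
    num_classes = len(logits[0])
    weights = class_weights(labels, num_classes)
    losses = [-weights[c] * math.log(softmax(z)[c])
              for c, z in zip(labels, logits)]
    return sum(losses) / len(losses)

# Four pixels, two classes, class 1 is rare; all logits are zero.
labels = [0, 0, 0, 1]
logits = [[0.0, 0.0]] * 4
print(class_weights(labels, 2))  # [0.25, 0.75]
print(weighted_categorical_cross_entropy(labels, logits))
```

With uniform predictions every pixel contributes log(2) before weighting, so the mean loss here is 0.375 * log(2), and the single rare-class pixel accounts for half of it.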

Extracting the prediction for an individual class:

def loss(y_true, y_pred):
    s = tf.shape(y_true)

    # assumes the class axis is the last one (channels_last)
    num_classes = s[-1]

    # one-hot encoding of the predicted (argmax) class at each pixel
    clf_pred = tf.one_hot(tf.argmax(y_pred, axis=-1), depth=num_classes, axis=-1)

    # keep y_pred where that class has the highest score, zero elsewhere
    prediction = tf.where(tf.equal(clf_pred, 1), y_pred, tf.zeros_like(y_pred))

    # slice out the first class: y_pred where it is the predicted class, 0 otherwise
    class1_prediction = prediction[:, :, :, 0:1]
    # compute your per-class loss here; for simplicity this just returns class1_prediction
    return class1_prediction

Output from the model:

y_pred = [[[[0.5, 0.3, 0.7],
            [0.6, 0.3, 0.2]],
           [[0.7, 0.9, 0.6],
            [0.3, 0.9, 0.3]]]]

Corresponding ground truth:

y_true = [[[[0, 1, 0],
            [1, 0, 0]],
           [[1, 0, 0],
            [0, 1, 0]]]]

Prediction for class 1:

prediction = loss(y_true, y_pred)
# prediction = [[[[0. ], [0.6]], [[0. ], [0. ]]]]
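The argmax and masking steps can also be checked by hand without TensorFlow. The following is a minimal plain-Python sketch on nested lists; the function name is hypothetical and it is only meant to mirror the logic of loss() on the small example, not to replace it.

```python
def winning_class_scores(y_pred, class_index):
    """Keep each pixel's score for class_index only where that class is the argmax."""
    result = []
    for image in y_pred:
        out_image = []
        for row in image:
            out_row = []
            for scores in row:
                # index of the first maximum, which is how argmax breaks ties
                winner = scores.index(max(scores))
                kept = scores[class_index] if winner == class_index else 0.0
                out_row.append([kept])
            out_image.append(out_row)
        result.append(out_image)
    return result

y_pred = [[[[0.5, 0.3, 0.7],
            [0.6, 0.3, 0.2]],
           [[0.7, 0.9, 0.6],
            [0.3, 0.9, 0.3]]]]

print(winning_class_scores(y_pred, 0))
# [[[[0.0], [0.6]], [[0.0], [0.0]]]]
```

Only the second pixel of the first row has the first class as its highest score, so it is the only non-zero entry.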

I agree with you, but I don't want to have the weights as prior knowledge. I want them to be computed directly during training, because the weights can be very different from one sample to another in the training set — Fou, Jul 16 '18 at 12:45
That's why I have used the tensorflow reduce_sum function in the code I've posted, to compute the weights per case during the process — Fou, Jul 16 '18 at 12:47
I have changed it and it still seems to give acceptable results, but I have no idea whether what I have done is correct. By the way, I have a custom `softmax_cross_entropy_with_logits` function that incorporates the activation threshold, which is 1/nClasses by default (I guess) — Fou, Jul 16 '18 at 13:00
I've changed the code in order to explain this part in the binary case (it may be wrong, but it seems to work well) — Fou, Jul 16 '18 at 13:08
If I understand correctly, you want the output for each of your classes, and then to apply individual weights to each class? If that's the case, check my edited answer — thefifthjack005, Jul 17 '18 at 02:27
