
I have a CNN for a multilabel classification problem, and as a loss function I use `tf.nn.sigmoid_cross_entropy_with_logits`.

From the cross-entropy equation I would expect the output to be probabilities for each class, but instead I get floats in (-∞, ∞).

After some googling I found that, due to an internal normalizing operation, each row of logits is interpretable as a probability before being fed into the equation.

I'm confused about how I can actually output the posterior probabilities instead of raw floats, in order to draw a ROC curve.

Mewtwo

1 Answer


`tf.sigmoid(logits)` gives you the probabilities.

You can see in the documentation of `tf.nn.sigmoid_cross_entropy_with_logits` that `tf.sigmoid` is the function that normalizes the logits to probabilities.
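A minimal sketch of the two-step pattern (TF1-style API, matching the era of this question; the variable names and the sklearn ROC step are illustrative assumptions, not from the original post): the loss op consumes the raw logits, while `tf.sigmoid` turns those same logits into the per-class probabilities you would score a ROC curve with.

```python
import numpy as np
import tensorflow as tf
from sklearn.metrics import roc_curve

# Raw (-inf, inf) outputs of the network's final layer, one column per label.
logits = tf.placeholder(tf.float32, [None, 3])
labels = tf.placeholder(tf.float32, [None, 3])

# The loss works on the raw logits directly (it applies the sigmoid internally).
loss = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits))

# For evaluation, squash the same logits into independent per-class probabilities.
probs = tf.sigmoid(logits)

with tf.Session() as sess:
    y_true = np.array([[1, 0, 1], [0, 1, 0]], dtype=np.float32)
    raw = np.array([[2.0, -1.0, 0.5], [-0.3, 1.2, -2.0]], dtype=np.float32)
    p = sess.run(probs, feed_dict={logits: raw})
    # One ROC curve per label, using the sigmoid outputs as scores.
    fpr, tpr, _ = roc_curve(y_true[:, 0], p[:, 0])
```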

GeertH
  • Assume we have [0.1, 10, 100] and [1, 100, 1000]. Applying sigmoid to them, the difference between 10 and 100 won't be larger than the difference between 100 and 1000, although the scenarios are equivalent in terms of their relative values? – Mewtwo Aug 23 '17 at 10:44
  • I don't understand your comment. Could you elaborate on it to make it more clear? `abs(sigmoid(10) - sigmoid(100)) >> abs(sigmoid(100) - sigmoid(1000))` but I don't understand how that is relevant to the answer/original question. – GeertH Aug 23 '17 at 11:36
  • I mean that if there are 3 choices with the associated values [0.1, 10, 100] (case 1) or [1, 100, 1000] (case 2), then by applying the sigmoid function the probabilities won't be assigned in the same way in cases 1 and 2, due to the difference I mentioned above. I may be wrong, it is just a thought. That's the reason why I rejected the sigmoid function and asked the question :) – Mewtwo Aug 23 '17 at 12:19
  • You are right that the probabilities won't be assigned in the same way. But this is desirable when you do multilabel classification (multiple labels can apply to the same instance) like you mentioned in your question. If an instance can have only one label, `tf.nn.softmax_cross_entropy_with_logits` is a better choice, in which case you use `tf.nn.softmax` to normalize to a probability distribution. – GeertH Aug 23 '17 at 12:32
  • Refreshing my knowledge of statistics, I was convinced that, according to the definition of the cumulative distribution function, sigmoid is indeed a probability function, and so I guess your answer meets my question :) – Mewtwo Aug 23 '17 at 12:58
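To make the distinction from the last two comments concrete, here is a small sketch using the numbers from the thread: `tf.sigmoid` squashes each logit independently, so every large logit saturates near 1, whereas `tf.nn.softmax` makes the entries of a row compete for a total mass of 1.

```python
import tensorflow as tf

# The two scenarios from the comment thread.
logits = tf.constant([[0.1, 10.0, 100.0],
                      [1.0, 100.0, 1000.0]])

# Multilabel view: each class gets its own independent probability.
independent = tf.sigmoid(logits)

# Single-label view: each row is normalized into one distribution.
distribution = tf.nn.softmax(logits)

with tf.Session() as sess:
    print(sess.run(independent))   # approx. [[0.52, 1.0, 1.0], [0.73, 1.0, 1.0]]
    print(sess.run(distribution))  # approx. [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]
```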