-1

I would like to know what the benefit is of adding a bias b to the softmax function in the case of CNNs.

enter image description here

David Parks
JaG
    Your question needs some refinement. The equation you're showing (you should type it, not link to an image) is a fully connected network operation, not a CNN. And neither of these seems to be directly related to softmax in the context of your question. Could you update your question with more detail please? – David Parks Aug 30 '18 at 19:50

1 Answer

0

The formula you linked is a standard affine transformation preceding the application of a pointwise nonlinearity, not the softmax activation function itself. If you'd like to know why a bias term is used in neural networks, please refer to this post: Role of Bias in Neural Networks
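
As a rough sketch of how the two pieces fit together, the snippet below applies an affine transformation `W @ x + b` and then a softmax. The original formula is only available as an image, so the form `z = Wx + b`, the names `x`, `W`, `b`, and the sizes and values are all assumptions chosen for illustration, not taken from the question.

```python
import numpy as np

def softmax(z):
    # Subtracting the max improves numerical stability and leaves the result unchanged.
    shifted = z - np.max(z)
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical sizes and values, for illustration only.
rng = np.random.default_rng(0)
x = rng.normal(size=4)           # input features (e.g. a flattened CNN feature map)
W = rng.normal(size=(3, 4))      # weights: 3 classes, 4 features
b = np.array([0.0, 2.0, -1.0])   # one bias per class

logits = W @ x + b               # affine transformation (the bias lives here)
probs = softmax(logits)          # normalized class probabilities

# With an all-zero input the logits equal the bias, so only the bias
# can make one class more likely than another.
probs_zero_input = softmax(W @ np.zeros(4) + b)
probs_no_bias = softmax(W @ np.zeros(4))
```

In this sketch the bias belongs to the affine step, not to the softmax itself: without it, an all-zero input would always give a uniform distribution over the classes.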

Pranav Vempati
  • Is the affine transformation simply applied before the softmax function? What is the purpose of using an affine transformation before a softmax function? – JaG Aug 31 '18 at 13:57
  • The softmax function deterministically maps unscaled logits (the output of the affine transformation) to a normalized probability distribution. Thus, the predictions emitted by a softmax activation function can be interpreted as class probabilities. – Pranav Vempati Aug 31 '18 at 15:40
  • Thanks for your precious help. – JaG Aug 31 '18 at 16:42