I have set up Caffe and am using the FCN-8s model, with a small change to the number of output classes:

layer {
  name: "score_5classes"
  type: "Convolution"
  bottom: "score"
  top: "score_5classes"
  convolution_param {
    num_output: 2
    pad: 0
    kernel_size: 1
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "score_5classes"
  bottom: "label"
  top: "loss"
  loss_param {
    normalize: true
  }
}

I changed the last layer's num_output to 2 because I want to classify my input images into 2 classes, 0 and 1. (So it seems I need 2 outputs, but I can't understand why. Couldn't the output be a single matrix of zeros and ones?)

So my questions are:

1. Should I sum these 2 outputs? I need a single output.

2. The loss is very small, even when the output is far from the desired one. How does Caffe calculate the loss layer?

Thanks

Shai
mehdi

1 Answer

When doing binary classification, using "SoftmaxWithLoss" with two outputs is mathematically equivalent to using "SigmoidCrossEntropyLoss" with one output. So, if you really need only one output, you can set your last layer to num_output: 1 and use "SigmoidCrossEntropyLoss". However, if you want to take advantage of Caffe's "Accuracy" layer, you need to use two outputs and a "SoftmaxWithLoss" layer.
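The equivalence can be checked numerically for a single pixel. The following is a minimal sketch in plain Python, not Caffe code: the function names and the scores `z0`, `z1` are made up for illustration, and it assumes the single sigmoid logit is taken as the difference `z1 - z0` of the two softmax scores.

```python
import math

def softmax2(z0, z1):
    """Softmax over two raw scores; returns (p0, p1)."""
    m = max(z0, z1)  # subtract the max for numerical stability
    e0, e1 = math.exp(z0 - m), math.exp(z1 - m)
    return e0 / (e0 + e1), e1 / (e0 + e1)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax_loss(z0, z1, label):
    """Cross-entropy of the two-channel softmax for one pixel."""
    return -math.log(softmax2(z0, z1)[label])

def sigmoid_loss(x, label):
    """Binary cross-entropy of a single-logit sigmoid for one pixel."""
    p = sigmoid(x)
    return -math.log(p if label == 1 else 1.0 - p)

# Hypothetical raw scores for one pixel (illustrative values only).
z0, z1 = -0.7, 1.9
p0, p1 = softmax2(z0, z1)

# The class-1 softmax probability equals the sigmoid of the score difference,
# and the two losses agree for either label.
print(p1, sigmoid(z1 - z0))
print(softmax_loss(z0, z1, 1), sigmoid_loss(z1 - z0, 1))
print(softmax_loss(z0, z1, 0), sigmoid_loss(z1 - z0, 0))
```

Each printed pair should match up to floating-point rounding, which is why the two-output softmax carries no more information than a single sigmoid output.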

Regarding your questions:
1. If you opt to use "SoftmaxWithLoss" and you need only one output, take the second output for each pixel, as this entry represents the probability of class 1.
I'll leave it to you as an exercise to figure out what you'll get if you take the sum (hint: `"Softmax"` outputs probabilities...)
2. The loss is very small most likely because you have severe class imbalance: most of your pixels are 0 while only very few are 1 (or vice versa), so always predicting 0 does not incur a large penalty. If this is your case, I suggest looking at Focal Loss, which addresses this issue.
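The class-imbalance effect in point 2 can be illustrated with a small calculation. This is a sketch in plain Python under made-up assumptions: a hypothetical image of 1000 pixels with only 10 foreground pixels, a degenerate network that scores every pixel as background, and a loss averaged over all pixels (roughly what `normalize: true` is expected to do when no labels are ignored).

```python
import math

def mean_softmax_loss(scores, labels):
    """Mean per-pixel cross-entropy of a two-channel softmax."""
    total = 0.0
    for (z0, z1), y in zip(scores, labels):
        m = max(z0, z1)
        # log of the softmax normalizer, computed stably
        log_norm = m + math.log(math.exp(z0 - m) + math.exp(z1 - m))
        total += log_norm - (z0, z1)[y]
    return total / len(labels)

# Hypothetical image: 1000 pixels, only 10 of them are class 1.
labels = [1] * 10 + [0] * 990
# A degenerate net that scores every pixel as confidently background.
scores = [(3.0, -3.0)] * 1000

loss = mean_softmax_loss(scores, labels)
accuracy = sum((z1 > z0) == (y == 1)
               for (z0, z1), y in zip(scores, labels)) / len(labels)

print(loss)         # about 0.062, far below log(2) ~ 0.693 for a 50/50 guess
print(accuracy)     # 0.99, although every foreground pixel is missed
```

The ten misclassified pixels each cost about 6, but averaged over 1000 pixels they add only about 0.06 to the loss, so the number looks small even though the output is useless for the rare class.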

Shai
  • Thanks for your answer. By "sum" I meant somehow merging them, but that's wrong; the second output is the right output, as you said. If I use SigmoidCrossEntropyLoss and one output, can't I use the Accuracy layer? – mehdi Jul 30 '18 at 08:15
  • @mehdi AFAIK `"Accuracy"` layer is implemented with `"Softmax"` predictions in mind. – Shai Jul 30 '18 at 08:22
  • Aha, OK, thanks. Now I'm just using the second output, and it has negative and positive values. I don't really know why it produces negative and positive values instead of 0s and 1s, but I replace the negative ones with 0 and the positive ones with 1, and it seems right. – mehdi Jul 30 '18 at 09:53
  • @mehdi look at [this answer](https://stackoverflow.com/a/33773152/1714410): when predicting you need to replace `"SoftmaxWithLoss"` with a simple `"Softmax"` in order to **explicitly** get probabilities from predictions. – Shai Jul 30 '18 at 10:03
  • The linked answer is about the difference between the deploy and train prototxt files, and the deploy file should have a simple "Softmax". The problem I mentioned is for "train.prototxt" and "SoftmaxWithLoss". – mehdi Jul 30 '18 at 10:11
  • Could you update the answer and explain in more detail why "SoftmaxWithLoss" needs at least 2 output classes? Thanks – mehdi Jul 30 '18 at 16:03
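On the two points raised in the comments above, the following plain-Python sketch (not Caffe code; the helper names and score values are invented for illustration) shows why a softmax over a single channel cannot learn anything, and why thresholding the raw second-channel score at 0 is not guaranteed to match the softmax decision.

```python
import math

def softmax(scores):
    """Softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# With one channel, the "probability" is 1 whatever the score is,
# so the cross-entropy -log(1) is always 0 and carries no training signal.
single = [softmax([s])[0] for s in (-5.0, 0.0, 5.0)]
print(single)

def predict(z0, z1):
    """Softmax decision for two channels: the class with the larger score."""
    return 1 if z1 > z0 else 0

def threshold_second(z1):
    """The shortcut from the comments: the sign of the second raw score."""
    return 1 if z1 > 0 else 0

# The two rules can disagree, because only the difference z1 - z0 matters.
print(predict(-2.0, -1.0), threshold_second(-1.0))
print(predict(3.0, 1.0), threshold_second(1.0))
```

The raw scores are unnormalized, which is why they are negative and positive rather than 0s and 1s. On a trained network the two rules may often agree, but comparing the two channels (or applying `"Softmax"` and thresholding the class-1 probability at 0.5) is the safer way to get labels.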