
I'm using fully convolutional networks for semantic segmentation in Caffe, using the Cityscapes dataset.

This script converts class IDs, and it says to set the IDs of classes to ignore to 255 and to "ignore these labels during training". How do we do that in practice? I mean, how do I 'tell' my network that 255 is not a true class like the other integers?

Thanks for giving me an intuition behind it.

Shai
MeanStreet
  • I don't know how to do that in Caffe, but have you considered actually stripping out those samples from the dataset before you start training? – Fred Apr 26 '18 at 13:18
  • I mentioned Caffe, but it's more of a conceptual question. Semantic segmentation means assigning a class to each pixel of an image, so my label is also an image, with a class for each pixel. Only parts of my images are meant to be ignored (like background zones...), so I can't just throw away an image because it contains background. – MeanStreet Apr 26 '18 at 13:22

1 Answer


With a loss layer such as "SoftmaxWithLoss", you can add loss_param { ignore_label: 255 } to tell Caffe to ignore this label:

layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "prediction"
  bottom: "labels_with_255_as_ignore"
  loss_weight: 1
  loss_param: { ignore_label: 255 }
}

I did not check it, but I believe ignore_label is also used by the "InfogainLoss" layer and some other loss layers.
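To give an intuition for what ignoring a label means, here is a minimal NumPy sketch of a softmax cross-entropy loss that skips pixels labeled 255. It is not Caffe's actual implementation: the function name `softmax_loss_with_ignore`, the `(C, H, W)` array layout, and the choice to average over the non-ignored pixels only are assumptions made for illustration.

```python
import numpy as np

IGNORE_LABEL = 255  # same value as ignore_label in the prototxt above


def softmax_loss_with_ignore(scores, labels, ignore_label=IGNORE_LABEL):
    """Mean cross-entropy over pixels whose label is not ignore_label.

    scores: float array of shape (C, H, W), raw class scores per pixel.
    labels: int array of shape (H, W), values in [0, C) or ignore_label.
    """
    # Numerically stable softmax over the class axis.
    shifted = scores - scores.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)

    valid = labels != ignore_label  # True where the pixel contributes to the loss
    if not valid.any():
        return 0.0  # every pixel is ignored, so there is nothing to penalize

    # Swap ignored labels for a dummy class index so the lookup stays in range;
    # those entries are dropped again by the mask below.
    safe_labels = np.where(valid, labels, 0)
    rows, cols = np.indices(labels.shape)
    p_true = probs[safe_labels, rows, cols]  # probability assigned to the true class

    per_pixel = -np.log(np.maximum(p_true, 1e-12))
    return float(per_pixel[valid].sum() / valid.sum())


# Hypothetical 2x2 label image with one ignored pixel and 3 real classes.
scores = np.zeros((3, 2, 2))
labels = np.array([[0, 1], [255, 2]])
loss = softmax_loss_with_ignore(scores, labels)
```

Whatever the network predicts at the pixel labeled 255 has no effect on `loss`; only the three labeled pixels are scored.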

Shai
  • Great, thanks. I still have a conceptual point I don't understand: what will my network 'learn' from these zones? For example, if my network predicts, for a car, a zone that overflows onto the background (the sky), the loss won't penalize it for that, although it's wrong. Once trained, it will still include sky when segmenting cars, won't it? – MeanStreet Apr 26 '18 at 13:32
  • @Mean-Street AFAIK you do not assign the "ignore" label to background, but rather to boundary regions where the annotations are not accurate, so if you miss the boundary by a few pixels you are not penalized (since it might be due to labeling inaccuracy). The background region/label is not ignored. BTW, using [`"InfogainLoss"`](https://stackoverflow.com/q/27632440/1714410) you can assign different "weights" to each mistake: one cost for predicting BG for object and another for predicting object in BG region... – Shai Apr 26 '18 at 13:37
  • In that case you're right. But let's say I want to build a network that segments only cars and pedestrians. Everything else will have a 'not car nor pedestrian' label, with a lot of big zones carrying that label. In that case, I should not ignore that 'not car nor pedestrian' label, right? Because of what I said before. But how will it behave for this class? It could be so many things that I feel the network won't be able to learn features for it. Would that be a problem? I hope I'm clear. – MeanStreet Apr 26 '18 at 13:52
  • @Mean-Street you are clear. There are two issues here: 1. The "not car/ped" class has large intra-class variation. 2. You have severe imbalance in your training data. Regarding the first issue, if your net is big enough it should usually not be a problem. The second issue is much harder; try reading about "focal loss". – Shai Apr 26 '18 at 14:04
  • @Shai Suppose I have 5 classes numbered 0 to 4, and I mark a region to be ignored with `ignore_label=255`, so that pixels labeled 255 get zero weight. Should the last classification layer of the FCN have `num_output: 5` or `num_output: 6`? Thanks – S.EB Sep 12 '18 at 07:51
  • @S.EB if you are using `ignore_label` then you do not predict these labels, they are only in the ground truth marking pixels to be ignored during training. – Shai Sep 12 '18 at 08:00
  • @Shai I am facing an issue, which is why I asked this question. At first I did not have any mask, and I set the ignore_label to `5` with `num_output: 5` (i.e., classes 0,1,2,3,4); the results looked good. However, once I defined a mask (a rectangle around the body region) and labeled everything outside the body region as `255`, which is the `ignore_label` (I trained a new model with the mask and ignore_label), the output prediction looks much worse instead of better. What could be a possible reason for this? – S.EB Sep 12 '18 at 09:10
  • @S.EB once you introduce `ignore_label` pixels you are training on fewer examples. Moreover, it might be the case that "interesting" pixels (in terms of contributing to learning) are now ignored. – Shai Sep 12 '18 at 09:12
  • @Shai, data is 3D, and the network does the augmentations on the valid labels area. – S.EB Sep 12 '18 at 09:22
  • @Shai Should I open a new question for this? – S.EB Sep 12 '18 at 09:31
  • @Shai Oh no, I got the message "You have reached your question limit" :( – S.EB Sep 12 '18 at 09:50
  • @S.EB I guess you need to follow the instructions [here](https://meta.stackexchange.com/q/86997/202617). – Shai Sep 12 '18 at 10:08
  • @Shai Thanks a lot for your help – S.EB Sep 12 '18 at 15:38
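Two points from the comments can be made concrete with a small NumPy sketch: ignored pixels send no gradient back to the network, and the classifier needs only as many outputs as there are real classes (5 for labels 0 to 4, with 255 appearing only in the ground truth). This is an illustration under assumptions, not Caffe's code: the function name `softmax_loss_grad`, the `(C, H, W)` layout, and the normalization by the number of non-ignored pixels are all hypothetical choices.

```python
import numpy as np


def softmax_loss_grad(scores, labels, ignore_label=255):
    """Gradient of the masked mean cross-entropy with respect to the scores.

    scores: (C, H, W) raw class scores.
    labels: (H, W) ints in [0, C) or ignore_label.
    """
    # Numerically stable softmax over the class axis.
    shifted = scores - scores.max(axis=0, keepdims=True)
    probs = np.exp(shifted)
    probs /= probs.sum(axis=0, keepdims=True)

    valid = labels != ignore_label
    safe_labels = np.where(valid, labels, 0)  # dummy index for ignored pixels
    rows, cols = np.indices(labels.shape)

    grad = probs.copy()
    grad[safe_labels, rows, cols] -= 1.0  # softmax output minus one-hot target
    grad *= valid                         # zero the gradient at ignored pixels
    return grad / max(int(valid.sum()), 1)


# Five real classes (0..4): the score volume has 5 channels, not 6.
rng = np.random.default_rng(0)
scores = rng.normal(size=(5, 2, 3))
labels = np.array([[0, 1, 255], [4, 255, 2]])
grad = softmax_loss_grad(scores, labels)
```

Here `grad[:, 0, 2]` and `grad[:, 1, 1]` are all zeros, so the weights are never pushed in any direction by those pixels. The network is neither rewarded nor penalized there, which is why large, informative regions should carry a real label rather than the ignore label.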