I'm trying to train a net for semantic segmentation on data with class imbalance. To account for this, I tried to use the InfogainLoss layer and specified the infogain_matrix as was posted here, except that I used 1 - frequency(class) for each diagonal element.
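For reference, a minimal sketch of how I construct that matrix (the class frequencies below are made-up placeholders, not my actual values):

    import numpy as np

    # Hypothetical per-class pixel frequencies measured on the training set;
    # class 0 ("unknown") dominates in my data.
    freq = np.array([0.90, 0.05, 0.03, 0.02], dtype=np.float32)
    num_classes = freq.shape[0]

    # Diagonal infogain matrix: H[k, k] = 1 - frequency(class k),
    # so frequent classes contribute less to the loss.
    H = np.zeros((num_classes, num_classes), dtype=np.float32)
    np.fill_diagonal(H, 1.0 - freq)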
When training the net, however, both accuracy and loss immediately converge to 0, even with a low base_lr, and the net labels everything as class 0 ("unknown"). My question is whether the infogain_matrix should be specified as in the post I linked, and if so, what else could cause this unusual behaviour of the net (I expected either loss 0 with accuracy 1, or loss INF with accuracy 0).
Edit:
So when I run the net with the SoftmaxWithLoss layer instead of InfogainLoss, it immediately starts classifying everything as the most frequent class (class 1, at 90%) and doesn't change any more. My guess now is that I configured the lmdb for the infogain_matrix incorrectly. Does anybody know whether one has to specify the dtype of the lmdb for the Caffe data layer (my images and infogain_matrix are stored as float32)? The Caffe documentation for the layer doesn't say. Or, more generally: what dtypes does the Caffe data layer expect from the lmdb?
The lmdbs were generated using code taken/modified from here, but for the images mean subtraction was performed beforehand. I tested the lmdb readout in Python, and there I had to specify the dtype explicitly, since otherwise reshaping into the original matrix dimensions threw errors.
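The readout test looked roughly like this (path, key and shape are placeholders for my actual setup; the point is the explicit float32 dtype):

    import lmdb
    import numpy as np
    from caffe.proto import caffe_pb2

    env = lmdb.open('infogain_lmdb', readonly=True)
    with env.begin() as txn:
        raw = txn.get(b'00000000')   # first key written by my generator
        datum = caffe_pb2.Datum()
        datum.ParseFromString(raw)
        # Without dtype=np.float32, frombuffer defaults to float64, the
        # element count comes out wrong, and the reshape below fails.
        arr = np.frombuffer(datum.data, dtype=np.float32)
        arr = arr.reshape(datum.channels, datum.height, datum.width)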
Edit2:
So the mistake was indeed in the lmdb definition: for dtype=float, the data needs to be appended to datum.float_data instead of datum.data, see here. Now everything looks okay and the accuracies and losses are no longer whacky :)
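For anyone hitting the same issue, a minimal sketch of the corrected write path (path, keys and map_size are placeholders for my actual setup):

    import lmdb
    import numpy as np
    from caffe.proto import caffe_pb2

    def write_float_lmdb(path, arrays):
        # arrays: iterable of float32 ndarrays shaped (channels, height, width)
        env = lmdb.open(path, map_size=int(1e9))
        with env.begin(write=True) as txn:
            for i, arr in enumerate(arrays):
                arr = arr.astype(np.float32)
                datum = caffe_pb2.Datum()
                datum.channels, datum.height, datum.width = arr.shape
                # The fix: floats go into the repeated float_data field;
                # datum.data is for raw uint8 bytes only.
                datum.float_data.extend(arr.flatten().tolist())
                txn.put('{:08d}'.format(i).encode('ascii'),
                        datum.SerializeToString())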