The training dataset contains two classes, A and B, which we represent as 1 and 0 respectively in our target labels. Our label data is heavily skewed towards class 0, which makes up roughly 95% of the data, while class 1 is only 5%. How should we construct our loss function in such a case?
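For context, a common heuristic is inverse-frequency weighting, so that each class contributes roughly equally to the loss. A minimal sketch (numpy, with made-up labels matching the skew above; the variable names are my own):

import numpy as np

# Toy labels with the ~95/5 skew described above (values are made up).
labels = np.array([0] * 95 + [1] * 5)

# Inverse-frequency heuristic: weight each class by 1 / frequency,
# so the rare class 1 gets ~20x the weight of class 0.
freq = np.bincount(labels) / len(labels)      # [0.95, 0.05]
class_weights = 1.0 / freq                    # [~1.05, 20.0]
per_example_weights = class_weights[labels]   # one weight per sample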
I found that TensorFlow has a loss function that can be used with weights:
tf.losses.sigmoid_cross_entropy
From the documentation, weights "acts as a coefficient for the loss. If a scalar is provided, then the loss is simply scaled by the given value."
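Out of curiosity, a quick sanity check of that sentence (TF 1.x, with made-up targets and logits): a scalar weights really is just a constant factor on the reduced loss.

import tensorflow as tf

# Made-up tensors just to probe the scalar-weight semantics.
targets = tf.constant([[0.], [0.], [1.]])
logits = tf.constant([[-2.], [-1.], [0.5]])

base = tf.losses.sigmoid_cross_entropy(targets, logits)                # weights=1.0
scaled = tf.losses.sigmoid_cross_entropy(targets, logits, weights=2.0)

with tf.Session() as sess:
    b, s = sess.run([base, scaled])
    print(b, s)  # expect s ≈ 2 * b: the whole loss is scaled, nothing per-class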
Sounds good. I set weights to 2.0 to make the loss higher and punish errors more.
loss = loss_fn(targets, cell_outputs, weights=2.0, label_smoothing=0)
However, not only did the loss not go down, it increased, and the final accuracy on the dataset decreased slightly. OK, maybe I misunderstood and it should be < 1.0, so I tried a smaller number. This didn't change anything: I got almost the same loss and accuracy. O_o
Needless to say, the same network trained on the same dataset but with a loss weight of 0.3 reduces the loss significantly, up to 10x, in Torch / PyTorch.
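For reference, the closest PyTorch analogue I can reconstruct (a sketch with assumed tensor names and shapes; binary_cross_entropy_with_logits is my stand-in for whatever loss that run actually used):

import torch
import torch.nn.functional as F

# Made-up batch mirroring the TF snippet above.
logits = torch.randn(8, 1)                     # stand-in for cell_outputs
targets = torch.randint(0, 2, (8, 1)).float()  # stand-in for targets

# `weight` rescales each element's loss; a constant 0.3 everywhere is the
# closest analogue of a scalar TF weight.
weight = torch.full_like(targets, 0.3)
loss = F.binary_cross_entropy_with_logits(logits, targets, weight=weight)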
Can somebody please explain how to use loss weights in TensorFlow?