
I am having a bit of a problem implementing a soft cross-entropy loss in PyTorch.

I need to implement a weighted soft cross-entropy loss for my model, meaning the target value is a vector of probabilities as well, not a one-hot vector.

I tried using nn.KLDivLoss as suggested in a few forums, but it does not accept a weight vector, so I cannot use it.

In general, I'm a bit confused about how to create a custom loss function with PyTorch and how autograd follows a custom loss function, especially if after the model we apply some function that is not purely mathematical, for example mapping the output of the model to some vector and computing the loss on the mapped vector.

  • PyTorch uses backpropagation to compute the gradients of the loss function w.r.t. the trainable parameters. This is how it's [done](https://towardsdatascience.com/pytorch-autograd-understanding-the-heart-of-pytorchs-magic-2686cd94ec95). – Shai Aug 24 '21 at 12:47
  • Your question is too broad. If you're looking to understand the mechanism behind PyTorch automatic differentiation, you should first read more about it (there are many interesting articles online, including the one Shai linked); then you can come back and ask more specific questions. Since you mention [`nn.KLDivLoss`](https://pytorch.org/docs/stable/generated/torch.nn.KLDivLoss.html) in your question, if you're just looking for a builtin softmax cross-entropy, then [nn.CrossEntropyLoss](https://pytorch.org/docs/stable/generated/torch.nn.CrossEntropyLoss.html) should do the job! – Ivan Aug 24 '21 at 13:59
  • PyTorch's cross-entropy loss can't be used in this case, because it only accepts hard labels, meaning I can't give it a vector of probabilities as the target; and nn.KLDivLoss does not have a weight option. – yoni peis Aug 24 '21 at 16:04

2 Answers


(from my answer under another post)

PyTorch CrossEntropyLoss Now Supports Soft Labels Natively

Thanks to the PyTorch team, I believe this problem has been solved with the current version of torch.nn.CrossEntropyLoss (PyTorch 1.10 and later).
You can directly pass probabilities for each class as the target (see the doc).

Here is the forum discussion that pushed this enhancement.
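
A minimal sketch of this approach, assuming PyTorch 1.10 or later; the tensor shapes and class weights below are illustrative:

import torch
import torch.nn as nn

# Batch of 4 samples, 3 classes (illustrative shapes).
logits = torch.randn(4, 3, requires_grad=True)
# Soft targets: each row is a probability distribution over the classes.
target = torch.softmax(torch.randn(4, 3), dim=1)
# Optional per-class rescaling weights, passed via the weight argument.
class_weights = torch.tensor([1.0, 2.0, 0.5])

criterion = nn.CrossEntropyLoss(weight=class_weights)
loss = criterion(logits, target)  # probability targets are accepted directly
loss.backward()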

– Lucecpkn

According to your comment, you are looking to implement a weighted cross-entropy loss with soft labels. Indeed nn.CrossEntropyLoss only works with hard labels, since the target is provided as class indices (a single class label per instance) rather than as a distribution over classes.

You can implement the function yourself, though. I originally implemented this kind of function in this answer, but there wasn't any weighting on the classes. Here instead we take the following three arguments:

  • logits: your unscaled predictions,
  • weights: the per-class weights, and
  • labels: your target tensor.

We have the following loss term:

>>> import torch.nn.functional as F
>>> p = F.log_softmax(logits, dim=1)
>>> w_labels = weights * labels
>>> loss = -(w_labels * p).sum() / w_labels.sum()

As long as you operate with differentiable PyTorch builtins, you should be able to call the backward pass from your custom loss's output. In any case, you can always verify whether a backward pass can be called from a given tensor by checking whether it has a grad_fn attribute.
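
For example, continuing from the snippet above (the exact grad_fn shown here is illustrative):

>>> loss.grad_fn  # not None, so autograd can backpropagate from this tensor
<DivBackward0 object at 0x7f...>
>>> loss.backward()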


You can wrap the logic inside an nn.Module:

import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftCrossEntropyLoss(nn.Module):
   def __init__(self, weights):
      super().__init__()
      self.weights = weights

   def forward(self, y_hat, y):
      # y_hat: unscaled logits, y: soft (probability) targets
      p = F.log_softmax(y_hat, dim=1)
      w_labels = self.weights * y
      loss = -(w_labels * p).sum() / w_labels.sum()
      return loss
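
A minimal usage sketch, assuming 4 samples and 3 classes (the shapes and weight values are illustrative):

>>> weights = torch.tensor([1.0, 2.0, 0.5])        # one weight per class
>>> criterion = SoftCrossEntropyLoss(weights)
>>> y_hat = torch.randn(4, 3, requires_grad=True)  # model logits
>>> y = torch.softmax(torch.randn(4, 3), dim=1)    # soft targets
>>> loss = criterion(y_hat, y)
>>> loss.backward()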
– Ivan