
I have an image dataset with soft labels (i.e. the images don't belong to a single class; instead, I have a probability distribution saying, for example, that there's a 66% chance an image belongs to one class and a 33% chance it belongs to some other class).

I am struggling to figure out how to set up my PyTorch code so that the model can represent these distributions and output them correctly. The probabilities are saved in a CSV file. I have looked at the PyTorch docs and other resources that mention the cross-entropy loss function, but I am still unclear on how to import the data and make use of soft labels.

logankilpatrick

2 Answers


What you are trying to solve is a multi-label classification task, i.e. instances can be classified with more than one label at a time. You cannot use nn.CrossEntropyLoss since it only allows for single-label targets. So you have two options:

  • Either use a soft version of the nn.CrossEntropyLoss function. This can be done by implementing the loss by hand, allowing for soft targets. You can find such an implementation in the Soft Cross Entropy in PyTorch thread (see the first sketch after this list).

  • Or treat the task as multiple "independent" binary classification tasks; in this case, you would use nn.BCEWithLogitsLoss (this loss applies a sigmoid internally, so feed it raw logits; see the second sketch after this list).
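
A minimal sketch of the first option, assuming a hand-rolled loss (the name soft_cross_entropy is illustrative, not a library API):

```python
import torch
import torch.nn.functional as F

def soft_cross_entropy(logits, soft_targets):
    # Cross entropy against a full probability distribution:
    # -sum_c p_c * log_softmax(logits)_c, averaged over the batch.
    log_probs = F.log_softmax(logits, dim=1)
    return -(soft_targets * log_probs).sum(dim=1).mean()

logits = torch.randn(2, 3)                    # raw model outputs: batch of 2, 3 classes
targets = torch.tensor([[0.67, 0.33, 0.00],   # soft labels; each row sums to 1
                        [0.00, 0.50, 0.50]])
loss = soft_cross_entropy(logits, targets)
```

And a sketch of the second option: nn.BCEWithLogitsLoss accepts float targets in [0, 1], treating each class as its own binary problem.

```python
import torch
import torch.nn as nn

criterion = nn.BCEWithLogitsLoss()
logits = torch.randn(2, 3)                    # no sigmoid applied; the loss does it internally
targets = torch.tensor([[0.67, 0.33, 0.00],
                        [0.00, 0.50, 0.50]])
loss = criterion(logits, targets)
```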

Ivan
  • I think his task is still a single-label classification task as long as the probabilities sum to one. – Huxwell Jan 23 '23 at 17:58

PyTorch CrossEntropyLoss Supports Soft Labels Natively Now

Thanks to the PyTorch team, I believe this problem has been solved in current versions of torch.nn.CrossEntropyLoss (since v1.10):
you can directly pass probabilities for each class as the target (see the doc, and the example below).
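
A minimal sketch of this, assuming PyTorch 1.10 or later, where the target may be a float tensor of class probabilities with the same shape as the input:

```python
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(2, 3)                    # raw model outputs: batch of 2, 3 classes
targets = torch.tensor([[0.67, 0.33, 0.00],   # per-class probabilities instead of class indices
                        [0.00, 0.50, 0.50]])
loss = criterion(logits, targets)
```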

Here is the forum discussion that prompted this enhancement.

Lucecpkn