10

I want to perform a multilabel image classification task for n classes. I've got sparse label vectors for each image and each dimension of each label vector is currently encoded in this way:

1.0 -> Label true / Image belongs to this class

-1.0 -> Label false / Image does not belong to this class

0.0 -> Missing value/label

E.g.: V = {1.0, -1.0, 1.0, 0.0}

For this example V, the model should learn that the corresponding image belongs to the first and third classes.

My problem is currently how to handle the missing values/labels. I've searched through the issues and found this one: tensorflow/skflow#113 ([here](https://github.com/tensorflow/skflow/issues/113)).

So I could do multilabel image classification with `tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)`,

but TensorFlow also has this loss function for sparse softmax, which is used for exclusive (single-label) classification: `tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels, name=None)`.

So is there something like a sparse sigmoid cross entropy? (I couldn't find one.) Or any suggestions on how I can handle my multilabel classification problem with sparse labels?
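
For reference, here is roughly how I understand the two functions above would be used, matching the signatures I quoted (the placeholder shapes and names are just for illustration):

```python
import tensorflow as tf

num_classes = 4
logits = tf.placeholder(tf.float32, shape=[None, num_classes])

# Multilabel case: one independent sigmoid per class, float targets in {0, 1}.
multilabel_targets = tf.placeholder(tf.float32, shape=[None, num_classes])
multilabel_loss = tf.nn.sigmoid_cross_entropy_with_logits(logits, multilabel_targets)

# Exclusive (single-label) case: one integer class index per example.
class_indices = tf.placeholder(tf.int64, shape=[None])
exclusive_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, class_indices)
```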

Kashyap
ZCDEV
  • I went through the issue that you mentioned in your question (tensorflow/skflow#113, found [here](https://github.com/tensorflow/skflow/issues/113)). The post ends pretty inconclusively. However, I have done multilabel classification before and used the `tf.nn.sigmoid_cross_entropy_with_logits()` function for it. But I am not sure why you need the -1.0, or what you mean by a missing value. This -1 can have a lot of implications in deriving the gradients for the sigmoid cross entropy. – Kashyap Sep 26 '16 at 11:00

3 Answers

2

I used `weighted_cross_entropy_with_logits` as the loss function, with positive weights for the 1s.

In my case, all the labels are equally important, but 0 was ten times more likely to appear as the value of any given label than 1.

So I weighted all the 1s via the `pos_weight` parameter of that loss function, using a `pos_weight` (= weight on positive values) of 10. By the way, I cannot recommend a general strategy for calculating `pos_weight`; I think it depends entirely on the data at hand.

If the real label = 1: weighted_cross_entropy = pos_weight * sigmoid_cross_entropy

Weighted cross entropy with logits is the same as sigmoid cross entropy with logits, except that the loss term for every target with a positive real value (i.e. 1) is multiplied by the extra weight.

Theoretically, it should do the job. I am still tuning other parameters to optimize performance, and will update with performance statistics later.
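
A minimal sketch of what I mean (TF 1.x-style API; the placeholder shapes and the value 10 are just my illustration, not a recommendation):

```python
import tensorflow as tf

num_classes = 4
logits = tf.placeholder(tf.float32, shape=[None, num_classes])
targets = tf.placeholder(tf.float32, shape=[None, num_classes])  # 0.0 or 1.0 per class

# pos_weight > 1 penalizes missed 1s more heavily, compensating for
# 0s being about ten times more frequent than 1s in my data.
per_label_loss = tf.nn.weighted_cross_entropy_with_logits(
    targets=targets, logits=logits, pos_weight=10.0)
loss = tf.reduce_mean(per_label_loss)
```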

0

First, I would like to know what you mean by missing data. What is the difference between missing and false in your case?

Next, I think it is wrong to represent your data like this. You are trying to represent unrelated pieces of information on the same dimension. (If it were only false or true, it would work.)

What seems better to me is to represent, for each of your classes, a probability for each state: true, missing, or false.

In your case: V = [(1, 0, 0), (0, 0, 1), (1, 0, 0), (0, 1, 0)]
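
A small sketch of how you could convert the original {1.0, -1.0, 0.0} encoding into these (true, missing, false) triples (plain Python, names are illustrative):

```python
def to_state_triples(v):
    """Map each label value to a (true, missing, false) triple."""
    mapping = {1.0: (1, 0, 0),    # true
               0.0: (0, 1, 0),    # missing
               -1.0: (0, 0, 1)}   # false
    return [mapping[x] for x in v]

V = [1.0, -1.0, 1.0, 0.0]
print(to_state_triples(V))
# [(1, 0, 0), (0, 0, 1), (1, 0, 0), (0, 1, 0)]
```

Each class could then get its own three-way prediction (e.g. a softmax over the three states), though that is just one way to use this representation.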

rAyyy
0

OK! So your problem is more about how to handle the missing data, I think.

So I think you should definitely use `tf.nn.sigmoid_cross_entropy_with_logits()`.

Just change the target for the missing data to 0.5 (0 for false and 1 for true). I have never tried this approach, but it should let your network learn without biasing it too much.
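
A sketch of that target mapping (NumPy; the array names are just illustrative):

```python
import numpy as np

# Original encoding: 1.0 = true, -1.0 = false, 0.0 = missing.
labels = np.array([[1.0, -1.0, 1.0, 0.0]])

# Targets for the sigmoid loss: true -> 1.0, false -> 0.0, missing -> 0.5.
targets = np.where(labels == 1.0, 1.0,
                   np.where(labels == -1.0, 0.0, 0.5))
print(targets)  # [[1.  0.  1.  0.5]]
```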

rAyyy
  • This didn't work well with sparse data. The loss went down nicely, but accuracy was nil, because the model got good at predicting which labels/classes are not present rather than which are present. False negatives. – rusty Dec 20 '16 at 09:07