My dataset consists of very high-dimensional vectors. Each data point is mostly zeros, with only ~3% of the features equal to 1. In other words, the data is extremely sparse. I am trying to train an autoencoder on it, but the model just learns to reconstruct vectors of all zeros.

Are there any techniques to prevent this? I tried replacing mean squared error with Dice loss, but then the model stopped learning entirely. My other thought is to use a loss function that rewards predicting the 1s correctly more than the 0s. I have also tried both a sigmoid and a linear activation on the last layer, with no clear winner. Any ideas would be appreciated.
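
To make the "favor the 1s" idea concrete, here is a minimal framework-free sketch of a weighted binary cross-entropy. The function name `weighted_bce` and the `pos_weight` value are assumptions for illustration, not something taken from an existing model. It also assumes a sigmoid output, so predictions are probabilities.

```python
import math

def weighted_bce(y_true, y_pred, pos_weight=30.0, eps=1e-7):
    """Mean binary cross-entropy where errors on 1s cost pos_weight times more.

    y_true: list of 0/1 targets; y_pred: list of probabilities in (0, 1).
    pos_weight is an assumed hyperparameter; (#zeros / #ones), roughly 32
    for data with ~3% ones, is one plausible starting point.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(pos_weight * t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

# Toy vector with ~3% ones, and two hypothetical reconstructions of it.
target = [1] + [0] * 32
all_zeros = [0.01] * 33          # predicts "zero" everywhere
finds_one = [0.9] + [0.01] * 32  # recovers the single 1

# How much the loss rewards recovering the 1, without and with weighting.
plain_gap = (weighted_bce(target, all_zeros, pos_weight=1.0)
             - weighted_bce(target, finds_one, pos_weight=1.0))
weighted_gap = weighted_bce(target, all_zeros) - weighted_bce(target, finds_one)
```

With `pos_weight=1.0` this reduces to ordinary binary cross-entropy. A larger weight widens the gap between the all-zeros reconstruction and one that recovers the 1, which is the incentive I am looking for.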

PDPDPDPD

1 Answer

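It seems like you are facing a severe "class imbalance" problem: with only ~3% ones, predicting all zeros is already correct for ~97% of the features, so an unweighted loss gives the model little incentive to get the 1s right.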

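  1. Have a look at focal loss. This loss is designed for binary classification with severe class imbalance: it down-weights easy, already well-classified examples so that the rare, hard ones dominate the gradient.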

  2. Consider "hard negative mining": that is, propagate gradients only for part of the training examples, namely the "hard" ones.
    See, e.g.:
    Abhinav Shrivastava, Abhinav Gupta and Ross Girshick, "Training Region-based Object Detectors with Online Hard Example Mining" (CVPR 2016).
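
Both suggestions can be sketched in plain Python, as below. This is only an illustration under assumptions: the helper names `focal_loss_per_element` and `hard_example_mean`, the `gamma`/`alpha` defaults, and `keep_fraction` are all hypothetical choices you would need to tune. The selection step is a simplified stand-in for the online hard example mining described in the paper. In a real training loop you would compute the same quantities with your framework's tensor ops, so that gradients flow only through the selected elements.

```python
import math

def focal_loss_per_element(y_true, y_pred, gamma=2.0, alpha=0.25, eps=1e-7):
    """Per-element focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    p_t is the probability the model assigns to the true class, so confident,
    correct predictions (p_t near 1) are scaled down towards zero.
    """
    losses = []
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)             # clamp to avoid log(0)
        p_t = p if t == 1 else 1.0 - p
        alpha_t = alpha if t == 1 else 1.0 - alpha  # class-balancing weight
        losses.append(-alpha_t * (1.0 - p_t) ** gamma * math.log(p_t))
    return losses

def hard_example_mean(losses, keep_fraction=0.25):
    """Average only the largest losses (simplified hard example mining)."""
    k = max(1, int(len(losses) * keep_fraction))
    hardest = sorted(losses, reverse=True)[:k]
    return sum(hardest) / k

# Toy example: one missed 1, two easy 0s, one 0 wrongly predicted as likely 1.
target = [1, 0, 0, 0]
pred = [0.1, 0.05, 0.05, 0.6]

losses = focal_loss_per_element(target, pred)
mined = hard_example_mean(losses, keep_fraction=0.5)  # keeps the 2 hardest
```

Here the two easy zeros contribute almost nothing to `losses`, while the missed 1 and the false positive dominate. `hard_example_mean` then averages only those two.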

Shai