Training Sparse Autoencoders

Question

My dataset consists of very high-dimensional vectors. Each data point is mostly zeros, with only ~3% of the features equal to 1. In other words, the data is extremely sparse. I am trying to train an autoencoder, but the model just learns to reconstruct vectors of all zeros.

Are there any techniques to prevent this? I tried replacing mean squared error with Dice loss, but then the model stopped learning completely. My other thought is to use a loss function that rewards predicting the 1s correctly more than the 0s. I have also tried both a sigmoid and a linear activation on the last layer, with no clear winner. Any ideas would be awesome.

score 1 · Answer 1 · answered Oct 15 '20 at 05:45

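It seems like you are facing a severe "class imbalance" problem: each output feature is effectively a binary prediction, and with ~97% of the targets equal to 0, predicting all zeros already yields a low loss.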

Have a look at focal loss. This loss is designed for binary classification with severe class imbalance.
Consider "hard negative mining": that is, propagate gradients only for a subset of the training examples, namely the "hard" ones (those with the highest loss).
See, e.g.:
Abhinav Shrivastava, Abhinav Gupta, and Ross Girshick, "Training Region-based Object Detectors with Online Hard Example Mining" (CVPR 2016).
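
To make the two suggestions concrete, here is a minimal NumPy sketch, not a definitive implementation. It assumes the autoencoder outputs per-feature probabilities (i.e. a sigmoid last layer) and that the targets are 0/1. The function names `focal_loss` and `hard_example_mean`, the `keep_fraction` parameter, and the choice to treat individual feature positions as the "examples" to mine are all assumptions made for illustration. The defaults `gamma=2.0` and `alpha=0.25` are the values commonly used with focal loss in object detection; with only ~3% ones you would likely want to tune `alpha` so that the positives are weighted more heavily.

```python
import numpy as np

def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Per-element binary focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs   : predicted probabilities in [0, 1] (assumed sigmoid outputs)
    targets : array of 0/1 labels with the same shape as probs
    """
    probs = np.clip(probs, eps, 1.0 - eps)           # avoid log(0)
    # p_t is the probability the model assigns to the true class
    p_t = np.where(targets == 1, probs, 1.0 - probs)
    # alpha weights the positive class, (1 - alpha) the negative class
    alpha_t = np.where(targets == 1, alpha, 1.0 - alpha)
    # (1 - p_t)**gamma shrinks the loss of easy, well-classified elements
    return -alpha_t * (1.0 - p_t) ** gamma * np.log(p_t)

def hard_example_mean(per_element_loss, keep_fraction=0.25):
    """Average only the largest losses (a simplified form of hard example mining).

    keep_fraction is a hypothetical knob: the share of elements that
    contribute to the final loss; the rest are ignored.
    """
    flat = np.sort(per_element_loss.ravel())
    k = max(1, int(keep_fraction * flat.size))
    return flat[-k:].mean()

# Toy usage: a batch with ~3% ones and a model that predicts "almost zero" everywhere.
rng = np.random.default_rng(0)
targets = (rng.random((8, 1000)) < 0.03).astype(float)
probs = np.full(targets.shape, 0.01)

losses = focal_loss(probs, targets)
print("mean focal loss:       ", losses.mean())
print("mean over hardest 25%: ", hard_example_mean(losses))
```

In an actual training loop the same selection would be applied to a differentiable loss tensor in your framework of choice, so that gradients flow only through the selected elements.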
