I am running a Cloud ML Engine job and my TensorBoard plots show the fraction of zero values for my hidden layers steadily increasing toward 1 as the number of steps increases. How should this plot be interpreted? I believe it is a good thing, as more zero values would suggest that the model is getting more "certain" about the predictions it is making.
It generally means that your regularization technique and/or activation function is forcing activations to zero. You haven't shared details of your model, but this is common when using dropout, especially combined with ReLU activation functions.
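For intuition, here is a minimal NumPy sketch of how each of those two mechanisms pushes the metric up. It assumes the plotted quantity is simply the mean fraction of exactly-zero entries in a layer's activation tensor (which is what a summary built on `tf.nn.zero_fraction` reports); the layer sizes and keep probability below are arbitrary illustrative choices, not your model's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Pre-activations of one hidden layer: batch of 128 examples, 256 units.
z = rng.normal(size=(128, 256))

# ReLU zeroes every negative pre-activation (~50% here, since z is centered at 0).
relu_out = np.maximum(z, 0.0)
print("zero fraction after ReLU:", np.mean(relu_out == 0.0))  # ~0.5

# Dropout (in training mode) zeroes a further random subset of the survivors.
keep_prob = 0.5
mask = rng.random(relu_out.shape) < keep_prob
dropout_out = np.where(mask, relu_out / keep_prob, 0.0)
print("zero fraction after ReLU + dropout:", np.mean(dropout_out == 0.0))  # ~0.75
```

If the summary op is attached after the dropout layer, you would expect the plotted fraction to sit well above the ReLU-only baseline during training.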
Models with lots of zero activations tend to generalize better and therefore give better accuracy.
If you want more details, here's the JMLR paper on dropout (Srivastava et al., 2014, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting").
I do have to note that having activations go to zero is sometimes bad, at least for ReLU activation functions: units can irreversibly "die". So if you are seeing poor model quality, beware. More information here.
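As a rough illustration of that failure mode (a hand-rolled single ReLU unit trained with squared error, not your actual model): once the bias has been pushed negative enough that the unit outputs zero for every input in the data, the gradient through the ReLU is zero everywhere, so gradient descent can never move it back.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=(1000, 4))          # inputs
y = rng.normal(size=1000)               # arbitrary regression targets

w, b = rng.normal(size=4), -50.0        # bias knocked far negative (e.g. by one bad update)

for step in range(100):
    z = x @ w + b                       # pre-activation: < 0 for every example
    a = np.maximum(z, 0.0)              # ReLU output: all zeros
    grad_a = 2 * (a - y) / len(y)       # dL/da for squared error
    grad_z = grad_a * (z > 0)           # ReLU gradient is 0 wherever z <= 0
    w -= 0.1 * (x.T @ grad_z)           # both updates are exactly zero...
    b -= 0.1 * grad_z.sum()             # ...so the unit never recovers

print("fraction of zero activations:", np.mean(np.maximum(x @ w + b, 0.0) == 0.0))  # 1.0
```

A zero fraction pinned at exactly 1 for a whole layer, together with stalled loss, is the signature to watch for; with dropout in the mix, a high-but-fluctuating fraction is expected and benign.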

rhaertel80
- This is very helpful. Thanks for the information -- I think in my case I haven't introduced dropout yet, but I am using ReLU activation functions, so it is interesting to think about from this perspective. – reese0106 Aug 23 '17 at 19:16
- Why does dropout combined with ReLU increase the fraction of zeros? Dropout cuts some connections, but it will never cut all incoming connections to a given node. What mechanism is at play here? – max Apr 05 '19 at 10:19