Why does batch normalization at test time use the mean and variance of the whole training data? Is that to keep the distribution consistent? What is the difference in the BN layer between model.train() and model.eval()?
It fixes the mean and var computed during the training phase by keeping running estimates of them in running_mean and running_var. See the PyTorch documentation.
As noted there, the implementation is based on the description in the paper Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. Because the whole training set is used, one gets (assuming the training and test data are similarly distributed) a better estimate of the mean/variance for the unseen test set than a single test batch could provide.
Similar questions have also been asked here: What does model.eval() do?
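To make the train/eval difference concrete, here is a minimal sketch of the mechanism in plain NumPy. This is a hypothetical simplified reimplementation for illustration, not the actual PyTorch class (PyTorch additionally learns affine parameters and uses an unbiased variance for the running update): in training mode it normalizes with the current batch's own statistics and updates the running estimates; in eval mode it uses the stored running estimates instead.

```python
import numpy as np

class SketchBatchNorm1d:
    """Simplified batch norm illustrating running_mean / running_var."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        # running estimates accumulated over the training phase
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.training = True  # toggled by train()/eval() in PyTorch

    def __call__(self, x):
        if self.training:
            # use the batch's own statistics ...
            mean = x.mean(axis=0)
            var = x.var(axis=0)
            # ... and update the exponential moving averages
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            # eval mode: normalize with the estimates gathered in training
            mean = self.running_mean
            var = self.running_var
        return (x - mean) / np.sqrt(var + self.eps)
```

After many training batches drawn from the data distribution, running_mean/running_var converge toward the population statistics, so eval-mode normalization no longer depends on the composition of the test batch.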

– ScreamingEagle
- Why use the variance and mean of the training dataset instead of the variance and mean of the testing dataset? – 009 Oct 24 '22 at 11:11
- Assuming the training data is a larger set that should represent the real (test) data, it is better to use it. Suppose you treat patients in a hospital: once you've finished training your model, you treat each patient one by one. Why would you then use test data to normalize? The point is to normalize in the way the model benefited from most during the training phase. – ScreamingEagle Oct 24 '22 at 21:03
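The one-patient-at-a-time scenario above also shows why test-batch statistics can be degenerate. A sketch with made-up feature values (the running statistics below are hypothetical, standing in for what training would have accumulated):

```python
import numpy as np

eps = 1e-5
x = np.array([[4.2, 7.1, 0.3]])  # a single test sample (one "patient")

# Normalizing with the batch's own statistics: for a batch of one,
# mean == x and var == 0, so the output collapses to all zeros and
# every feature value is erased.
batch_out = (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

# Normalizing with running statistics from training (hypothetical values)
# preserves where this sample sits relative to the training distribution.
running_mean = np.array([4.0, 7.0, 0.0])
running_var = np.array([1.0, 2.0, 0.5])
eval_out = (x - running_mean) / np.sqrt(running_var + eps)

print(batch_out)  # all zeros: the single sample carries no information
print(eval_out)   # a meaningful normalized vector
```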