Confusion about model.train()

Question

I am a beginner in PyTorch. I saw on GitHub that some deep learning models call model.train() and some don't, yet they all run normally. Is model.train() necessary, and what effect does it have?

score 2 · Answer 1 · answered Sep 07 '20 at 12:25

train and its counterpart eval switch the model between training and evaluation mode.

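The mode changes how certain layers behave in the forward pass. In training mode, Dropout randomly zeroes activations, and BatchNorm normalizes with the current batch statistics and updates its running averages. In evaluation mode, Dropout is disabled and BatchNorm uses its stored running statistics. The mode does not decide whether gradients are tracked; that is controlled separately, for example with torch.no_grad().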

score 1 · Accepted Answer · answered Sep 07 '20 at 13:06

Train mode versus eval mode only matters when the model contains modules that behave asymmetrically in training and testing (e.g. BatchNorm, Dropout). I would like to emphasize that it does not affect gradient computation or accumulation at all. Even with asymmetrical modules, one can perfectly well train a model in eval mode. Some do this to save memory when training on top of a pretrained ImageNet model.
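A minimal sketch of both points, assuming a hypothetical toy model (the layer sizes, the dropout probability and the variable names are illustrative, not taken from the question). It checks that Dropout makes the output differ between two forward passes in training mode but not in eval mode, and that backward() still populates gradients in eval mode:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model; Dropout is the mode-dependent module here.
model = nn.Sequential(nn.Linear(4, 8), nn.Dropout(p=0.5), nn.Linear(8, 1))
x = torch.randn(16, 4)

# Training mode: Dropout zeroes random activations, so two passes differ.
model.train()
train_outputs_differ = not torch.equal(model(x), model(x))

# Evaluation mode: Dropout is a no-op, so repeated passes agree.
model.eval()
eval_outputs_match = torch.equal(model(x), model(x))

# Gradients are still computed in eval mode; the mode does not touch autograd.
model.zero_grad()
model(x).sum().backward()
grads_in_eval = all(p.grad is not None for p in model.parameters())

# Gradient tracking is switched off separately, e.g. with torch.no_grad().
with torch.no_grad():
    no_grad_output_requires_grad = model(x).requires_grad

print(train_outputs_differ, eval_outputs_match, grads_in_eval,
      no_grad_output_requires_grad)
```

If the goal at inference time is to skip gradient bookkeeping, model.eval() and torch.no_grad() are usually combined, since each controls a different thing.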

If you don't have any asymmetrical modules, it does not matter at all.
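As a quick check of this, here is a small sketch with a hypothetical model made only of Linear and ReLU layers (sizes chosen arbitrarily for illustration). Neither layer type depends on the mode, so the outputs in train and eval mode should agree:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical model with no mode-dependent modules (no Dropout, no BatchNorm).
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(16, 4)

model.train()
out_train = model(x)

model.eval()
out_eval = model(x)

# The forward computation is identical in both modes.
same_output = torch.allclose(out_train, out_eval)
print(same_output)
```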

By default, all modules start with training=True.
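This can be inspected through the training attribute. The sketch below uses a hypothetical three-layer model (chosen only for illustration) and shows that eval() and train() flip the flag on the model and on every submodule:

```python
import torch.nn as nn

# Hypothetical model; any nn.Module behaves the same way.
model = nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.Dropout(p=0.5))

# A freshly constructed module is in training mode.
default_flag = model.training

# eval() sets training=False on the container and on each child module.
model.eval()
flags_after_eval = [m.training for m in model.modules()]

# train() sets the flag back to True and returns the module itself.
returned = model.train()
flags_after_train = [m.training for m in model.modules()]

print(default_flag, flags_after_eval, flags_after_train)
```

This is why code that never calls model.train() still trains normally: the model is already in training mode unless eval() was called earlier.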
