I am using a BatchNorm layer. I understand what use_global_stats means: it is normally set to false during training and to true for testing/deployment. This is my setting in the testing phase:
layer {
  name: "bnorm1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "bnorm1"
  batch_norm_param {
    use_global_stats: true
  }
}
layer {
  name: "scale1"
  type: "Scale"
  bottom: "bnorm1"
  top: "bnorm1"
  scale_param {
    bias_term: true   # bias_term is a ScaleParameter field, so it goes inside scale_param
    filler {
      value: 1
    }
    bias_filler {
      value: 0.0
    }
  }
}
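For reference, during training the same BatchNorm layer is typically written with use_global_stats: false; a minimal sketch (the three lr_mult: 0 param blocks follow the usual convention of keeping the solver from updating the statistics blobs, and are not copied from my actual file):

layer {
  name: "bnorm1"
  type: "BatchNorm"
  bottom: "conv1"
  top: "bnorm1"
  # mean, variance and moving-average factor are accumulated, not learned by the solver
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  param { lr_mult: 0 }
  batch_norm_param {
    use_global_stats: false
  }
}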
In solver.prototxt I use the Adam solver (settings sketched at the end of this post). I ran into an interesting problem in my case. If I choose base_lr: 1e-3, I get good performance when I set use_global_stats: false in the testing phase. However, if I choose base_lr: 1e-4, I get good performance when I set use_global_stats: true in the testing phase. Does this mean that base_lr affects the BatchNorm behaviour (even though I am using Adam)? Could you suggest any reason for that? Thanks, all.
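For reference, the relevant part of a solver.prototxt using Adam would look roughly like this (a sketch; only type and base_lr correspond to the settings discussed above, the remaining values are typical Adam defaults and an assumption on my part):

# solver.prototxt (sketch)
type: "Adam"
base_lr: 1e-4        # the run compared against base_lr: 1e-3 above
lr_policy: "fixed"
momentum: 0.9        # Adam beta1 (assumed default)
momentum2: 0.999     # Adam beta2 (assumed default)
delta: 1e-8          # Adam epsilon (assumed default)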