I was building a neural network model, and my question is: does the ordering of the Dropout and Batch Normalization layers actually affect the model? Will putting the Dropout layer before the Batch Normalization layer (or vice versa) make any difference to the output of the model if I am using the ROC-AUC score as my metric?

I expect the output to have a large ROC-AUC score, and I want to know whether it will be affected in any way by the ordering of the layers.

desertnaut
Rahul
  • Does this answer your question? [Ordering of batch normalization and dropout?](https://stackoverflow.com/questions/39691902/ordering-of-batch-normalization-and-dropout) – Union find Nov 06 '20 at 16:15

1 Answer

The order of the layers affects the convergence of your model, and hence your results. Based on the Batch Normalization paper, the authors suggest that Batch Normalization should be applied before the activation function, while Dropout is applied after computing the activations. The right order of layers is therefore:

  • Dense or Conv
  • Batch Normalization
  • Activation
  • Dropout
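
For intuition on why Dropout sits last: it zeroes a random subset of the already-normalized activations and rescales the survivors, so it does not disturb the statistics Batch Normalization has just computed (whereas dropout applied before the normalization would inject noise into the batch statistics). A minimal NumPy sketch of training-time inverted dropout, with a hypothetical helper name:

```python
import numpy as np

def dropout(x, rate=0.25, rng=np.random.default_rng(0)):
    # Inverted dropout: zero each unit with probability `rate` and
    # scale the survivors by 1/(1 - rate), so the expected value of
    # the output matches the input.
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

a = np.ones((10000, 8))        # activations after BN + activation
d = dropout(a, rate=0.25)
# Roughly 25% of entries are zeroed; the survivors are scaled up,
# so the overall mean stays close to 1.
```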

In code using Keras, here is how you write it sequentially:

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization, Activation, Dropout

model = Sequential()
model.add(Dense(n_neurons, input_shape=your_input_shape, use_bias=False))  # disable the bias: BatchNormalization's shift parameter makes it redundant
model.add(BatchNormalization())
model.add(Activation('relu'))  # for example
model.add(Dropout(rate=0.25))

Batch Normalization helps to avoid vanishing/exploding gradients when training your model, so it is especially important if you have many layers. You can read the Batch Normalization paper for more details.
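
As a rough illustration of what the BatchNormalization layer computes at training time (a minimal NumPy sketch of the transform from the paper; `gamma` and `beta` stand in for the learned scale and shift):

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize each feature over the batch dimension (axis 0),
    # then apply the learned scale (gamma) and shift (beta).
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mean) / np.sqrt(var + eps)
    return gamma * x_hat + beta

# Pre-activations with an arbitrary mean and scale:
x = np.random.default_rng(0).normal(size=(128, 4)) * 10 + 3
y = batch_norm(x)
# After normalization, each feature has ~zero mean and ~unit variance,
# which keeps gradients in a well-behaved range across many layers.
```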

Coderji