I found in other questions that the standard way to do L2 regularization in convolutional networks with TensorFlow is the following. For each conv2d layer, set the kernel_regularizer parameter to an l2_regularizer, like this:
    import tensorflow as tf

    regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
    layer2 = tf.layers.conv2d(
        inputs,
        filters,
        kernel_size,
        kernel_regularizer=regularizer)
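As far as I can tell, the regularizer argument is just a function mapping a weight tensor to a scalar penalty; a hand-rolled equivalent would look roughly like this (my_l2_regularizer is a hypothetical name, and I'm assuming the scale * sum(w**2) / 2 convention of tf.nn.l2_loss):

    import tensorflow as tf

    def my_l2_regularizer(scale=0.1):
        """Hypothetical stand-in for tf.contrib.layers.l2_regularizer."""
        def penalty(weights):
            # tf.nn.l2_loss computes sum(w ** 2) / 2
            return scale * tf.nn.l2_loss(weights)
        return penalty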
Then, in the loss function, collect the regularization losses:

    reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    reg_constant = 0.01  # Choose an appropriate value; note that scale=0.1 above already weights each term.
    loss = my_normal_loss + reg_constant * tf.add_n(reg_losses)
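If I'm reading the TF 1.x API right, there is also a helper that sums this collection for you, which should be equivalent:

    reg_loss = tf.losses.get_regularization_loss()
    loss = my_normal_loss + reg_constant * reg_loss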
Many people, myself included, have made the mistake of skipping the second step, which suggests that the meaning of kernel_regularizer is not well understood. I have an assumption that I can't confirm:
By setting kernel_regularizer for a single layer, you are telling the network to forward that layer's kernel weights to the loss function at the end of the network, so that later you have the option (via another piece of code you write) to include them in the final regularization term of the loss function. Nothing more.
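If that reading is right, it should be observable in the graph: the penalty tensor exists but stays disconnected from every loss until you wire it in yourself. Here is a minimal check, assuming TF 1.x graph mode (the printed tensor name is just my guess):

    import tensorflow as tf

    regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
    inputs = tf.placeholder(tf.float32, [None, 28, 28, 1])
    layer = tf.layers.conv2d(inputs, filters=32, kernel_size=3,
                             kernel_regularizer=regularizer)

    # If the assumption holds, conv2d has called the regularizer on its
    # kernel and stashed the resulting scalar tensor here; nothing more:
    print(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))
    # e.g. [<tf.Tensor 'conv2d/kernel/Regularizer/mul:0' shape=() dtype=float32>]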
Is this correct, or is there a better explanation?