Each kernel in a convolution layer produces its own feature map, so applying multiple kernels generates multiple outputs. For example, if 5 kernels are applied to an image of dimension WxDx1 (1 channel), then 5 convolutions are applied to the data, which generates an output with 5 channels: WxDx1 becomes W'xD'x5, where W' and D' are smaller than W and D (assuming no padding).
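A minimal sketch of this shape change, assuming PyTorch (the framework and image dimensions here are my own illustration, not part of the question):

```python
import torch
import torch.nn as nn

W, D = 28, 28                      # hypothetical image dimensions
conv = nn.Conv2d(in_channels=1,    # grayscale input: W x D x 1
                 out_channels=5,   # 5 kernels -> 5 feature maps
                 kernel_size=3)    # 3x3 kernels, no padding

x = torch.randn(1, 1, W, D)        # a batch containing one image
y = conv(x)
print(y.shape)                     # torch.Size([1, 5, 26, 26]): W' x D' x 5
```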
Is it the fact that each kernel is initialised to different values that prevents each kernel from learning the same parameters? If not, what prevents each kernel from learning the same parameters?
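To illustrate the premise of the question, a small sketch (again assuming PyTorch) showing that each of the 5 kernels starts from different random values by default:

```python
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=5, kernel_size=3)
w = conv.weight        # shape (5, 1, 3, 3): one 3x3 kernel per output channel
print(w[0])            # initial values of kernel 0...
print(w[1])            # ...differ from kernel 1 (default initialisation is random)
```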
If the image is RGB instead of grayscale, so its dimension is WxDx3 instead of WxDx1, does this impact how the kernels learn patterns?
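For concreteness, a sketch (PyTorch assumed, as above) of how the kernel shape changes for RGB input: each kernel now spans all 3 input channels, while the number of output feature maps stays the same.

```python
import torch
import torch.nn as nn

conv_gray = nn.Conv2d(in_channels=1, out_channels=5, kernel_size=3)
conv_rgb  = nn.Conv2d(in_channels=3, out_channels=5, kernel_size=3)

print(conv_gray.weight.shape)      # torch.Size([5, 1, 3, 3])
print(conv_rgb.weight.shape)       # torch.Size([5, 3, 3, 3]): each kernel is 3x3x3

x = torch.randn(1, 3, 28, 28)      # one RGB image
print(conv_rgb(x).shape)           # torch.Size([1, 5, 26, 26]): still 5 feature maps
```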