If I have an image that is WxHx3 (RGB), how do I decide how large to make the filter masks? Is the filter size a function of the image dimensions (W and H) or of something else? How do the dimensions of the second, third, ... filters compare to the dimensions of the first filter? (Any concrete pointers would be appreciated.)
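To make the question concrete, here is a minimal sketch of the shape bookkeeping involved. It does not say how to choose the sizes. The kernel sizes (5, 3, 3) and filter counts (16, 32, 64) are arbitrary assumptions picked for illustration, and `conv_output_shape` and `filter_shapes` are hypothetical helper names, not functions from any library. The only facts it relies on are the standard output-size formula for a convolution and the rule that each filter's depth must match the channel count of its input.

```python
def conv_output_shape(width, height, kernel, stride=1, padding=0):
    """Spatial size of a convolution output (standard floor formula)."""
    out_w = (width + 2 * padding - kernel) // stride + 1
    out_h = (height + 2 * padding - kernel) // stride + 1
    return out_w, out_h


def filter_shapes(in_channels, layers):
    """Return (kernel_h, kernel_w, depth, num_filters) for each layer.

    `layers` is a list of (kernel_size, num_filters) pairs. These are
    illustrative choices, not recommended values.
    """
    shapes = []
    depth = in_channels
    for kernel, num_filters in layers:
        shapes.append((kernel, kernel, depth, num_filters))
        # The next layer's filters must span all feature maps produced here.
        depth = num_filters
    return shapes


if __name__ == "__main__":
    # Assumed example: a 32x32 RGB image and three square-kernel layers.
    layers = [(5, 16), (3, 32), (3, 64)]
    for shape in filter_shapes(3, layers):
        print(shape)
    print(conv_output_shape(32, 32, kernel=5))
```

In this sketch the filter height and width do not depend on W and H. Only the filter depth is forced, by the number of channels coming from the previous layer. How to pick the kernel size itself is the part that remains open.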
I have seen the following questions, but they don't answer mine:
Dimensions in convolutional neural network
Convolutional Neural Networks: How many pixels will be covered by each of the filters?
How do you decide the parameters of a Convolutional Neural Network for image classification?