
If I have an image which is WxHx3 (RGB), how do I decide how big to make the filter masks? Is it a function of the dimensions (W and H) or of something else? How do the dimensions of the second, third, ... filters compare to the dimensions of the first filter? (Any concrete pointers would be appreciated.)

I have seen the following, but they don't answer the question.

Dimensions in convolutional neural network

Convolutional Neural Networks: How many pixels will be covered by each of the filters?

How do you decide the parameters of a Convolutional Neural Network for image classification?


1 Answer


It would help if you added details about what you are trying to extract from the image and about the dataset you are using.

General guidance about filter mask sizes can be drawn from AlexNet and ZFNet. There is no specific formula for which size to choose for a particular input; the rule of thumb is to keep the filter size small when a deeper analysis is required, because many smaller details can be missed with larger filters. The Inception (GoogLeNet) paper describes how to utilize computing resources effectively with such a design. If resources are not an issue, the layer-by-layer visualizations in the ZFNet paper show how many finer details become visible in deeper layers. Note that a network counts as a CNN even with a single convolution layer and pooling layer; the number of layers depends on how fine the details you need to capture are.
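As a rough illustration (a sketch of my own, not code from any of the papers above), here is a minimal Keras model following the AlexNet pattern of a large first filter and progressively smaller later ones; the 224x224 input size and the layer widths are illustrative assumptions. Note that a filter's depth is always fixed by its input (3 channels for the first layer, then the previous layer's feature-map count), so only the spatial size of the filter is actually a design choice:

```python
# Minimal sketch, assuming TensorFlow 2.x and an illustrative 224x224 RGB input.
# Spatial output size per conv/pool layer: floor((in - F + 2P) / S) + 1.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(224, 224, 3)),  # W x H x 3 (RGB)
    # Large 11x11 first filter; each filter is really 11x11x3, since its
    # depth must match the 3 input channels. Output: floor((224-11)/4)+1 = 54.
    tf.keras.layers.Conv2D(96, kernel_size=11, strides=4, activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=3, strides=2),
    # Smaller 5x5 filters; depth is now 96 (the previous layer's output
    # maps), so each filter is 5x5x96.
    tf.keras.layers.Conv2D(256, kernel_size=5, padding="same", activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=3, strides=2),
    # 3x3 filters pick up finer detail on the already-shrunken feature maps.
    tf.keras.layers.Conv2D(384, kernel_size=3, padding="same", activation="relu"),
])

model.summary()  # prints the spatial size after each layer
```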

I am not an expert, but if your dataset is small (a few thousand examples), you do not need to extract many features, and you are unsure about the size, you can simply go with small filters (a small, popular choice is 5x5, as in LeNet-5).
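For concreteness, a minimal LeNet-5-style sketch with 5x5 filters throughout might look like the following; the 32x32x1 input is the classic LeNet-5 assumption, and for an RGB image you would only change the input shape to WxHx3 (the filter depths adapt automatically):

```python
# Minimal LeNet-5-style sketch with 5x5 filters throughout; the 32x32
# grayscale input and 10 output classes are the classic LeNet-5 setup.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 1)),
    tf.keras.layers.Conv2D(6, kernel_size=5, activation="tanh"),   # -> 28x28x6
    tf.keras.layers.AveragePooling2D(pool_size=2),                 # -> 14x14x6
    tf.keras.layers.Conv2D(16, kernel_size=5, activation="tanh"),  # -> 10x10x16
    tf.keras.layers.AveragePooling2D(pool_size=2),                 # -> 5x5x16
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation="tanh"),
    tf.keras.layers.Dense(84, activation="tanh"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```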
