Why Conv2D has different number of filters in each layer

Question

I am learning from this example in the Keras documentation:

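from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense

model = Sequential()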
model.add(Conv2D(32, (3, 3), padding='same',    # why filter is 32?
                 input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))      # why filter is not changed?
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), padding='same'))     # why filter is changed to 64?
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(512))     # why Dense neurons is 512? not 1024? what's the rule to set the number?

Here are my questions:

Why does the first layer use 32 filters, and why does the number stay at 32 in the second Conv2D of the same block?
Why does the second block change the number of filters to 64? What is the rule for setting the number?
Why does the Dense layer have 512 neurons and not 1024? What is the rule for setting the number?

This link is really helpful: https://www.pyimagesearch.com/2018/12/31/keras-conv2d-and-convolutional-layers/. It discusses when to use a large number of filters and when to use a small number. – aminrd, Dec 06 '19 at 23:39

thushv89 · Accepted Answer · 2019-12-07T00:35:51.240

Why in the 1st layer filter is 32 and not changed in the 2nd place but still in 1st layer?

The number of filters can be any arbitrary number; more filters simply means more kernels in that layer. Each filter performs its own convolution across all channels of the input, so 32 filters perform 32 separate convolutions over the RGB channels of the input and produce 32 output channels.
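To make "separate convolutions" concrete, here is a minimal pure-Python sketch of what a convolution layer with several filters computes. It is not how Keras implements Conv2D; the helper names (conv2d_valid, random_filter) and the 8x8 RGB input are assumptions made for illustration, the bias term is left out, and 'valid' padding is used to keep the loop short.

```python
import random

def conv2d_valid(image, filters):
    """Slide every filter over an H x W x C image (no padding, stride 1).

    Like Keras, this computes a cross-correlation (the kernel is not flipped).
    Returns a nested list of shape (H-kH+1) x (W-kW+1) x len(filters).
    """
    h, w, channels = len(image), len(image[0]), len(image[0][0])
    kh, kw = len(filters[0]), len(filters[0][0])
    out = []
    for i in range(h - kh + 1):
        row = []
        for j in range(w - kw + 1):
            # One output value per filter: each filter sums over the
            # 3x3 window AND over all input channels.
            row.append([
                sum(
                    image[i + di][j + dj][c] * f[di][dj][c]
                    for di in range(kh)
                    for dj in range(kw)
                    for c in range(channels)
                )
                for f in filters
            ])
        out.append(row)
    return out

def random_filter(kh, kw, channels, rng):
    """Hypothetical helper: one randomly initialized kH x kW x C kernel."""
    return [[[rng.uniform(-1, 1) for _ in range(channels)]
             for _ in range(kw)] for _ in range(kh)]

rng = random.Random(0)
image = [[[rng.random() for _ in range(3)] for _ in range(8)] for _ in range(8)]
filters = [random_filter(3, 3, 3, rng) for _ in range(32)]  # 32 different kernels

feature_maps = conv2d_valid(image, filters)
output_shape = (len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))
print(output_shape)  # (6, 6, 32): one output channel per filter
```

Changing 32 to 17 or 64 only changes the last dimension of the output, which is why the filter count is a free choice.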

Why in the 2nd layer filter is changed to 64? What is the rule to set the number?

Again, as in the first answer, the number of filters in each layer can be anything. Here, for example, the second block has 64 filters, which perform 64 separate convolutions over all 32 channels output by the preceding layer.
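What the filter count does change is the number of trainable weights, since every filter spans all input channels. The sketch below counts them for the four Conv2D layers in the question, assuming a 3-channel RGB input and one bias per filter; conv2d_param_count is a made-up helper, not a Keras function.

```python
def conv2d_param_count(kernel_h, kernel_w, in_channels, filters, use_bias=True):
    """Weights in a 2D convolution layer: each filter has a
    kernel_h x kernel_w x in_channels kernel plus (optionally) one bias."""
    weights = kernel_h * kernel_w * in_channels * filters
    biases = filters if use_bias else 0
    return weights + biases

# Layers from the question, assuming the input has 3 channels (RGB).
conv_params = [
    conv2d_param_count(3, 3, 3, 32),    # first Conv2D(32): sees 3 channels
    conv2d_param_count(3, 3, 32, 32),   # second Conv2D(32): sees 32 channels
    conv2d_param_count(3, 3, 32, 64),   # first Conv2D(64): sees 32 channels
    conv2d_param_count(3, 3, 64, 64),   # second Conv2D(64): sees 64 channels
]
print(conv_params)  # [896, 9248, 18496, 36928]
```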

Why Dense neurons are 512? not 1024? what's the rule to set the number?

Again, the Dense layer can have any number of neurons. For example, if you have a 64x64x3 RGB input, the last convolution will produce a (batch_size, 16, 16, 64) output (assuming padding='same' on every convolution and a stride of (2, 2) on the max-pool layers).

After the Flatten() layer this becomes a (batch_size, 16*16*64) output. You then feed this into the Dense layer, which produces a (batch_size, 512) output (because the Dense layer has 512 neurons). To be exact, the Dense layer performs the matrix multiplication (batch_size, 16*16*64) x (16*16*64, 512), which results in a (batch_size, 512) output.
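The shape arithmetic can be checked with a few lines of Python. This is a sketch under stated assumptions: a 64x64 input, 3x3 kernels with stride 1, and 2x2 max pooling with stride 2; the helper names are invented for this example. It also shows what happens if the second convolution in each block keeps Keras's default padding='valid', as in the code from the question, in which case the flattened size is smaller than 16*16*64.

```python
def conv_out(size, kernel=3, padding="same"):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after max pooling with pool size == stride (no padding)."""
    return size // pool

def flatten_features(side, second_conv_padding):
    """Number of values per sample after Flatten() for the two-block model."""
    channels = 0
    for filters in (32, 64):
        side = conv_out(side, padding="same")               # first conv of block
        side = conv_out(side, padding=second_conv_padding)  # second conv of block
        side = pool_out(side)                                # 2x2 max pool
        channels = filters
    return side * side * channels

as_assumed = flatten_features(64, "same")    # 16 * 16 * 64
as_written = flatten_features(64, "valid")   # 14 * 14 * 64
dense_weights = as_assumed * 512             # shape of the Dense kernel: (16384, 512)

print(as_assumed, as_written, dense_weights)  # 16384 12544 8388608
```

Swapping 512 for 1024 only changes the second dimension of the Dense kernel, so either value is valid; the trade-off is model capacity against parameter count.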

Note: The best way to set these parameters is to run hyperparameter optimization on your dataset.

Edit: What do I mean by separate convolutions

In the illustration, each color represents a single filter. It shows a 1D convolution (with padding='valid'), but the idea is the same: the filters are separate and randomly initialized, and over time each one learns to detect a different feature.
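The same idea in code, as a small 1D sketch with padding='valid'. The signal values, the number of filters (4), and the kernel width (3) are arbitrary choices for illustration, and no training happens here; the point is only that each randomly initialized filter produces its own output.

```python
import random

def conv1d_valid(signal, kernel):
    """1D cross-correlation with no padding: output length is len(signal) - len(kernel) + 1."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

rng = random.Random(42)
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]

# Four separate filters, each with its own random starting weights.
kernels = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
outputs = [conv1d_valid(signal, kernel) for kernel in kernels]

for kernel, output in zip(kernels, outputs):
    print([round(w, 2) for w in kernel], "->", [round(v, 2) for v in output])

# A hand-picked kernel for comparison: it responds to the slope of the signal.
slope = conv1d_valid(signal, [1.0, 0.0, -1.0])
print(slope)  # [-2.0, -2.0, -2.0, -2.0]
```

If all four kernels were identical, the four outputs would be identical too, which is why the filters start from different random values.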

edited Dec 07 '19 at 00:35

answered Dec 06 '19 at 23:36

thushv89


Beautifully explained, thanks. Could you please add an explanation of "32 separate convolutions"? To my understanding, when computing a convolution we have a kernel, e.g. a 3 by 3 array, and we use that ONE kernel to do the calculation. By "32 separate convolutions", do you mean using the same kernel 32 times to produce the same 32 results? Why would we want to do that? If they are not 32 identical kernels, then what are those 32 kernels, and where do they come from? – Franva Dec 07 '19 at 00:30
@Franva I edited my answer. To make it clear, they are not the same, because you gain nothing from repeating the same kernel. In a convolutional network these filters are randomly initialized, so each filter learns unique features. – thushv89 Dec 07 '19 at 00:36
Great example, thanks! So may I use 17 filters in the first layer and 27 in the second? If not, why? – Franva Dec 07 '19 at 01:45
@Franva Yes, of course, they can be any arbitrary number. But as I said, if possible, do hyperparameter optimization to find the optimal number of filters. That is just to improve performance, though. – thushv89 Dec 07 '19 at 01:47
Sure, could you please recommend a tutorial on hyperparameter optimization? Thanks! – Franva Dec 07 '19 at 01:56
Maybe these links will help you: [link 1](https://scikit-learn.org/stable/modules/grid_search.html) [link 2](https://www.tensorflow.org/tensorboard/hyperparameter_tuning_with_hparams) [link 3](https://stackoverflow.com/questions/44802939/hyperparameter-tuning-of-tensorflow-model) – thushv89 Dec 07 '19 at 02:51
