
I don't understand the MNIST example in 'Deep MNIST for Experts' in TensorFlow.

In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each 5x5 patch.

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

I don't understand why the number of output channels is 64.

I think 32*2 filters of size 5x5 should be enough for 64 output channels, so:

W_conv2 = weight_variable([5, 5, 32, 2])

First of all, I'm sorry for my poor English; my question may be hard to understand, so I'll write my logic in pseudocode:

input_image = arr[1][28][28][32]
w_conv2 = arr[5][5][32][2]
output = arr[1][28][28][64]
for batch in 1:
    for input in 32:
        for filter in 2:
            output[:, :, :, input * 2 + filter] = conv(input_image, w_conv2)

I think this makes 64 output features using only 32*2 filters, which saves memory. What part is wrong?

Thank you.

  • Look at [this question](http://stackoverflow.com/questions/42507766/why-am-i-getting-only-one-channeled-output-through-the-tf-nn-conv2d) and [this question](http://stackoverflow.com/questions/34619177/what-does-tf-nn-conv2d-do-in-tensorflow) and you'll get your answer. – jabalazs Mar 02 '17 at 14:14
  • You can also look at my answer [here](http://stackoverflow.com/a/42451067/3941813), to a very similar question. – jabalazs Mar 03 '17 at 01:33
  • in your answer, are the number of features (in_channels * out_channels) ? – Jung Sunkyo Mar 03 '17 at 06:05

2 Answers


The fact is that the number of output channels is independent of the number of input channels.

In fact, each step of the convolution is a tensor product between W_conv2 (W) and an (n × m × k) portion of the input (I): if W has shape (n × m × k × h) and the patch of I has shape (n × m × k), the result of the product I·W is a vector of dimension h.

n, m, and k must match between I and W, but there is no constraint at all on h.
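To see this concretely, here is a minimal NumPy sketch (with made-up random data) of a single convolution step: contracting one 5x5x32 input patch against a weight tensor of shape (5, 5, 32, 64) over their shared axes leaves only the last axis, giving one 64-dimensional output vector, one value per output channel.

```python
import numpy as np

rng = np.random.default_rng(0)
patch = rng.standard_normal((5, 5, 32))    # (n, m, k) portion of the input
W = rng.standard_normal((5, 5, 32, 64))    # (n, m, k, h) weights

# Contract over the shared (n, m, k) axes; only the h axis survives.
out = np.tensordot(patch, W, axes=([0, 1, 2], [0, 1, 2]))
print(out.shape)  # (64,)
```

Since the first three axes are fully summed over, the output size is set entirely by h, which can be any number you like.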

Davide Biraghi

It's bigger because each 5x5 filter is applied across every input channel of the input data, and one full filter of shape 5x5x32 is needed per output channel.
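To make the counting concrete, here is a small sketch (shapes taken from the tutorial code in the question) comparing the number of weights in the full [5, 5, 32, 64] tensor with the proposed [5, 5, 32, 2] one; the smaller tensor simply cannot hold 64 independent filters.

```python
# Weight counts for the two proposed filter shapes.
n, m, in_ch = 5, 5, 32
full = n * m * in_ch * 64      # weights in W_conv2 = [5, 5, 32, 64]
proposed = n * m * in_ch * 2   # weights in the suggested [5, 5, 32, 2]
print(full, proposed)          # 51200 1600
```

With only 1600 weights, the [5, 5, 32, 2] tensor defines just 2 distinct filters, so the 64 maps it would generate are not 64 independent features.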

Christian Frei