
I don't understand the MNIST example in 'Deep MNIST for Experts' in TensorFlow.

In order to build a deep network, we stack several layers of this type. The second layer will have 64 features for each 5x5 patch.

W_conv2 = weight_variable([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

I don't understand why the number of output channels is 64.

I think 32*2 filters of size 5x5 should be enough for 64 output channels, so:

W_conv2 = weight_variable([5, 5, 32, 2])

First of all, I'm sorry for my poor English; my question may be hard to understand, so I'll write my logic in pseudocode:

input_image = arr[1][28][28][32]
w_conv2 = arr[5][5][32][2]
output = arr[1][28][28][64]
for batch in 1:
    for input in 32:
        for filter in 2:
            output[:, :, :, input * 2 + filter] = conv(input_image, w_conv2)

I think this makes 64 output features using only 32*2 filters, which saves memory. What part is wrong?

Thank you.

  • Look at [this question](http://stackoverflow.com/questions/42507766/why-am-i-getting-only-one-channeled-output-through-the-tf-nn-conv2d) and [this question](http://stackoverflow.com/questions/34619177/what-does-tf-nn-conv2d-do-in-tensorflow) and you'll get your answer. – jabalazs Mar 02 '17 at 14:14
  • You can also look at my answer [here](http://stackoverflow.com/a/42451067/3941813), to a very similar question. – jabalazs Mar 03 '17 at 01:33
  • in your answer, are the number of features (in_channels * out_channels) ? – Jung Sunkyo Mar 03 '17 at 06:05

2 Answers


The fact is that the number of output channels is independent of the number of input channels.

In fact, each step of the convolution is a tensor product between W_conv2 (W) and an (n × m × k) portion of the input (I): if W has shape (n × m × k × h) and the patch of I has shape (n × m × k), the result of the product I·W is a vector of dimension h.

n, m, and k must match between I and W, but there is no constraint at all on h.
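To see this concretely, here is a minimal NumPy sketch (with made-up random data) of a single convolution step: contracting one 5x5x32 input patch against a weight tensor of shape (5, 5, 32, 64) over their shared axes leaves only the last axis, giving one 64-dimensional output vector, one value per output channel.

```python
import numpy as np

rng = np.random.default_rng(0)
patch = rng.standard_normal((5, 5, 32))    # (n, m, k) portion of the input
W = rng.standard_normal((5, 5, 32, 64))    # (n, m, k, h) weights

# Contract over the shared (n, m, k) axes; only the h axis survives.
out = np.tensordot(patch, W, axes=([0, 1, 2], [0, 1, 2]))
print(out.shape)  # (64,)
```

Since the first three axes are fully summed over, the output size is set entirely by h, which can be any number you like.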

Davide Biraghi

It's bigger because each 5x5 filter is applied across every input channel of the input data, and one full filter of shape 5x5x32 is needed per output channel.
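To make the counting concrete, here is a small sketch (shapes taken from the tutorial code in the question) comparing the number of weights in the full [5, 5, 32, 64] tensor with the proposed [5, 5, 32, 2] one; the smaller tensor simply cannot hold 64 independent filters.

```python
# Weight counts for the two proposed filter shapes.
n, m, in_ch = 5, 5, 32
full = n * m * in_ch * 64      # weights in W_conv2 = [5, 5, 32, 64]
proposed = n * m * in_ch * 2   # weights in the suggested [5, 5, 32, 2]
print(full, proposed)          # 51200 1600
```

With only 1600 weights, the [5, 5, 32, 2] tensor defines just 2 distinct filters, so the 64 maps it would generate are not 64 independent features.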

Christian Frei