
I want to reuse code from the TensorFlow "MNIST for Pros" CNN example. My images are 388px × 191px, with only 2 output classes. The original code can be found here. I tried to reuse this code by changing ONLY the input and output layers, as shown below:

Input layer:

x = tf.placeholder("float", shape=[None, 74108])

y_ = tf.placeholder("float", shape=[None, 2])

x_image = tf.reshape(x, [-1,388,191,1])

Output layer:

W_fc2 = weight_variable([1024, 2])

b_fc2 = bias_variable([2])
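As a quick sanity check on the resized layers (plain Python, no TensorFlow needed), the flattened placeholder width has to equal the product of the image dimensions used in the reshape:

```python
# Check that the flattened input size matches the reshape target.
height, width, channels = 388, 191, 1   # image dims from the question
flat_size = height * width * channels
print(flat_size)  # 74108, matching shape=[None, 74108]

num_classes = 2   # two classes, matching weight_variable([1024, 2])
```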

Running the modified code gives a vague stacktrace:

W tensorflow/core/common_runtime/executor.cc:1027] 0x2136510 Compute status: Invalid argument: Input has 14005248 values, which isn't divisible by 3136
     [[Node: Reshape_4 = Reshape[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](MaxPool_5, Reshape_4/shape)]]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1267, in run
    _run_using_default_session(self, feed_dict, self.graph, session)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2763, in _run_using_default_session
    session.run(operation, feed_dict)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 345, in run
    results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 419, in _do_run
    e.code)
tensorflow.python.framework.errors.InvalidArgumentError: Input has 14005248 values, which isn't divisible by 3136
     [[Node: Reshape_4 = Reshape[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](MaxPool_5, Reshape_4/shape)]]
Caused by op u'Reshape_4', defined at:
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 554, in reshape
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 988, in __init__
    self._traceback = _extract_stack()
  • I did not get any error executing your code using Python 2.7.10, TensorFlow 0.5.0 on Ubuntu 14.10. – agold Nov 24 '15 at 14:15

1 Answer

tensorflow.python.framework.errors.InvalidArgumentError: Input has 14005248 values, which isn't divisible by 3136
 [[Node: Reshape_4 = Reshape[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](MaxPool_5, Reshape_4/shape)]]

But the way you executed it prevents you from seeing the actual line that caused the problem, since every frame in the traceback just shows the interpreter prompt. Save the code to a file and run it with python <file>.

  File "<stdin>", line 1, in <module>

But the answer is that you haven't changed the size of your convolutional and pooling layers, so when you ran 28x28 images through, they eventually shrank down to a 7x7x(convolutional_depth) volume. Now you're running huge images through, so after the convolutional layers and the 2x2 maxpools, you've got a VERY BIG tensor that you're trying to feed in, but you're still reshaping it as if it were 7x7x64:

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

W_fc1 = weight_variable([7 * 7 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 7 * 7 * 64])  # the reshape that raises the error

The output of h_pool2 is much larger with your larger images. You need to shrink them down more - likely with more convolutional and maxpooling layers. You could also try increasing the size of W_fc1 to match the input size that's actually arriving there. The image runs through two 2x2 maxpools - each halves the size in the x and y dimensions, rounding up on odd sizes because the pooling uses SAME padding. 28x28x1 --> 14x14x32 --> 7x7x64. So YOUR images are going from 388 x 191 --> 194 x 96 --> 97 x 48, and 47 images of 97x48x64 values each is exactly the 14005248 values in the error message.
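For the record, the shrinkage arithmetic can be traced with a few lines of plain Python (a sketch of the size math only; ceiling division models the SAME padding that the tutorial's max_pool_2x2 uses):

```python
def pool2x2(h, w):
    # A 2x2 max-pool with stride 2 and SAME padding halves each
    # spatial dimension, rounding up on odd sizes.
    return (h + 1) // 2, (w + 1) // 2

# Original MNIST path: 28x28 -> 14x14 -> 7x7, depth 64 after conv2
h, w = pool2x2(*pool2x2(28, 28))
print(h, w, h * w * 64)   # 7 7 3136  (the 7*7*64 the reshape expects)

# The question's images: 388x191 -> 194x96 -> 97x48
h, w = pool2x2(*pool2x2(388, 191))
print(h, w)               # 97 48
# 47 images of 97*48*64 values each is exactly the size in the error
print(47 * h * w * 64)    # 14005248
```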

As a warning, a fully connected layer fed 97 * 48 * 64 = 297984 inputs is going to be glacially slow.
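To put a number on "glacially slow": with the tutorial's 1024 hidden units, a back-of-envelope weight count (ignoring biases, and using the 97x48 pooled size that SAME padding gives) shows the first fully connected layer growing almost a hundredfold:

```python
hidden = 1024
mnist_params = 7 * 7 * 64 * hidden    # original tutorial's W_fc1
big_params = 97 * 48 * 64 * hidden    # same layer for 388x191 images
print(mnist_params)  # 3211264
print(big_params)    # 305135616, roughly 95x more weights
```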

dga
  • dga: Thanks for the insight. Will try your approach. – user2849678 Nov 25 '15 at 06:22
  • 2
    Another solution is to realise that ["There are no fully connected layers; only convolutions" --Yann LeCun](https://www.facebook.com/yann.lecun/posts/10152820758292143). Replace FC layers with convolutions, and use something like "global average pooling" to reduce from any image dimensions to a known shape. – mdaoust Nov 25 '15 at 15:40
  • @dga & mdaoust: Many thanks. I tried the approach suggested by dga & it worked. I am still looking for a simple 'Conv Net for dummies' kind of explanation. From the explanation above, I get that a 2x2 maxpool shrinks the size by 2. Why is the 3rd dimension doubling with each maxpool? – user2849678 Nov 26 '15 at 04:52
  • It's not doubling because of the maxpool, it's doubling because of the convolution. `W_conv1 = weight_variable([5, 5, 1, 32])` and `W_conv2 = weight_variable([5, 5, 32, 64])` The first runs a 5x5x1 (1 is the color channels) convolution over the image, producing 32 outputs per position. The second runs a 5x5x32 (32 == the number of outputs of the previous convolution) over the output of the first layer, and produces 64 outputs. So it's doubling because the designer of the network specified that way. – dga Nov 26 '15 at 04:58
  • @dga To be more precise, the first batch of training worked. Since I have just 47 images, I tried training with random changes to images (contrast/brightness) in each batch of training. This gives me new errors. Tried changing the values passed to *AdamOptimizer*, from 1 to 1e-8. Badly missing search grid for hyper parameters! `W tensorflow/core/common_runtime/executor.cc:1027] 0xb518090 Compute status: Invalid argument: ReluGrad input is not finite. : Tensor had NaN values in train_step = tf.train.AdamOptimizer(1e-8).minimize(cross_entropy)` – user2849678 Dec 04 '15 at 10:26
  • This sounds like a separate question. (I find values of 0.001 to be a very good starting point for Adam, though, and I usually apply an exponential decay to it.) – dga Dec 04 '15 at 17:21
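mdaoust's global-average-pooling suggestion can be sketched in plain Python (illustrative only, not the actual TensorFlow call): averaging each channel over all spatial positions collapses any HxW grid down to a fixed-length vector of C values, so the layer after it never needs to know the image size.

```python
def global_avg_pool(feature_map):
    # feature_map: H x W x C nested lists; returns a length-C vector.
    h = len(feature_map)
    w = len(feature_map[0])
    c = len(feature_map[0][0])
    out = [0.0] * c
    for row in feature_map:
        for position in row:
            for ch, value in enumerate(position):
                out[ch] += value
    return [total / (h * w) for total in out]

# A 2x2 map with 2 channels collapses to 2 numbers; a 97x48 map
# with 64 channels would collapse to 64 numbers the same way.
fm = [[[1.0, 10.0], [2.0, 20.0]],
      [[3.0, 30.0], [4.0, 40.0]]]
print(global_avg_pool(fm))  # [2.5, 25.0]
```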