
I'm trying to train a simple network with TensorFlow on the MNIST dataset, but at the moment it is not working. It is basically a modified version of the example given on the TensorFlow website: I just changed a couple of lines and removed a layer to see what would happen. Here is my code:

#!/usr/bin/python

import input_data
import tensorflow as tf
#Helper functions for weights, biases, convolution and max pooling
def weight_variable(shape):
    initial=tf.truncated_normal(shape,stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial=tf.constant(0.1,shape=shape)
    return tf.Variable(initial)
def conv2d(x,W):
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')
def max_pool_2x2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')
mnist=input_data.read_data_sets('MNIST_data',one_hot=True)
x=tf.placeholder("float",shape=[None,784])
y=tf.placeholder("float",shape=[None,10])
x_image=tf.reshape(x,[-1,28,28,1])

#Layer 1: convolutional+max pooling

W_conv2=weight_variable([5,5,1,64])
b_conv2=bias_variable([64])

h_conv2=tf.nn.relu(conv2d(x_image,W_conv2)+b_conv2)
h_pool2=max_pool_2x2(h_conv2)

#Layer 2: ReLU+Dropout

W_fc1=weight_variable([7*7*64,1024])
b_fc1=bias_variable([1024])
h_pool2_flat=tf.reshape(h_pool2,[-1,7*7*64])
h_fc1=tf.nn.relu(tf.matmul(h_pool2_flat,W_fc1)+b_fc1)

keep_prob=tf.placeholder("float")
h_fc1_drop=tf.nn.dropout(h_fc1,keep_prob)

#Layer 3: softmax

W_fc4=weight_variable([1024,10])
b_fc4=bias_variable([10])
y_hat=tf.nn.softmax(tf.matmul(h_fc1_drop,W_fc4)+b_fc4)

cross_entropy=-tf.reduce_sum(y*tf.log(y_hat))
train_step=tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
sess=tf.InteractiveSession()
sess.run(tf.initialize_all_variables())
correct_prediction=tf.equal(tf.argmax(y,1),tf.argmax(y_hat,1))
accuracy=tf.reduce_mean(tf.cast(correct_prediction,"float"))

for n in range(20000):
    batch=mnist.train.next_batch(50)
    if n % 100 == 0:
        train_accuracy = accuracy.eval(feed_dict={x:batch[0],y:batch[1],keep_prob:1.0})
        print "step %d,training accuracy %g" % (n,train_accuracy)
    sess.run(train_step,feed_dict={x:batch[0],y:batch[1],keep_prob:0.5})

print "test accuracy %g" % accuracy.eval(feed_dict={x:mnist.test.images, y:mnist.test.labels,keep_prob:1.0})

When I try to execute it, it crashes with an invalid-argument error:

W tensorflow/core/common_runtime/executor.cc:1027] 0x7fceb58a4200 Compute status: Invalid argument: Incompatible shapes: [50] vs. [200]
  • Can you please add some more context around the error? In particular, it will help to know which operation raised that error. (One suggestion for tracking this down would be to replace the `None` values passed to the initial placeholders for `x` and `y` with `50` (the batch size you are using). That way you should get earlier errors.) – mrry Feb 01 '16 at 19:45
  • "One suggestion for tracking this down would be to replace the None values passed to the initial placeholders for x and y with 50" -- this comment helped me a lot.Thank you. – Gobi Dasu May 07 '17 at 00:05

1 Answer


You need your strides to reduce the output to the right shape; this should fix it (note the strides compared to yours):

def conv2d(x, W):
  return tf.nn.conv2d(x, W, strides=[1, 2, 2, 1], padding='SAME')
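Why this works: with the original stride-1 convolution, h_pool2 has shape (?, 14, 14, 64), so tf.reshape(h_pool2,[-1,7*7*64]) silently turns every batch of 50 images into 200 rows of 3136 values, which is exactly the [50] vs. [200] in your error. An alternative sketch, if you would rather keep the stride-1 convolution, is to flatten with the actual size instead:

# h_pool2 is (?, 14, 14, 64) after one 2x2 pooling step
W_fc1 = weight_variable([14*14*64, 1024])
h_pool2_flat = tf.reshape(h_pool2, [-1, 14*14*64])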

To troubleshoot this kind of issue, try printing .get_shape() for all of your tensors. Both Tensor and Variable have this method; it will give you a much better sense of what's going on and will help immensely.
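For instance, with the stride-1 conv2d from the question, checking the shapes around the flatten (using the names from your code) makes the mismatch visible:

print(h_pool2.get_shape())   # (?, 14, 14, 64) -> 14*14*64 = 12544 values per image
print(W_fc1.get_shape())     # (3136, 1024)    -> built for 7*7*64 = 3136 inputs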

To check everything at once, here's some code that will help - put this after your declaration of h_pool2; it will print the name and shape of each of your variables:

from tensorflow.python.ops.variables import Variable

# Print every Tensor and Variable defined so far; the repr includes the shape.
# (list(...) snapshots the namespace so assigning k and v doesn't mutate the
# dict while we iterate over it.)
for k, v in list(locals().items()):
    if isinstance(v, (Variable, tf.Tensor)):
        print("{0}: {1}".format(k, v))
Tristan Reid