
I am building a character-level convolutional NN. I have a set of training samples, and each sample has a dimension of 3640. I clearly don't understand how to resize/reshape dimensions in TensorFlow, because I keep getting an error I can't fix:

Traceback (most recent call last):
  File "/Users/osopova/Documents/00_KSU_Masters/00_2016_Fall/00_Research/cnn_da/step_4_cnn_4.py", line 87, in my_conv_model
    prediction, loss = learn.models.logistic_regression(pool, y)
  File "/Users/osopova/Applications/anaconda/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/models.py", line 146, in logistic_regression
    'weights', [x.get_shape()[1], y.get_shape()[-1]], dtype=dtype)
  File "/Users/osopova/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 873, in get_variable
    custom_getter=custom_getter)
  File "/Users/osopova/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 700, in get_variable
    custom_getter=custom_getter)
  File "/Users/osopova/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 217, in get_variable
    validate_shape=validate_shape)
  File "/Users/osopova/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 202, in _true_getter
    caching_device=caching_device, validate_shape=validate_shape)
  File "/Users/osopova/Applications/anaconda/lib/python2.7/site-packages/tensorflow/python/ops/variable_scope.py", line 515, in _get_single_variable
    "but instead was %s." % (name, shape))
ValueError: Shape of a new variable (logistic_regression/weights) must be fully defined, but instead was (?, 1).

Here is the code:

import sys
import traceback

import numpy as np
import tensorflow as tf
from tensorflow.contrib import learn
from sklearn import metrics
from sklearn.model_selection import train_test_split  # sklearn.cross_validation on older scikit-learn

N_FEATURES = 140*26
N_FILTERS = 10
WINDOW_SIZE = 3

Conv model starts:

def my_conv_model(x, y):

    # Reshape to a 4d tensor of shape batch_size x 1 x N_FEATURES x 1.
    x = tf.reshape(x, [-1, 1, N_FEATURES, 1])

    # This will give a sliding window of 1 x WINDOW_SIZE convolution.
    features = tf.contrib.layers.convolution2d(inputs=x,
                                               num_outputs=N_FILTERS,
                                               kernel_size=[1, WINDOW_SIZE],
                                               padding='VALID')

    # Add a RELU for non-linearity.
    features = tf.nn.relu(features)

    # Max pooling across the output of Convolution+ReLU.
    pool = tf.nn.max_pool(features, ksize=[1, 1, 2, 1],
                          strides=[1, 1, 2, 1], padding='SAME')

    print("(1) pool_shape", pool.get_shape())
    print("(1) y_shape", y.get_shape())

    # Flatten the pooled features and give y an explicit second dimension.
    pool_shape = tf.shape(pool)
    pool = tf.reshape(pool, [pool_shape[0], pool_shape[2] * pool_shape[3]])
    y = tf.expand_dims(y, 1)

    print("(2) pool_shape", pool.get_shape())
    print("(2) y_shape", y.get_shape())

    try:
        exc_info = sys.exc_info()

        print("(3) pool_shape", pool.get_shape())
        print("(3) y_shape", y.get_shape())

        # HERE COMES THE ERROR:
        prediction, loss = learn.models.logistic_regression(pool, y)
        return prediction, loss
    except Exception:
        # print(traceback.format_exc())
        pass
    finally:
        # Display the *original* exception.
        traceback.print_exception(*exc_info)
        del exc_info
    # return prediction, loss

The shapes:

(1) pool_shape (?, 1, 1819, 10)
(1) y_shape (?,)
(2) pool_shape (?, ?)
(2) y_shape (?, 1)
(3) pool_shape (?, ?)
(3) y_shape (?, 1)

The main:

def main(unused_argv):

    # training and testing data encoded as one-hot
    data_folder = './data'

    sandyData = np.loadtxt(data_folder+'/sandyData.csv', delimiter=',')
    sandyLabels = np.loadtxt(data_folder+'/sandyLabels.csv', delimiter=',')

    x_train, x_test, y_train, y_test = \
        train_test_split(sandyData, sandyLabels, test_size=0.2, random_state=7)

    x_train = np.array(x_train, dtype=np.float32)
    x_test = np.array(x_test, dtype=np.float32)
    y_train = np.array(y_train, dtype=np.float32)
    y_test = np.array(y_test, dtype=np.float32)

    # Build model
    classifier = learn.Estimator(model_fn=my_conv_model)

    # Train and predict
    classifier.fit(x_train, y_train, steps=100)
    y_predicted = [p['class'] for p in classifier.predict(x_test, as_iterable=True)]
    score = metrics.accuracy_score(y_test, y_predicted)
    print('Accuracy: {0:f}'.format(score))


if __name__ == '__main__':
    tf.app.run()
Uylenburgh
  • Can you post the full stack trace? – mrry Oct 09 '16 at 23:54
  • @mrry please take a look, I've updated the post. I used `traceback.format_exc()` to get the (full) stack trace – Uylenburgh Oct 10 '16 at 00:53
  • @mrry is my update sufficient? I used the first advice here: http://stackoverflow.com/questions/3702675/how-to-print-the-full-traceback-without-halting-the-program to print the full stack trace but I am not sure if this is exactly what you wanted me to provide. – Uylenburgh Oct 10 '16 at 16:48
  • That stack trace is great, thanks! But I need one more bit of information: can you please print the value of `y.get_shape()` immediately before you call `logistic_regression()`? – mrry Oct 10 '16 at 19:18
  • @mrry thank you!! Please see the update. Basically, the y shape is (?, 1). – Uylenburgh Oct 10 '16 at 20:08

1 Answer


It looks like the problem is that the pool argument to logistic_regression() does not have a known number of columns. logistic_regression() needs to know the number of columns in its x argument to create an appropriately sized weight matrix.
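
For illustration, this is roughly what logistic_regression() does internally (a minimal sketch of the weight-variable creation that appears in the traceback; the placeholder shapes here are invented to mirror the question):

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, None])  # column count unknown, like `pool` here
y = tf.placeholder(tf.float32, shape=[None, 1])

# logistic_regression() sizes its weight matrix from the static shapes of x and y,
# which fails because x.get_shape()[1] is None:
weights = tf.get_variable('weights', [x.get_shape()[1], y.get_shape()[-1]])
# ValueError: Shape of a new variable (weights) must be fully defined, but instead was (?, 1).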

This problem stems from the following lines:

pool_shape = tf.shape(pool)
pool = tf.reshape(pool, [pool_shape[0], pool_shape[2]*pool_shape[3]])

Although pool_shape[2]*pool_shape[3] has a constant value, TensorFlow's client-side constant folding doesn't currently handle this expression, so it infers the static shape of the tensor pool to be (?, ?) (as your logging output shows). One workaround is to make the following change:

pool_shape = pool.get_shape()
pool = tf.reshape(pool, [-1, (pool_shape[2] * pool_shape[3]).value])

Using pool.get_shape() instead of tf.shape(pool) gives TensorFlow a bit more information about the (partially defined) shape of pool, as a tf.TensorShape object rather than a tf.Tensor object. After this change, both pool_shape[2] and pool_shape[3] have known values, so the number of columns in pool will be known.
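
In context, the fixed flattening step inside my_conv_model would then look roughly like this (a sketch assuming the pool shape (?, 1, 1819, 10) printed above, so only the batch dimension is unknown):

# Use the static shape so the column count stays known to TensorFlow.
pool_shape = pool.get_shape()                      # TensorShape([None, 1, 1819, 10])
n_columns = (pool_shape[2] * pool_shape[3]).value  # 1819 * 10 = 18190, a plain Python int
pool = tf.reshape(pool, [-1, n_columns])
# pool.get_shape() is now (?, 18190), so logistic_regression() can size its weights.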

mrry
  • With this approach, I got an error along the lines "Expected int32 but got Dimension". Eventually, I ended up explicitly storing the number of samples in a variable outside the conv. model, and then passing it around to reshape etc. Not the best solution at all, but would work for now. Thank you for your feedback, it is awesome to see that real developers from Tensorflow respond to issues :) – Uylenburgh Oct 11 '16 at 01:21
  • 1
    Hmm, it's possible that the automatic type conversion is a bit weak here. I posted an update that should work - please let me know if it does! – mrry Oct 11 '16 at 11:02
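
For reference, one common way to sidestep the "Expected int32 but got Dimension" error mentioned in the comment above is to convert the static shape to plain Python ints before building the new shape (again just a sketch, assuming only the batch dimension of pool is unknown):

# as_list() turns the TensorShape into plain ints (and None for the batch dimension),
# so tf.reshape never sees Dimension objects.
pool_shape = pool.get_shape().as_list()   # e.g. [None, 1, 1819, 10]
pool = tf.reshape(pool, [-1, pool_shape[2] * pool_shape[3]])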