I am quite new to TensorFlow, so I tried to use the code from the tutorial to feed some layers with images of size (944, 944) and two classes, yes/no (1/0), to see how it performs, but I haven't been able to make it work. The last error I am getting is: "Dimension size must be evenly divisible by 57032704 but is 3565440 for 'Reshape_1' with input shapes: [10,236,236,64], 2 and with input tensors computed as partial shapes: input1 = [?,57032704]".

I do not know if the error comes from one of the reshape operations or because I can't feed the neurons like this. The code is the following:

import tensorflow as tf
import numpy as np
import os
# import cv2
from scipy import ndimage
import PIL

tf.logging.set_verbosity(tf.logging.INFO)

def define_model(features, labels, mode):
    """Model function for CNN."""
    # Input Layer
    input_layer = tf.reshape(features["x"], [-1,944, 944, 1])

    # Convolutional Layer #1
    conv1 = tf.layers.conv2d(
        inputs=input_layer,
        filters=32,
        kernel_size=[16, 16],
        padding="same",
        activation=tf.nn.relu)

    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=64,
        kernel_size=[16, 16],
        padding="same",
        activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    # Dense Layer
    pool2_flat = tf.reshape(pool2, [-1,944*944*64])
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
    dropout = tf.layers.dropout(
        inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

    # Logits Layer - raw predictions
    logits = tf.layers.dense(inputs=dropout, units=10)

    predictions = {
        # Generate predictions (for PREDICT and EVAL mode)
        "classes": tf.argmax(input=logits, axis=1),
        # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
        # `logging_hook`.
        "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
        "accuracy": tf.metrics.accuracy(
            labels=labels, predictions=predictions["classes"])}
    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

if __name__ == '__main__':
    # Load training and eval data
    # mnist = tf.contrib.learn.datasets.load_dataset("mnist")
    # train_data = mnist.train.images  # Returns np.array
    # train_labels = np.asarray(mnist.train.labels, dtype=np.int32)
    train_data, train_labels = load_images("C:\\Users\\Heads\\Desktop\\BDManchas_Semi")

    eval_data = train_data.copy()
    eval_labels = train_labels.copy()

    # Create the Estimator
    classifier = tf.estimator.Estimator(
        model_fn=define_model, model_dir="/tmp/convnet_model")

    # Set up logging for predictions
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors=tensors_to_log, every_n_iter=50)

    # Train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": train_data},
        y=train_labels,
        batch_size=10,
        num_epochs=None,
        shuffle=True)
    classifier.train(
        input_fn=train_input_fn,
        steps=20000,
        hooks=[logging_hook])

    # Evaluate the model and print results
    eval_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": eval_data},
        y=eval_labels,
        num_epochs=1,
        shuffle=False)
    eval_results = classifier.evaluate(input_fn=eval_input_fn)
    print(eval_results)

EDIT:

OK, now that I have the reshape working, I am getting another error: the loss during training is NaN. I have been researching this (there is a good answer here), but for every new function I use there is a different error. I have tried to change the loss from:

loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

to:

loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)

But there also seem to be problems with the reshape: the error says that logits and labels must have the same shape ((10,10) vs (10,)). I have tried to reshape the logits and the labels, but I always get a different error (I guess there is no way to make the two arrays match).

The labels are defined as follows:

list_of_classes = []
# if ... class == 1
list_of_classes.append(1)
# else
list_of_classes.append(0)

labels = np.array(list_of_classes).astype("int32")

Any idea on how to use the proper loss?

Answer 1

Initial Problem

The output of your second pooling layer (pool2) has shape (batch_size, 236, 236, 64): with "same" padding the convolutions keep the 944x944 spatial size, but each 2x2 max-pooling with stride 2 halves it (944 → 472 → 236). Trying to reshape that tensor to (-1, 944*944*64) (pool2_flat) therefore throws an error.

To avoid this, you could define pool2_flat as:

pool2_shape = tf.shape(pool2)
pool2_flat = tf.reshape(pool2, [-1, pool2_shape[1] * pool2_shape[2] * pool2_shape[3]])
# or directly pool2_flat = tf.reshape(pool2, [-1, 236 * 236 * 64])
# if your dimensions are fixed...

# or more simply, as suggested by @xdurch0:
pool2_flat = tf.layers.flatten(pool2)
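
Note that `tf.layers.flatten` keeps the batch dimension and flattens all the remaining dimensions, inferring the flattened size (here 236 * 236 * 64) from the shape of `pool2`, so no manual arithmetic is needed.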

Regarding your edit

Not knowing how you are defining your labels, it is hard to tell what is going wrong. The labels must be of shape (None,) (one class ID per image in the batch), while the logits must be of shape (None, nb_classes) (one raw score per class, per image in the batch).
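
To illustrate the expected shapes, here is a minimal self-contained sketch (TF 1.x, like the rest of this answer; the values, and nb_classes = 2 for the yes/no case, are just made up):

import numpy as np
import tensorflow as tf

batch_size, nb_classes = 10, 2  # e.g. 2 classes for a yes/no problem

# Logits: one raw score per class, per image -> shape (10, 2)
logits = tf.constant(np.random.rand(batch_size, nb_classes), dtype=tf.float32)
# Sparse labels: one class ID per image -> shape (10,)
labels = tf.constant(np.random.randint(nb_classes, size=batch_size), dtype=tf.int32)

# sparse_softmax_cross_entropy accepts exactly this pair of shapes
# and returns a scalar loss:
loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

with tf.Session() as sess:
    print(sess.run(loss))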

The following code is working for me:

def define_model(features, labels, mode):
    """Model function for CNN."""
    # Input Layer
    input_layer = tf.reshape(features["x"], [-1,944, 944, 1])

    # Convolutional Layer #1
    conv1 = tf.layers.conv2d(
      inputs=input_layer,
      filters=32,
      kernel_size=[16, 16],
      padding="same",
      activation=tf.nn.relu)

    # Pooling Layer #1
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

    # Convolutional Layer #2 and Pooling Layer #2
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=64,
        kernel_size=[16, 16],
        padding="same",
        activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)

    # Dense Layer
    pool2_flat = tf.layers.flatten(pool2)
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
    dropout = tf.layers.dropout(
        inputs=dense, rate=0.4, training=mode == tf.estimator.ModeKeys.TRAIN)

    # Logits Layer - raw predictions
    logits = tf.layers.dense(inputs=dropout, units=10)

    predictions = {
        # Generate predictions (for PREDICT and EVAL mode)
        "classes": tf.argmax(input=logits, axis=1),
        # Add `softmax_tensor` to the graph. It is used for PREDICT and by the
        # `logging_hook`.
        "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode=mode, predictions=predictions)

    # Calculate Loss (for both TRAIN and EVAL modes)
    loss = tf.losses.sparse_softmax_cross_entropy(labels=labels, logits=logits)

    # Configure the Training Op (for TRAIN mode)
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
        train_op = optimizer.minimize(
            loss=loss,
            global_step=tf.train.get_global_step())
        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, train_op=train_op)

    # Add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
        "accuracy": tf.metrics.accuracy(
            labels=labels, predictions=predictions["classes"])}
    return tf.estimator.EstimatorSpec(
        mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

if __name__ == '__main__':
    # Load training and eval data
    # mnist = tf.contrib.learn.datasets.load_dataset("mnist")
    # train_data = mnist.train.images  # Returns np.array
    # train_labels = np.asarray(mnist.train.labels, dtype=np.int32)

    def mock_load_images(path):
        nb_classes = 10
        dataset_size = 100
        train_data = np.random.rand(dataset_size, 944, 944).astype(np.float32)
        list_of_classes = [np.random.randint(nb_classes) for i in range(dataset_size)]
        train_labels = np.array(list_of_classes, dtype=np.int32)
        return train_data, train_labels

    train_data, train_labels = mock_load_images("C:\\Users\\Heads\\Desktop\\BDManchas_Semi")

    # Create the Estimator
    classifier = tf.estimator.Estimator(
        model_fn=define_model, model_dir="/tmp/convnet_model")

    # Set up logging for predictions
    tensors_to_log = {"probabilities": "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors=tensors_to_log, every_n_iter=50)

    # Train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": train_data},
        y=train_labels,
        batch_size=1,
        num_epochs=None,
        shuffle=True)
    classifier.train(
        input_fn=train_input_fn,
        steps=20000,
        hooks=[logging_hook])

    # ...
– benjaminplanche

Comments:
  • I would advise using `tf.layers.flatten` to avoid annoying manual reshaping. – xdurch0 May 21 '18 at 15:42
  • If I use that manual reshaping it raises another error; however, using `tf.layers.flatten` seems to work. I'll update the post if everything works. – Adrián Arroyo Perez May 22 '18 at 08:30
  • Edited the post with the label definition, thanks. I'm going to try to use that shape. What exactly is the purpose of the `tf.placeholder` function? – Adrián Arroyo Perez May 23 '18 at 10:08
  • See the [doc](https://www.tensorflow.org/api_docs/python/tf/placeholder) for `tf.placeholder` (I'm basically just creating an empty tensor for the sake of the demonstration). The way you are creating your labels seems correct, though (assuming your `list_of_classes` ends up being of shape `(batch_size,)`). – benjaminplanche May 23 '18 at 10:26
  • So, is it possible to make sure that `list_of_classes` has that shape? I mean, `batch_size` is loaded afterwards, isn't it? – Adrián Arroyo Perez May 23 '18 at 11:47
  • It could be checked simply through debugging, or by printing the tensors' static shapes. – benjaminplanche May 23 '18 at 12:00
  • So yeah, the shapes do not match; how can I reshape it? I've tried with `np.reshape(list, (10,))` but it doesn't work. – Adrián Arroyo Perez May 23 '18 at 12:58
  • I can't really help you much further without your actual current code. As mentioned, it would be simpler to create a different question if you can't find the error with your labels. Still, I updated my code above to mock your `load_images()` function. Hopefully it will help you. – benjaminplanche May 23 '18 at 13:53
Answer 2

So the solution was to change the line:

pool2_flat = tf.reshape(pool2, [-1,944*944*64])

to the line:

pool2_flat = tf.layers.flatten(pool2)

Also, I needed to use 512x512 resized images instead of 944x944 because the full-size images did not fit in memory...
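
For reference, here is a minimal sketch of such a resizing step, assuming the images are loaded with PIL (already imported in the question); the function name and the [0, 1] scaling are illustrative, not the actual load_images():

import numpy as np
from PIL import Image

def load_and_resize(path, size=(512, 512)):
    # Open the image, convert it to single-channel grayscale, and downsample.
    img = Image.open(path).convert("L").resize(size, Image.BILINEAR)
    # Scale pixel values to [0, 1] as float32 for the network input.
    return np.asarray(img, dtype=np.float32) / 255.0

With 512x512 inputs, the input layer reshape then becomes tf.reshape(features["x"], [-1, 512, 512, 1]).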