Tensorflow: Layer size dependent on batch size?

Question

I am currently trying to get familiar with the TensorFlow library, and I have a rather fundamental question that bugs me.

While building a convolutional neural network for MNIST classification, I tried to use my own model_fn, in which the following line usually appears to reshape the input features:

x = tf.reshape(x, shape=[-1, 28, 28, 1])

Here the -1 refers to the input batch size.

Since I use this node as input to my convolutional layer,

x = tf.reshape(x, shape=[-1, 28, 28, 1]) 
conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)

does this mean that the sizes of all my network's layers depend on the batch size?

I tried freezing the graph and running it on a single test input, but that only works if I provide n=batch_size test images.

Can you give me a hint on how to make my network run on any input batch size while predicting? Also, I guess using the tf.reshape node (see the first node in cnn_layout) in the network definition is not the best input for serving.

I will append my network layout and the model_fn:

def cnn_layout(features, reuse, is_training):
    with tf.variable_scope('cnn', reuse=reuse):
        # reshape input to [batchsize, height, width, channel]
        x = tf.reshape(features['x'], shape=[-1, 30, 30, 1], name='input_placeholder')
        # conv1, 32 filter, 5 kernel
        conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu, name='conv1')
        # pool1, 2 stride, 2 kernel
        pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1')
        # conv2, 64 filter, 3 kernel
        conv2 = tf.layers.conv2d(pool1, 64, 3, activation=tf.nn.relu, name='conv2')
        # pool2, 2 stride, 2 kernel
        pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2')
        # flatten pool2
        flatten = tf.contrib.layers.flatten(pool2)
        # fc1 with 1024 neurons
        fc1 = tf.layers.dense(flatten, 1024, name='fc1')
        # 75% dropout
        drop = tf.layers.dropout(fc1, rate=0.75, training=is_training, name='dropout')
        # output logits
        output = tf.layers.dense(drop, 1, name='output_logits')
        return output


def model_fn(features, labels, mode):
    # set up two networks, one for training and one for prediction, sharing weights
    logits_train = cnn_layout(features=features,reuse=False,is_training=True)
    logits_test = cnn_layout(features=features,reuse=True,is_training=False)

    # predictions
    predictions = tf.round(tf.sigmoid(logits_test),name='predictions')
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)

    # define loss and optimizer
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits_train,labels=labels),name='loss')
    optimizer = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='optimizer')
    train = optimizer.minimize(loss, global_step=tf.train.get_global_step(),name='train')

    # accuracy for evaluation
    accuracy = tf.metrics.accuracy(labels=labels,predictions=predictions,name='accuracy')

    # summaries for tensorboard
    tf.summary.scalar('loss',loss)

    # return training and evaluation spec
    return tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=predictions,
        loss=loss,
        train_op=train,
        eval_metric_ops={'accuracy':accuracy}
    )

Thanks!

score 1 · Answer 1 · answered Sep 14 '17 at 07:45

In the typical scenario, the rank of features['x'] is already going to be 4, with the outer dimension being the actual batch size, so there's no need to reshape it.

Let me try to explain.

You haven't shown your serving_input_receiver_fn yet and there are several ways to do that, although in the end the principle is similar across them all. If you're using TensorFlow Serving, then you probably use build_parsing_serving_input_receiver_fn. It's informative to look at the source code:

def build_parsing_serving_input_receiver_fn(feature_spec,
                                            default_batch_size=None):    
  serialized_tf_example = array_ops.placeholder(
      dtype=dtypes.string,
      shape=[default_batch_size],                                      
      name='input_example_tensor')
  receiver_tensors = {'examples': serialized_tf_example}
  features = parsing_ops.parse_example(serialized_tf_example, feature_spec)
  return ServingInputReceiver(features, receiver_tensors)

So in your client, you're going to prepare a request that has one or more Examples in it (let's say the length is N). The server treats the serialized examples as a list of strings which get "fed" into the input_example_tensor placeholder. The shape (which is None) dynamically gets filled in to be the size of the list (N).

Then the parse_example op parses each item in the placeholder and out pops a Tensor for each feature whose outer dimension is N. In your case, you'll have x with shape=[N, 30, 30, 1].
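To make the point about the outer dimension concrete, here is a minimal sketch in plain Python (no TensorFlow needed) that does the shape bookkeeping for the cnn_layout from the question. It assumes the tf.layers defaults of stride 1 and 'valid' padding for the convolutions and leaves out bias vectors; the helper names valid_out and layer_shapes are made up for this illustration. Only the activation shapes carry N; the weight shapes are the same for any batch size.

```python
def valid_out(size, kernel, stride=1):
    """Spatial output size of a 'valid' (no padding) conv or pooling window."""
    return (size - kernel) // stride + 1


def layer_shapes(batch_size, height=30, width=30, channels=1):
    """Return (activations, weights) shape dicts for the layers in cnn_layout."""
    acts, weights = {}, {}
    h, w, c = height, width, channels
    acts['input'] = (batch_size, h, w, c)

    # conv1: 32 filters, 5x5 kernel, stride 1 (assumed default)
    weights['conv1'] = (5, 5, c, 32)
    h, w, c = valid_out(h, 5), valid_out(w, 5), 32
    acts['conv1'] = (batch_size, h, w, c)

    # pool1: 2x2 window, stride 2
    h, w = valid_out(h, 2, 2), valid_out(w, 2, 2)
    acts['pool1'] = (batch_size, h, w, c)

    # conv2: 64 filters, 3x3 kernel, stride 1 (assumed default)
    weights['conv2'] = (3, 3, c, 64)
    h, w, c = valid_out(h, 3), valid_out(w, 3), 64
    acts['conv2'] = (batch_size, h, w, c)

    # pool2: 2x2 window, stride 2
    h, w = valid_out(h, 2, 2), valid_out(w, 2, 2)
    acts['pool2'] = (batch_size, h, w, c)

    # flatten, fc1 and the single-logit output layer
    flat = h * w * c
    acts['flatten'] = (batch_size, flat)
    weights['fc1'] = (flat, 1024)
    acts['fc1'] = (batch_size, 1024)
    weights['output_logits'] = (1024, 1)
    acts['output_logits'] = (batch_size, 1)
    return acts, weights


acts_1, weights_1 = layer_shapes(batch_size=1)
acts_64, weights_64 = layer_shapes(batch_size=64)
same_weights = weights_1 == weights_64  # weight shapes ignore the batch size
```

Under these assumptions the trained variables are identical whether you feed 1 image or 64, which is why a graph whose input has a dynamic outer dimension can predict on any batch size.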

(Note that other serving systems, such as CloudML Engine, do not operate on Example objects, but the principles are the same).

rhaertel80

I didn't use a serving_input_receiver, but I froze the graph by using `graph_util.convert_variables_to_constants()` and then calling `with tf.gfile.GFile(output_path, "wb") as f: f.write(output_graph_def.SerializeToString())`. But I guess this also freezes the minibatch size into the graph. So just to be clear: will I need some kind of serving backend instead of the standard TensorFlow library to execute a trained model? Since my model is really small and should execute very fast, I am looking for a rather lean solution. – openloop Sep 18 '17 at 08:01
I'm not sure whether or not freezing the graph freezes the batch size into the graph; offhand, I'd say it should not. Note that you can execute a trained model relatively simply without a server backend. I'd recommend exporting your model as a SavedModel (you can still freeze your graph). That gives you better portability and arguably better libraries for inference (plus the option of easily graduating to a more advanced serving solution if necessary at a later date). See https://stackoverflow.com/a/46139198/1399222 for how to do prediction with a SavedModel. – rhaertel80 Sep 18 '17 at 14:04
If you don't want to use SavedModel, the basic idea is to use `tensorflow.python.training.saver.import_meta_graph`; then you can use `session.run` to feed the inputs and fetch the outputs. The trick is getting the correct names of the input and output tensors. – rhaertel80 Sep 18 '17 at 14:09
The linked thread really did help me with my problem. Prediction is also quite a lot faster now. It wasn't my goal to build production-grade serving but to find a simple way to run my network locally. Thank you nevertheless. – openloop Sep 19 '17 at 08:16
Glad to hear it! – rhaertel80 Sep 20 '17 at 01:59

score 0 · Accepted Answer · answered Sep 22 '17 at 13:54

I just want to briefly share the solution I found. I did not want to build a scalable, production-grade model, just a simple model runner in Python to execute my CNN locally.

To export the model I used:

input_size = 900

def serving_input_receiver_fn():
    inputs = {"x": tf.placeholder(shape=[None, input_size], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

model.export_savedmodel(
    export_dir_base=model_dir,
    serving_input_receiver_fn=serving_input_receiver_fn)

To load and run it (without needing the model definition again) I used the TensorFlow predictor module (tf.contrib.predictor).

import numpy as np
from tensorflow.contrib import predictor

class TFRunner:
    """ runs a CNN exported as a SavedModel """
    def __init__(self, model_dir):
        self.predictor = predictor.from_saved_model(model_dir)

    def run(self, input_list):
        """ runs the input list through the graph, returns output """
        if not input_list:
            return []
        # np.vstack also covers a single input: it yields shape [1, input_size],
        # which matches the [None, input_size] placeholder used for export
        inputs = np.vstack(input_list)
        return self.predictor({"x": inputs})
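As a rough illustration of why the [None, input_size] placeholder works with the reshape in cnn_layout, the following plain-Python sketch mimics how a single -1 in a reshape target is resolved from the number of elements actually fed. The helper resolve_reshape is hypothetical and written only for this example; it is not a TensorFlow function.

```python
from functools import reduce
from operator import mul


def resolve_reshape(input_shape, target_shape):
    """Fill in the single -1 in target_shape from the total element count."""
    target = list(target_shape)
    if target.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    total = reduce(mul, input_shape, 1)
    known = reduce(mul, (d for d in target if d != -1), 1)
    if -1 not in target:
        if total != known:
            raise ValueError("element counts differ")
        return target
    if known == 0 or total % known != 0:
        raise ValueError("input size is not a multiple of the fixed dimensions")
    return [total // known if d == -1 else d for d in target]


# 900 = 30 * 30 * 1, so N flat rows become N images of shape 30x30x1.
single = resolve_reshape([1, 900], [-1, 30, 30, 1])
batch = resolve_reshape([7, 900], [-1, 30, 30, 1])
```

Because the -1 is resolved per call, the same exported graph accepts one row or many, as long as each row has input_size = 900 values.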
