
I am trying to replicate the structure used in the TensorBoard MNIST example from the recent 2017 Dev Summit (code found here). In it, feed_dicts are used to alternate between the training and validation sets; however, it uses the rather opaque mnist.train.next_batch, which makes it really difficult to base your own iteration on.

Admittedly, this may also be because I'm struggling to understand the queueing implementation in TensorFlow; explicit examples seem to be in short supply, especially for TF > v1.0.

I've made my own attempt at an image-classifying CNN based on various examples I stumbled across. Originally I had it working with just the training data by storing the data in pre-loaded variables (it's a small data set). I assumed it would be easier to get the train/valid swap working by feeding data from filenames, so I tried to change it over to that.

Between changing the format and trying to implement the feed_dict train/valid structure, I get the following error:

"You must feed a value for placeholder tensor 'input/Placeholder_2' with dtype string"

Any tips on how to get it working, or further explanation of how slice_input_producer/train.batch/QueueRunner actually work together, would be a great help; I have found the TensorFlow tutorials lacking when it comes to explaining the basic workflow between them.

I have a feeling I have the train.batch call in completely the wrong spot, and that it should probably be inside the feed_dict function, but no idea otherwise. Thanks!
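For reference, here is my understanding of how the pieces fit together when the inputs are constants rather than placeholders (the file paths and labels below are made up, purely to illustrate the flow):

import tensorflow as tf

# Toy stand-in data (illustrative only; these files don't exist)
filenames = ['/tmp/img0.jpg', '/tmp/img1.jpg', '/tmp/img2.jpg']
onehots = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]

# slice_input_producer builds a queue that emits one (filename, label)
# pair at a time; a QueueRunner fills it in the background.
image_path, label = tf.train.slice_input_producer(
    [tf.constant(filenames), tf.constant(onehots)])

image = tf.image.decode_jpeg(tf.read_file(image_path), channels=1)
image = tf.cast(image, tf.float32)
image.set_shape([216, 216, 1])

# tf.train.batch adds a second queue (and QueueRunner) that groups
# single examples into batches.
images, labels = tf.train.batch([image, label], batch_size=5)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    # start_queue_runners launches the threads that feed both queues;
    # without it, sess.run(images) blocks forever.
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    img_batch, lbl_batch = sess.run([images, labels])
    coord.request_stop()
    coord.join(threads)

That much I can follow; the part I can't work out is keeping this structure while swapping between two datasets at run time. Anyway, here is my actual code: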

import numpy as np
import tensorflow as tf
from tensorflow.python.framework import dtypes

# Input - 216x216x1 images; ~900 training images, ~350 validation
# Want to do batches of 5 for training, 20 for validation

learn_rate = .0001
drop_keep = 0.9
train_batch = 5
test_batch = 20
epochs = 1
iterations = int((885/train_batch) * epochs)        

#
#
# A BUNCH OF (graph-building) HELPER DEFINITIONS EXCLUDED FOR BREVITY
#
#




#x_init will be fed a list of .jpg filenames (ex: ['/file0.jpg', '/file1.jpg', ...])
#y_init will be fed an array of one-hot classes (ex: [[0,1,0], [1,0,0], ...])

sess = tf.InteractiveSession()

with tf.name_scope('input'):
    batch_size = tf.placeholder(tf.int32)
    keep_prob = tf.placeholder(tf.float32)
    x_init = tf.placeholder(dtype=tf.string, shape=(None))
    y_init = tf.placeholder(dtype=np.int32, shape=(None,3)) #3 classes

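    # slice_input_producer queues one (filename, label) pair at a time;
    # its QueueRunner only starts filling once start_queue_runners is called.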
    image, label = tf.train.slice_input_producer([x_init, y_init])
    file = tf.read_file(image)
    image = tf.image.decode_jpeg(file, channels=1)
    image = tf.cast(image, tf.float32)
    image.set_shape([216,216,1])
    label = tf.cast(label, tf.int32)
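    # tf.train.batch adds a second queue (with its own QueueRunner) that
    # groups single examples into batch_size-sized batches.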
    images, labels = tf.train.batch([image, label], batch_size=batch_size)



conv1 = conv_layer(images, [5,5,1], 40, 'conv1')
#
#
# skip the rest of graph defining/functions (merged,train_step)
# very similar to what is found in the MNIST example.
#
#
tf.summary.scalar('accuracy', accuracy)
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(OUTPUT_LOC + '/train',sess.graph)
test_writer = tf.summary.FileWriter(OUTPUT_LOC + '/test')

sess.run(tf.global_variables_initializer())



#xTrain, yTrain, xTest, yTest are the train/valid images/labels lists
def feed_dict(train=True):
    if train:
        batch = train_batch
        keep = drop_keep
        xval = xTrain
        yval = yTrain
    else:
        batch = test_batch
        keep = 1
        xval = xTest
        yval = yTest
    return({x_init:xval, y_init:yval, batch_size:batch, keep_prob:keep})



#Everything works up to this point; the error appears as soon as I start the queue-runner threads below.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess,coord=coord)



#Don't know what works here or what doesn't.
for i in range(iterations):
    if i % 10 == 0:
        summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
        test_writer.add_summary(summary, i)
        print('Accuracy at step %s: %s' % (i, acc))
    else:
        if i % 100 == 99:
            run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
            run_metadata = tf.RunMetadata()
            summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True), options=run_options, run_metadata=run_metadata)
            train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
            train_writer.add_summary(summary, i)
            print('Adding run metadata for', i)
        else:  # Record a summary
            summary, _ = sess.run([merged, train_step],feed_dict=feed_dict(True))
            train_writer.add_summary(summary, i)
coord.request_stop()
coord.join(threads)
train_writer.close()
test_writer.close()
sess.close()
  • Are you trying to build an input pipeline, as shown here: http://stackoverflow.com/questions/37126108/how-to-read-data-into-tensorflow-batches-from-example-queue?rq=1 – Harsha Pokkalla May 10 '17 at 19:52
  • The basic pipeline itself is definitely one aspect of the puzzle, but I think I also have the additional challenge of changing between two datasets...the training and validation set. I'll look over this in the meantime though and see if it helps, thanks! =) – Wanna-be Coder May 10 '17 at 20:00
  • Usually, I build parallel input-pipeline graphs for training and validation. The model is shared, since we are using feed_dict: you can do sess.run(test_batch) and feed the result into the model via feed_dict (a rough sketch of this pattern follows after these comments). – Harsha Pokkalla May 10 '17 at 20:05
  • I appreciate your help, but I think I need to understand the basics first, and I think that part will follow more easily. Your link helps a little, but still hasn't clarified how...or maybe more importantly where...the queue is incorporated. Should I think of the queueing/batching as a separate entity/function from the graph itself? The way I have it now, the batching is encoded into the graph as part of the "input", and I keep putting the original whole dataset in, which I know has to be wrong. I'm not sure WHERE to put the queue and do the batching within the session. – Wanna-be Coder May 11 '17 at 13:02
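A minimal sketch of the parallel-pipeline pattern described in the comments (the build_pipeline helper, file lists, and labels here are hypothetical, purely for illustration):

import tensorflow as tf

def build_pipeline(filenames, onehots, batch_size):
    # Hypothetical helper: one independent queue pipeline per dataset.
    path, label = tf.train.slice_input_producer(
        [tf.constant(filenames), tf.constant(onehots)])
    image = tf.image.decode_jpeg(tf.read_file(path), channels=1)
    image = tf.cast(image, tf.float32)
    image.set_shape([216, 216, 1])
    return tf.train.batch([image, label], batch_size=batch_size)

# Stand-in file lists and one-hot labels (illustrative only)
train_files, train_hot = ['/tmp/t0.jpg', '/tmp/t1.jpg'], [[0, 1, 0], [1, 0, 0]]
test_files, test_hot = ['/tmp/v0.jpg'], [[0, 0, 1]]

# Two pipelines, each with its own QueueRunners
train_images, train_labels = build_pipeline(train_files, train_hot, 5)
test_images, test_labels = build_pipeline(test_files, test_hot, 20)

# The shared model reads from placeholders, so either pipeline can feed it
x = tf.placeholder(tf.float32, [None, 216, 216, 1])
y = tf.placeholder(tf.int32, [None, 3])
# ... build the model on x and y ...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    # Pull a batch out of whichever pipeline, then feed it to the model:
    xs, ys = sess.run([test_images, test_labels])
    # e.g. sess.run(accuracy, feed_dict={x: xs, y: ys})
    coord.request_stop()
    coord.join(threads)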
