
I was going through the CIFAR-10 example in the TensorFlow getting-started guide for CNNs.

Now, in the train() function in cifar10_train.py, we get images as

images,labels = cifar10.distorted_inputs()

In the distorted_inputs() function, we generate the filenames in a queue and then read a single record:

 # Create a queue that produces the filenames to read.
 filename_queue = tf.train.string_input_producer(filenames)

 # Read examples from files in the filename queue.
 read_input = cifar10_input.read_cifar10(filename_queue)
 reshaped_image = tf.cast(read_input.uint8image, tf.float32)

When I add debugging code, the read_input variable contains only one record, with an image and its height, width, and label.

The example then applies some distortion to the read image/record and then passes it to the _generate_image_and_label_batch() function.

This function then returns a 4D Tensor of shape [batch_size, 32, 32, 3] where batch_size = 128.

The above function uses the tf.train.shuffle_batch() function, which returns the batch.

My question is: where do the extra records come from in the tf.train.shuffle_batch() function? We are not passing it any filename or reader object.

Can someone shed some light on how we go from one record to 128 records? I looked into the documentation but didn't understand it.

Engineero
t0mkaka

1 Answer


The tf.train.shuffle_batch() function can be used to produce (one or more) tensors containing a batch of inputs. Internally, tf.train.shuffle_batch() creates a tf.RandomShuffleQueue, on which it calls q.enqueue() with the image and label tensors to enqueue a single element (image-label pair). It then returns the result of q.dequeue_many(batch_size), which concatenates batch_size randomly selected elements (image-label pairs) into a batch of images and a batch of labels.
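The mechanics described above can be modeled with a toy, pure-Python class (this is a hypothetical sketch of the queue's behavior, not TensorFlow's actual implementation): single elements go in one at a time, and `dequeue_many(batch_size)` pulls out a random selection as a batch.

```python
import random

class RandomShuffleQueueSketch:
    """Toy model of tf.RandomShuffleQueue: elements are enqueued one
    at a time, and dequeue_many returns a random batch of them."""

    def __init__(self, capacity, min_after_dequeue):
        self.capacity = capacity
        self.min_after_dequeue = min_after_dequeue
        self._elements = []

    def enqueue(self, element):
        # A single (image, label) pair is added per call -- this is what
        # the background thread does repeatedly with the image and label
        # tensors produced by the reading/distortion pipeline.
        assert len(self._elements) < self.capacity
        self._elements.append(element)

    def dequeue_many(self, batch_size):
        # Remove batch_size randomly selected elements and return them
        # together as one batch.
        random.shuffle(self._elements)
        batch = self._elements[:batch_size]
        self._elements = self._elements[batch_size:]
        return batch

q = RandomShuffleQueueSketch(capacity=512, min_after_dequeue=128)
for i in range(256):                  # single records trickle in...
    q.enqueue(("image_%d" % i, i % 10))
batch = q.dequeue_many(128)           # ...and training pulls a 128-element batch
print(len(batch))                     # 128
```

In real TensorFlow the dequeued elements are additionally concatenated along a new leading dimension, which is how 128 individual [32, 32, 3] images become one [128, 32, 32, 3] tensor.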

Note that, although it looks from the code like read_input and filename_queue have a functional relationship, there is an additional wrinkle. Simply evaluating the result of tf.train.shuffle_batch() will block forever, because no elements have been added to the internal queue. To simplify this, when you call tf.train.shuffle_batch(), TensorFlow will add a QueueRunner to an internal collection in the graph. A later call to tf.train.start_queue_runners() (e.g. here in cifar10_train.py) will start a thread that adds elements to the queue, and enables training to proceed. The Threading and Queues HOWTO has more information on how this works.
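The role of the queue runner can be sketched with plain Python threads and a `queue.Queue` standing in for the internal example queue (a hypothetical model, not TensorFlow code): until the background thread starts, any attempt to dequeue would block forever, just as evaluating the batch tensor blocks before `tf.train.start_queue_runners()` is called.

```python
import queue
import threading

# Hypothetical stand-in for the internal example queue.
example_queue = queue.Queue(maxsize=16)

def queue_runner():
    """Plays the role of the thread that tf.train.start_queue_runners()
    launches: it keeps enqueuing records so dequeues never block."""
    record = 0
    while True:
        example_queue.put(("image", record))  # like sess.run(enqueue_op)
        record += 1

# Before this thread starts, example_queue.get() would block forever --
# the same reason evaluating the result of tf.train.shuffle_batch()
# blocks until the queue runners are started.
threading.Thread(target=queue_runner, daemon=True).start()

# Now the "training loop" can dequeue as many records as it needs.
batch = [example_queue.get() for _ in range(128)]
print(len(batch))  # 128
```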

mrry
  • Thanks, this cleared things up a lot. So the queuing works like this: you first create a flow of how things will go, and then you just say GO to the threads, and they start running, fetching, and working on the data, whether from filenames, records, or anywhere else. Am I thinking about this right? – t0mkaka Jan 05 '16 at 04:31
  • That's pretty much all there is to it. (The queue runners also have some support for shutting things down cleanly, by propagating `close()` calls when a `tf.OutOfRangeError` is raised.) – mrry Jan 05 '16 at 04:34
  • This is a great topic, but I'm still confused. How does tf.train.start_queue_runners() or tf.train.shuffle_batch() know how to read the files in the queue and how to transform them? It does not seem that the distorted_inputs() function ever gets called again - only once to generate an example image. I ask because I modified the code to read from PNG files and a .txt with labels. It seems to read the images sequentially, but only reads the first label. – Rob Mar 15 '16 at 03:43
  • 4
    The `distorted_inputs()` function is a little subtle: it returns two symbolic tensors (`images` and `labels` that take a different value each time they are evaluated). Therefore, although the function is only called once, if you run multiple steps (for example, by calling `sess.run([images, labels])`, or by passing them to an op that uses a queue runner like `tf.train.shuffle_batch()`) they will fetch subsequent records from the file(s). – mrry Mar 15 '16 at 05:52
  • How exactly do they take different values, and how is it that the transformations get applied each time? I have put the labels in a different file, how can I get the file incorporated into this process? – Rob Mar 16 '16 at 03:14
  • 1
    The [reading from files HOWTO](https://www.tensorflow.org/versions/r0.7/how_tos/reading_data/index.html#reading-from-files) has some details about how preprocessing works. The support for joining two files is somewhat limited, but you might be able to achieve what you need with the [`tf.train.shuffle_batch_join()`](https://www.tensorflow.org/versions/r0.7/api_docs/python/io_ops.html#shuffle_batch_join) function. – mrry Mar 16 '16 at 04:55
  • I found an article you wrote very useful: http://stackoverflow.com/questions/34340489/tensorflow-read-images-with-labels Thank you very much for all your contributions to the community. Just to clarify: any operation that I perform after tf.train.string_input_producer, such as tf.image.decode_png and tf.random_crop, up until tf.train.shuffle_batch, will be repeated on every sess.run? It seems strange that we are not explicitly telling TensorFlow to run each of these operations in sequence. – Rob Mar 16 '16 at 19:14
  • 1
    It's slightly more confusing than that - when you start the "queue runners", additional threads will be created that call `sess.run(enqueue_op)` for each of the queue runners. It's *these* calls to `sess.run()` (and *not* the ones in your own code) that cause the ops up to `tf.train.shuffle_batch()` to execute. Sorry if this sounds complicated: we're trying to find ways to simplify all of this! – mrry Mar 16 '16 at 20:54
  • 1
    Ok it's starting to make more sense, I successfully set up training and testing last night! (with 98% accuracy!). I'm sure it's difficult to balance performance and ease of use. So far tensorflow is miles above any of the other DNN frameworks out there, thank you for your hard work. – Rob Mar 17 '16 at 14:11