I was going through the CIFAR-10 example in the TensorFlow getting started guide for CNNs.
In the train function in cifar10_train.py, we get the images as
images, labels = cifar10.distorted_inputs()
In the distorted_inputs() function, we generate the filenames in a queue and then read a single record as
# Create a queue that produces the filenames to read.
filename_queue = tf.train.string_input_producer(filenames)
# Read examples from files in the filename queue.
read_input = cifar10_input.read_cifar10(filename_queue)
reshaped_image = tf.cast(read_input.uint8image, tf.float32)
When I add debugging code, the read_input variable contains only one record, with an image and its height, width, and label.
The example then applies some distortion to that image/record and passes it to the _generate_image_and_label_batch() function.
This function then returns a 4D Tensor of shape [batch_size, 32, 32, 3], where batch_size = 128.
That function uses tf.train.shuffle_batch() to build and return the batch.
My question is: where do the extra records come from in tf.train.shuffle_batch()? We are not passing it any filename or reader object.
Can someone shed some light on how we go from 1 record to 128 records? I looked through the documentation but did not understand it.
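To make the question concrete, here is how I currently picture it, as a plain-Python sketch rather than real TensorFlow code. It assumes that the single-record ops are not run once but are re-run repeatedly by background threads, with each run enqueuing the next record, and that the batch is then drawn from that queue. The names record_stream and fill_queue and the buffer sizes are hypothetical, made up for illustration only.

```python
import queue
import random
import threading

BATCH_SIZE = 128
MIN_AFTER_DEQUEUE = 256  # hypothetical value, chosen only for this sketch


def record_stream():
    """Hypothetical stand-in for the reader: one (image, label) record per step."""
    i = 0
    while True:
        yield ("image_%d" % i, i % 10)
        i += 1


def fill_queue(records, q, stop):
    """Background worker: keeps reading single records and enqueuing them."""
    for rec in records:
        while not stop.is_set():
            try:
                q.put(rec, timeout=0.1)
                break
            except queue.Full:
                continue
        if stop.is_set():
            return


def shuffle_batch(q, buffer, batch_size, min_after_dequeue):
    """Top up a shuffle buffer from the queue, then draw batch_size records."""
    while len(buffer) < min_after_dequeue + batch_size:
        buffer.append(q.get())
    random.shuffle(buffer)
    return [buffer.pop() for _ in range(batch_size)]


q = queue.Queue(maxsize=512)
stop = threading.Event()
worker = threading.Thread(
    target=fill_queue, args=(record_stream(), q, stop), daemon=True
)
worker.start()

buffer = []
batch = shuffle_batch(q, buffer, BATCH_SIZE, MIN_AFTER_DEQUEUE)

# Shut the background worker down cleanly.
stop.set()
worker.join()

print(len(batch))  # 128
```

If this picture is right, read_input would be less "one record" and more "the recipe for reading the next record", which is what I would like someone to confirm or correct.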