
I'm trying to modify the Advanced Convolutional Neural Networks tutorial of TensorFlow, which originally uses the CIFAR-10 dataset (https://www.tensorflow.org/tutorials/images/deep_cnn), to work with unsigned 16-bit integer data. I want to read the data from a binary format, similar to the CIFAR-10 dataset.

The original code reads unsigned 8-bit integer data from a binary format. I managed to create 8-bit samples from my own dataset with various depths (the result.depth variable below), and training worked without any problem.

But my own data is an unsigned 16-bit integer dataset. It would lose important information if I resampled it to 8 bit, so I would like to feed the unsigned 16-bit data into the graph for training.

The first 2 bytes of each record contain the label and the rest (height x width x depth x 2 bytes) contains the image data. I use this method to write the binary file to disk:

out = np.array(outp, dtype=np.uint16)  # outp contains the data
out.tofile("d:\\TF\\my_databatch_0.bin")
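
For reference, outp in the snippet above is assembled record by record: a 2-byte label followed by the raw pixel values. A minimal sketch with made-up placeholder data (the shapes and values here are just for illustration and match the reader below):

import numpy as np

# Placeholder data standing in for my real dataset (32 x 32 pixels, depth 1, uint16 values).
num_records, height, width, depth = 10, 32, 32, 1
labels = np.random.randint(0, 10, size=num_records).astype(np.uint16)
images = np.random.randint(0, 2**16, size=(num_records, height, width, depth)).astype(np.uint16)

# One record = 2-byte label followed by height * width * depth uint16 pixel values.
records = [np.concatenate(([label], img.ravel())) for label, img in zip(labels, images)]
outp = np.concatenate(records)

out = np.array(outp, dtype=np.uint16)
out.tofile("d:\\TF\\my_databatch_0.bin")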

This part seems to be OK. If I read it back into memory with this:

data_in = np.fromfile("d:\\TF\\my_databatch_0.bin", dtype=np.uint16)  # 'in' is a reserved word, so use another name
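
Comparing the two arrays confirms that the write/read round trip is lossless (out is the array written above):

assert np.array_equal(out, data_in)  # the array read back matches what was written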

I get back exactly the same array that was written to disk. But when I try to feed the data into the graph through the modified input function below, it fails at a later step:

import tensorflow as tf
import numpy as np

##########################Reader for unsigned 16 bit numpy array###############
def read_cifar10(filename_queue):
  class CIFAR10Record(object):
    pass
  result = CIFAR10Record()

  label_bytes = 2
  result.height = 32
  result.width = 32
  result.depth = 1
  result.bitdepth = 2  # 2 if 16-bit, 1 if 8-bit
  image_bytes = result.height * result.width * result.depth * result.bitdepth
  record_bytes_ = label_bytes + image_bytes

  reader = tf.FixedLengthRecordReader(record_bytes=record_bytes_)
  result.key, value = reader.read(filename_queue)
  record_bytes__ = tf.decode_raw(value, tf.uint16, little_endian=True)

  result.label = tf.strided_slice(record_bytes__, [0], [label_bytes])
  data_img = tf.strided_slice(record_bytes__, [label_bytes], [label_bytes + image_bytes])
  depth_major = tf.reshape(data_img, [result.depth, result.height, result.width])
  result.uint16image = tf.transpose(depth_major, [1, 2, 0])

  return result

I made modifications to the network architecture to accept data with various depths (in the sample code above, result.depth is set to 1). With my 8-bit sample this worked. This is the error I get with the 16-bit data:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Input to reshape is a tensor with 1023 values, but the requested shape has 1024
     [[Node: data_augmentation/Reshape = Reshape[T=DT_UINT16, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/device:CPU:0"](data_augmentation/StridedSlice_1, data_augmentation/Reshape/shape)]] 

It looks like I lose 1 or 2 bytes of my data during the read or the data augmentation part of the graph: my tensor is missing its last element.

This is the original input file, which I'm trying to modify: https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_input.py

I'm using Visual Studio 2017 with Python 3.5.2 on Windows 10. The TensorFlow version is 1.9.0.


1 Answer


I came up with a solution that uses the TFRecords format (https://www.tensorflow.org/api_guides/python/python_io#tfrecords_format_details).

The solution was inspired by this thread: How do I convert a directory of jpeg images to TFRecords file in tensorflow?

The creation of the TFRecords dataset from the 4-D image array ([IMG_ID][y][x][BAND_ID]) and the 1-D label array was done with the convert_to(images, labels, name) function cited in the accepted answer of that thread.
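
For reference, a writer along the lines of that convert_to() function, adapted to uint16 data, can look like this (a sketch, not the exact code from the linked answer; the filename handling is an assumption):

import numpy as np
import tensorflow as tf

def convert_to(images, labels, name):
  """Write a 4-D uint16 image array ([IMG_ID][y][x][BAND_ID]) and a 1-D label
  array to a TFRecords file named <name>.tfrecords."""
  num_examples = labels.shape[0]
  if images.shape[0] != num_examples:
    raise ValueError('Images size %d does not match label size %d.' %
                     (images.shape[0], num_examples))

  filename = name + '.tfrecords'
  with tf.python_io.TFRecordWriter(filename) as writer:
    for index in range(num_examples):
      # Raw uint16 bytes; the reader below restores them with tf.decode_raw(..., tf.uint16).
      image_raw = images[index].astype(np.uint16).tostring()
      example = tf.train.Example(features=tf.train.Features(feature={
          'label': tf.train.Feature(
              int64_list=tf.train.Int64List(value=[int(labels[index])])),
          'image_raw': tf.train.Feature(
              bytes_list=tf.train.BytesList(value=[image_raw]))}))
      writer.write(example.SerializeToString())

With result.depth == 1, as in the reader below, the row-major layout written here and the reader's [depth, height, width] reshape describe the same bytes; for deeper images the two layouts would need to be kept in agreement.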

The reading function was modified according to the more recent TensorFlow API:

def read_TFrecords(filename_queue):
  class CIFAR10Record(object):
    pass
  result = CIFAR10Record()

  item_type = tf.uint16
  label_bytes = item_type.size  # size of one uint16 item in bytes (2)
  result.height = 32
  result.width = 32
  result.depth = 1

  reader = tf.TFRecordReader()
  _, serialized_example = reader.read(filename_queue)
  features = tf.parse_single_example(
      serialized_example,
      features={
          'image_raw': tf.FixedLenFeature([], tf.string),
          'label': tf.FixedLenFeature([], tf.int64),
      })

  # Decode the raw bytes back into a uint16 tensor and restore the image shape.
  image = tf.decode_raw(features['image_raw'], item_type)

  image = tf.reshape(image, [result.depth, result.height, result.width])
  image.set_shape([result.depth, result.height, result.width])
  result.uint16image = tf.transpose(image, [1, 2, 0])  # tf.cast(image, tf.float32)
  result.label = tf.cast(features['label'], tf.int32)

  return result
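
A minimal stand-alone check of this reader can look like the sketch below (the .tfrecords filename is an assumption; in the tutorial the first three lines live inside distorted_inputs()):

import tensorflow as tf

filename_queue = tf.train.string_input_producer(["d:\\TF\\my_databatch_0.tfrecords"])
read_input = read_TFrecords(filename_queue)
image = tf.cast(read_input.uint16image, tf.float32)  # cast before the distortion/batching steps

with tf.Session() as sess:
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(sess=sess, coord=coord)
  img, lbl = sess.run([image, read_input.label])
  print(img.shape, lbl)  # expected: (32, 32, 1) and an int32 label
  coord.request_stop()
  coord.join(threads)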

In the input file of the Advanced Convolutional Neural Networks tutorial of TensorFlow (https://github.com/tensorflow/models/blob/master/tutorials/image/cifar10/cifar10_input.py), a further modification needs to be made: remove the following line from the distorted_inputs(data_dir, batch_size) function:

read_input.label.set_shape([1])

since the label already comes out of the reader with a fully defined shape.

This way I could feed my own unsigned 16-bit integer data into the graph from the file queue.
