I have data in a TensorFlow record file (train.record), and I seem to be able to read that data. I want to do something simple: just display the (PNG-encoded) image for a given example. But I can't get the image as a NumPy array and simply show it. The data are in there, so how hard can it be to pull them out and show them? I imagine I am missing something really obvious.

height = 700 # Image height
width = 500 # Image width

file_path = r'/home/train.record'
with tf.Session() as sess:
    feature = {'image/encoded': tf.FixedLenFeature([], tf.string),
               'image/object/class/label': tf.FixedLenFeature([], tf.int64)}
    filename_queue = tf.train.string_input_producer([file_path], num_epochs=1)
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)
    parsed_example = tf.parse_single_example(serialized_example, features=feature)
    image_raw = parsed_example['image/encoded']
    image = tf.decode_raw(image_raw, tf.uint8)
    image = tf.cast(image, tf.float32)
    image = tf.reshape(image, (height, width))

This seems to have extracted an image from train.record, with the right dimensions, but it is of type tensorflow.python.framework.ops.Tensor, and when I try to plot it with something like:

cv2.imshow("image", image)

I just get an error: TypeError: Expected cv::UMat for argument 'mat'.

I have tried using eval, as recommended at a link below:

array = image.eval(session = sess)

But it did not work. The program just hangs at that point (for instance if I put it after the last line above).

More generally, it seems I am just missing something, for even when I try to get the class label:

label = parsed_example['image/object/class/label']

I get the same thing: not the value, but an object of type tensorflow.python.framework.ops.Tensor. I can literally see the value is there when I type the name in my ipython notebook, but am not sure how to access it as an int (or whatever).

Note I tried this, which has some methods that seem to directly convert to a numpy array but they did not work: https://github.com/yinguobing/tfrecord_utility/blob/master/view_record.py

I just got an error saying that a Tensor object has no numpy method.

Note I am using tensorflow 1.13, Python 3.7, working in Ubuntu 18. I get the same results whether I run from Spyder or the command line.

Related questions
- How to print the value of a Tensor object in TensorFlow?
- https://github.com/aymericdamien/TensorFlow-Examples/issues/40

eric

3 Answers

import tensorflow as tf


with tf.Session() as sess:
  r  = tf.random.uniform([10, 10])
  print(type(r))
  # <class 'tensorflow.python.framework.ops.Tensor'>
  a = r.eval()
  print(type(a))
  # <class 'numpy.ndarray'>

I could not reproduce your exact case, but you need to evaluate the Tensor to get a NumPy ndarray. As far as I understand, this is not an issue with TFRecord. Colab link for the code.

Sıddık Açıl
  • This is a helpful answer as it shows, in a general way, how to use eval() to extract a value from a tf tensor type. I ended up using the resources in rvinas's answer to reconstruct the original image, but this will be helpful for pulling other values like width/height etc. – eric Jul 08 '19 at 13:07

To visualize a single image from the TFRecord file, you could do something along the lines of:

import tensorflow as tf
import matplotlib.pyplot as plt

def parse_fn(data_record):
    feature = {'image/encoded': tf.FixedLenFeature([], tf.string),
               'image/object/class/label': tf.FixedLenFeature([], tf.int64)}
    sample = tf.parse_single_example(data_record, feature)
    return sample

file_path = r'/home/train.record'
dataset = tf.data.TFRecordDataset([file_path])
record_iterator = dataset.make_one_shot_iterator().get_next()

with tf.Session() as sess:
    # Read and parse record
    parsed_example = parse_fn(record_iterator)

    # Decode image and get numpy array
    encoded_image = parsed_example['image/encoded']
    decoded_image = tf.image.decode_jpeg(encoded_image, channels=3)
    image_np = sess.run(decoded_image)

    # Display image
    plt.imshow(image_np)
    plt.show()

This assumes that the image is JPEG-encoded. You should use the appropriate decoding function (e.g. for PNG images, use tf.image.decode_png).

NOTE: Not tested.
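
As a rough illustration of why a decoder is needed here: an encoded PNG is a compressed container, so its byte count generally does not match height × width, and reinterpreting those bytes directly (as `tf.decode_raw` followed by a reshape would) cannot recover the pixels. The sketch below uses only the Python standard library and no TensorFlow. `make_chunk` and `encode_gray_png` are hypothetical helpers written for this example, and the all-black 700 × 500 image is an assumed stand-in for a real record.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(kind, payload):
    """Build one PNG chunk: length, type, payload, CRC over type + payload."""
    body = kind + payload
    return struct.pack(">I", len(payload)) + body + struct.pack(">I", zlib.crc32(body))


def encode_gray_png(pixels, width, height):
    """Encode 8-bit greyscale pixels (row-major bytes) as a minimal PNG."""
    # Each scanline is prefixed with a filter-type byte (0 = no filter).
    rows = b"".join(
        b"\x00" + pixels[r * width:(r + 1) * width] for r in range(height)
    )
    # IHDR: width, height, bit depth 8, colour type 0 (greyscale),
    # compression 0, filter 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    return (
        PNG_SIGNATURE
        + make_chunk(b"IHDR", ihdr)
        + make_chunk(b"IDAT", zlib.compress(rows))
        + make_chunk(b"IEND", b"")
    )


height, width = 700, 500            # assumed image size, as in the question
pixels = bytes(height * width)      # all-black placeholder image
png_bytes = encode_gray_png(pixels, width, height)

# The encoded length differs from the pixel count, so a raw reshape to
# (height, width) cannot work on the encoded bytes.
print(len(png_bytes), "encoded bytes vs", height * width, "pixels")
```

The same reasoning applies to JPEG: the string stored under `image/encoded` has to go through the matching `tf.image.decode_*` op before it has an image shape.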

rvinas
  • This works great! Since I am working with greyscale images I used `channels=0` and had to squeeze out the last dimension. My one worry is that `tf_record_iterator` is deprecated and `TFRecordDataset` is now recommended, but for now this seems good. – eric Jul 08 '19 at 13:10
  • I'm glad I could help :) I updated my answer to use `tf.data.TFRecordDataset` instead. – rvinas Jul 08 '19 at 13:16
  • Oh very cool, though one issue: the line `next(record_iterator)` throws `TypeError: 'Tensor' object is not an iterator`. – eric Jul 08 '19 at 13:36
  • Note that I changed this in my updated code: `parsed_example = parse_fn(record_iterator)` – rvinas Jul 08 '19 at 13:41
  • Oops -- fixed. This seems good, though now when I try to pull additional data from that record (e.g., `width = parsed_example['image/width']`), it seems to be moving on to the next step in the iterator, so for instance I get the width from the next record in the iterator, not from the image just plotted. Or I get an error if I try to pull all the info from a tfrecord that has info about just a single image (example: http://pasted.co/2df8ddf5). – eric Jul 08 '19 at 14:25
  • This happens because you call `eval` multiple times. To get all the features, just do: `image_np, width, height = sess.run([decoded_image, parsed_example['image/width'], parsed_example['image/height']])` – rvinas Jul 08 '19 at 14:31
  • Wow thanks I have learned so much from this answer -- I am obviously fairly new to tensorflow and this is helping me understand what is going on... – eric Jul 08 '19 at 14:49
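
The point in the comments above (separate `eval`/`sess.run` calls each pull a new record, while a single call with a list of fetches reads everything from the same record) can be sketched with a plain-Python analogy. This is not TensorFlow code: `make_runner` and `run` are hypothetical stand-ins for a session reading from a one-shot iterator, and the two records are made-up placeholder data.

```python
def make_runner(data):
    """Return a run() function that consumes ONE record per call,
    loosely mimicking sess.run on tensors fed by a dataset iterator."""
    records = iter(data)

    def run(fetches):
        record = next(records)                  # every call advances the iterator
        return [record[name] for name in fetches]

    return run


# Made-up records standing in for parsed TFRecord examples.
data = [
    {"image": "image_0", "width": 500},
    {"image": "image_1", "width": 640},
]

# Two separate calls: the width comes from the NEXT record.
run = make_runner(data)
(image_a,) = run(["image"])
(width_a,) = run(["width"])
print(image_a, width_a)     # image from record 0, width from record 1

# One call with all fetches: both values come from the SAME record.
run = make_runner(data)
image_b, width_b = run(["image", "width"])
print(image_b, width_b)     # image and width both from record 0
```

With a file that holds only one record, the second separate call in this analogy would raise `StopIteration`, which loosely mirrors the error mentioned for the single-image tfrecord.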

Try tfrmaker, a TensorFlow TFRecord utility package. You can install the package with pip:

pip install tfrmaker

Then you could visualize the batches of your dataset like this example:

import os
from tfrmaker import images, display

# mapping label names with integer encoding.
LABELS = {"bishop": 0, "knight": 1, "pawn": 2, "queen": 3, "rook": 4}

# directory contains tfrecords
TFR_DIR = "tfrecords/chess/"

# fetch all tfrecords in the directory to a list
tfr_paths = [os.path.join(TFR_DIR,file) for file in os.listdir(TFR_DIR) if os.fsdecode(file).endswith(".tfrecord")]

# load one or more tfrecords as an iterator object.
dataset = images.load(tfr_paths, batch_size = 32)

# iterate one batch and visualize it along with labels.
databatch = next(iter(dataset))
display.batch(databatch, LABELS)

The package also has some cool features like:

  • dynamic resizing
  • splitting tfrecords into optimal shards
  • splitting tfrecords into training, validation, and testing sets
  • counting the number of images in tfrecords
  • asynchronous tfrecord creation

NOTE: This package currently supports image datasets that are organised as directories with class names as subdirectory names.