
TensorFlow fails pretty hard when something isn't right.

Say we write image file paths to a CSV more or less continuously. How can we check beforehand that a JPEG is valid, so that `decode_jpeg` doesn't fail on it? We could hit a file whose JPEG data hasn't been fully written, or some other failure.

I've tried:

    image_path, label = tf.decode_csv(value, field_delim=" ", record_defaults=record_defaults)
    print(image_path)  # Tensor, not a file...

    file = tf.read_file("some_fake_file.jpg")
    size = tf.size(file)  # <tf.Tensor 'Size:0' shape=() dtype=int32>
    if size is not None:
        print("size is not none")  # prints...

Doing something like

    size = os.path.getsize(image_path)

throws `TypeError: coercing to Unicode: need string or buffer, Tensor found`, since `image_path` is a symbolic Tensor, not a Python string.
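
For what it's worth, the only way I can see to get at a concrete path string is to evaluate the tensor first. A minimal sketch (assuming the usual TF 1.x queue-runner setup around the `decode_csv` call above; this still doesn't validate anything before `decode_jpeg` runs):

    import os
    import tensorflow as tf

    # Sketch only: image_path is the Tensor produced by decode_csv above.
    with tf.Session() as sess:
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)

        path = sess.run(image_path)    # now a concrete string, not a Tensor
        print(os.path.getsize(path))   # works once we have a real path

        coord.request_stop()
        coord.join(threads)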

So the question is: how can we make sure we're dealing with a valid JPEG of the expected size before passing it to `decode_jpeg`, so TensorFlow doesn't fail?

Ideally we would be able to check the size and make sure the jpeg's size equals our full training image size.
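
One idea I'm toying with is to validate each file in plain Python before its path is ever written to the CSV. A rough sketch (assumes Pillow is available; `looks_like_valid_jpeg`, `candidate_paths`, and the dimensions are made-up names/values):

    import os
    from PIL import Image

    def looks_like_valid_jpeg(path, expected_dims=None):
        """Rough pre-check: file exists, is non-empty, parses as a JPEG,
        and (optionally) has the expected (width, height)."""
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            return False
        try:
            img = Image.open(path)
            img.verify()  # raises on truncated/corrupt JPEG data
        except Exception:
            return False
        img = Image.open(path)  # verify() invalidates the handle, so reopen
        return img.format == "JPEG" and (expected_dims is None or img.size == expected_dims)

    # Only write paths that pass the check into the CSV TensorFlow reads, e.g.:
    # good = [p for p in candidate_paths if looks_like_valid_jpeg(p, (224, 224))]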

  • Just clarifying you mean the image file size? Could you not use [`os.stat()`](http://stackoverflow.com/questions/2104080/how-to-check-file-size-in-python)? Sorry if I misunderstand. – aug May 11 '16 at 22:12
  • Edited with more code. `os.stat()` doesn't work on what TF returns. – JohnAllen May 11 '16 at 22:24
  • 1
    I guess the easiest solution is to have another python script that reads from the input streaming csv, cleans it up and then writes to the final csv which tensorflow then reads – fabrizioM May 11 '16 at 23:07

0 Answers