I was trying to perform data augmentation for object detection models in TensorFlow, so I was checking the compatibility of different image representations.
First, I read an image file using PIL (Pillow, to be precise):
import io
import numpy as np
import PIL.Image
import tensorflow as tf

full_path = 'path/to/my/image.jpg'
image = PIL.Image.open(full_path)
image_np = np.array(image)
encoded_jpg_io1 = io.BytesIO(image_np)
Then I read the same file the TensorFlow way (the approach also used to create TFRecords):
with tf.gfile.GFile(full_path, 'rb') as fid:
    encoded_jpg = fid.read()
encoded_jpg_io2 = io.BytesIO(encoded_jpg)
And then I checked the equality of the above operations:
if encoded_jpg_io1 == encoded_jpg_io2:
    print('Equal')
I was expecting those two to be equal, but nothing is printed. Why is that not the case here?
If I compare the underlying bytes instead, I get the same result (still not equal):
v1 = encoded_jpg_io1.getvalue()
v2 = encoded_jpg_io2.getvalue()
if encoded_jpg_io1.getvalue() == encoded_jpg_io2.getvalue():
    print('Equal')
if v1.__eq__(v2):
    print('Equal')
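To separate the two comparisons above, here is a minimal standard-library sketch. The byte string is a made-up stand-in, not my image. As far as I can tell, `io.BytesIO` does not define content-based equality, so `==` on two buffers falls back to object identity, whereas `getvalue()` returns `bytes`, which do compare by content:

```python
import io

# Hypothetical payload standing in for real file contents.
payload = b'same bytes in both buffers'

buffer_a = io.BytesIO(payload)
buffer_b = io.BytesIO(payload)

# Two distinct BytesIO objects: == compares identity, not contents.
buffers_equal = buffer_a == buffer_b

# bytes objects compare by content.
contents_equal = buffer_a.getvalue() == buffer_b.getvalue()

print(buffers_equal, contents_equal)
```

So the first check would presumably never print anything, even for identical data, while the second check fails in my case because the contents themselves differ.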
I need to manipulate my images with NumPy and then create TFRecords from them, so the two representations need to be equal.
Some interesting facts:
1. PIL cannot read the image in np.array format at all:
image1 = PIL.Image.open(encoded_jpg_io1)
OSError: cannot identify image file
While using GFile works fine:
image2 = PIL.Image.open(encoded_jpg_io2)
2. A PIL image cannot be directly converted to BytesIO:
encoded_jpg_io1 = io.BytesIO(image)
TypeError: a bytes-like object is required, not 'JpegImageFile'
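For reference, the route I understand to be the usual way of getting encoded bytes out of a PIL image or a NumPy array is to save into an in-memory buffer with `Image.save(..., format='JPEG')`. Below is a minimal sketch, assuming a small synthetic array as a stand-in for my image. Since JPEG encoding is lossy, I would not expect the re-encoded bytes to match the original file bytes in general:

```python
import io

import numpy as np
from PIL import Image

# Hypothetical stand-in for the image on disk: a small synthetic RGB array.
image_np = np.zeros((8, 8, 3), dtype=np.uint8)
image_np[:, :4] = (255, 0, 0)

# Raw pixel buffer: height * width * channels bytes, with no JPEG header.
raw_bytes = image_np.tobytes()

# Encode the array as JPEG into an in-memory buffer.
buffer = io.BytesIO()
Image.fromarray(image_np).save(buffer, format='JPEG')
encoded_jpg = buffer.getvalue()

# The encoded bytes can be reopened by PIL, unlike the raw pixel buffer.
reopened = Image.open(io.BytesIO(encoded_jpg))
print(len(raw_bytes), reopened.format, reopened.size)
```

This suggests that the array buffer and the file contents are two different things: one is decoded pixels, the other is a compressed JPEG stream.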