
I use PIL.Image.open and tf.image.decode_jpeg to decode image files into arrays, but the pixel values returned by PIL.Image.open() are not the same as those from tf.image.decode_jpeg. Why does this happen?

Thanks!

Code output:

tf 100 100 [132 145 161]
pil 100 100 [134 147 164]

My code:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time

import numpy as np
import tensorflow as tf

def decode_jpeg(image_file):
  from PIL import Image
  im = Image.open(image_file)
  data = np.array(im)
  return data

def tfimageread(filenames):
  filename_queue = tf.train.string_input_producer(filenames)
  reader = tf.WholeFileReader(name='image_reader')
  key, value = reader.read(filename_queue)
  uint8image = tf.image.decode_jpeg(value, channels=3)

  with tf.Session() as sess:
    coord = tf.train.Coordinator()
    threads = []
    for qr in tf.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
      threads.extend(qr.create_threads(sess, coord=coord, daemon=True, start=True))
    image = sess.run(uint8image)
    coord.request_stop()
    coord.join(threads, stop_grace_period_secs=10)
    return image

if __name__ == '__main__':
  image_file = './o_1bchv9keu625336862107241874241888.jpg'
  image_tf = tfimageread([image_file])
  image_pil = decode_jpeg(image_file)
  i, j = 100, 100
  print("tf %d %d %s" % (i, j, image_tf[i][j]))
  print("pil %d %d %s" % (i, j, image_pil[i][j]))
xiusir
  • A wild guess - are you sure both modules use the same coordinate system? – Błotosmętek Jun 13 '17 at 07:39
  • Thank you and Yes, same coord. @Błotosmętek – xiusir Jun 13 '17 at 07:46
  • Maybe a different decompression algorithm in PIL, tf, and opencv for JPEG images? [Same question](https://stackoverflow.com/questions/31607731/opencv-vs-matlab-different-values-on-pixels-with-imread) | [Opencv issue](http://code.opencv.org/issues/4148) – xiusir Jun 13 '17 at 07:49
  • Probably the same problem as with [OpenCV and Pillow](https://stackoverflow.com/a/51121924/5407270). You need to make sure that PIL/Pillow uses the same version of libjpeg as TensorFlow. – Andriy Makukha Jul 01 '18 at 09:18

1 Answer


A common cause of this problem is that TensorFlow takes shortcuts when decompressing JPEGs. This gives a sizable speedup for image reading, which can be the bottleneck when training certain CNNs, but it shifts the decoded pixel values slightly.

Luckily, the developers have exposed options to turn off some of these optimizations. In particular, look at the dct_method argument.

Try changing your call to tf.image.decode_jpeg to:

tf.image.decode_jpeg(value, channels=3, dct_method='INTEGER_ACCURATE')

You may also need to adjust fancy_upscaling, depending on the kind of images you're reading and on the version of libjpeg underlying your software.
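
To check whether these options actually close the gap, it helps to compare the whole image rather than a single pixel. The following is a minimal sketch, not a definitive recipe: the helper names pixel_diff_stats and decode_with_both are hypothetical, and decode_with_both assumes TensorFlow 1.x graph mode (as in the question) with Pillow installed.

```python
import numpy as np


def pixel_diff_stats(a, b):
    """Return (max, mean) absolute difference between two uint8 image arrays."""
    # Cast to a signed type first: subtracting uint8 arrays would wrap around.
    diff = np.abs(np.asarray(a, dtype=np.int16) - np.asarray(b, dtype=np.int16))
    return int(diff.max()), float(diff.mean())


def decode_with_both(image_file, fancy_upscaling=True):
    """Decode one JPEG with Pillow and with TensorFlow's accurate DCT (hypothetical helper)."""
    # Imported lazily so pixel_diff_stats works without either library.
    import tensorflow as tf
    from PIL import Image

    pil_pixels = np.array(Image.open(image_file).convert('RGB'))
    with open(image_file, 'rb') as f:
        jpeg_bytes = f.read()
    decoded = tf.image.decode_jpeg(
        jpeg_bytes,
        channels=3,
        dct_method='INTEGER_ACCURATE',
        fancy_upscaling=fancy_upscaling)
    # Assumption: TensorFlow 1.x graph mode, so the tensor is evaluated in a session.
    with tf.Session() as sess:
        tf_pixels = sess.run(decoded)
    return pil_pixels, tf_pixels


# The two pixels printed in the question differ by at most 3 per channel.
print(pixel_diff_stats([132, 145, 161], [134, 147, 164]))
```

Calling pixel_diff_stats(*decode_with_both(image_file)) for both values of fancy_upscaling shows which setting, if any, brings the two decoders into agreement on your files. A small nonzero difference may remain if the two libraries are built against different libjpeg versions.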

charleslparker