I downloaded a random image from the internet, opened it with both PIL.Image.open() and cv2.imread(), and then checked some pixel values. The problem is that I get different values for the same pixels from PIL and OpenCV!
This is the image I tried: [image]
This is what I did:

>>> import cv2
>>> from PIL import Image
>>> img = cv2.imread('img.jpg')
>>> im = Image.open('img.jpg')
>>> img[0][0]
array([142, 152, 146], dtype=uint8)
>>> im.getpixel((0, 0))
(138, 158, 131)

The R, G, B values of im and img don't match ((138 != 146), (158 != 152), (131 != 142)), although it is the same pixel of the same image!
I searched SO and found this post discussing the same issue, so I used the code posted there to check the difference again:

from PIL import Image
import cv2
import sys
from hashlib import md5
import numpy as np

def hashIm(im):
    imP = np.array(Image.open(im))

    # Convert to BGR and drop alpha channel if it exists
    imP = imP[..., 2::-1]
    # Make the array contiguous again
    imP = np.array(imP)
    im = cv2.imread(im)

    diff = im.astype(int)-imP.astype(int)

    cv2.imshow('cv2', im)
    cv2.imshow('PIL', imP)
    cv2.imshow('diff', np.abs(diff).astype(np.uint8))
    cv2.imshow('diff_overflow', diff.astype(np.uint8))

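    with open('dist.csv', 'w') as outfile:
        # Reuse the signed difference computed above; subtracting the uint8
        # arrays directly would wrap around and never yield negative values
        for i in range(-255, 256):
            outfile.write('{},{}\n'.format(i, np.count_nonzero(diff == i)))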

    cv2.waitKey(0)
    cv2.destroyAllWindows()

    return md5(im).hexdigest() + '   ' + md5(imP).hexdigest()

if __name__ == '__main__':
    print(hashIm('img.jpg'))

The hashes I got are different, and the difference image is not black!

Additional info:
- OS: Ubuntu 18.04
- Python: 3.6
- OpenCV: opencv-python==4.0.0.21
- PIL: Pillow==5.4.1

Is there any explanation for this?

singrium
  • What about the provided answer regarding the JPG decoding? – HansHirse May 22 '19 at 10:43
  • As far as I know, `OpenCV` uses BGR instead of RGB. – furas May 22 '19 at 10:46
  • @furas, yes, I am aware of this. I compared the R, G, B channels of both outputs, so I am not confusing BGR from OpenCV with RGB from Pillow. – singrium May 22 '19 at 10:47
  • I tested your first version (without converting BGR to RGB) with JPG and PNG, and both give me correct values. Linux Mint 19.1 (based on Ubuntu 18.04), Python 3.7, OpenCV 4.1.0, PIL 5.4.1 – furas May 22 '19 at 10:50
  • @HansHirse, this is what I got for OpenCV: JPEG: /opt/libjpeg-turbo/lib64/libjpeg.a (ver 62). As for PIL, I don't know how to check which libjpeg version it uses. – singrium May 22 '19 at 10:53
  • @furas, do you think it is due to the package versions? That could be it. – singrium May 22 '19 at 10:55
  • I tested with your image and it gives different values. It seems to be a problem with some kinds of images. The difference is 1-2 points in every channel. – furas May 22 '19 at 10:57
  • @furas, yes, this is what I get too. – singrium May 22 '19 at 11:04
  • As far as I know, the JPG algorithm uses a Fourier transform or something similar, and maybe it uses float values that are rounded differently in the two modules. I wouldn't use JPG; I would use PNG or TIFF, which are lossless formats. – furas May 22 '19 at 11:18
  • @furas, thank you. I'll work with PNG instead. – singrium May 22 '19 at 11:20

2 Answers

This is probably not a NumPy or rounding problem but a variation in JPEG decoding: https://github.com/python-pillow/Pillow/issues/3833

In particular:

JPEG decoders are allowed to produce slightly different results, "a maximum of one bit of difference for each pixel component" according to Wikipedia.

(https://github.com/python-pillow/Pillow/issues/3833#issuecomment-585211263)

s.durand

OpenCV stores an image as a NumPy ndarray.

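import cv2
from PIL import Image

img_path = 'img.jpg'  # path to the image file (the question uses img.jpg)
cv_img = cv2.imread(img_path)   # NumPy ndarray, channels in B, G, R order
pil_img = Image.open(img_path)  # PIL Image, channels in R, G, B order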

When you do cv_img[x][y], you are accessing the xth row and yth column, but pil_img.getpixel((x, y)) accesses the pixel in the xth column and yth row.

Another factor is that Pillow returns pixels in (R, G, B) order, whereas OpenCV returns them in (B, G, R) order.

For me, cv_img[20][10] gives array([127, 117, 129], dtype=uint8). Here B = 127, G = 117, R = 129.

But pil_img.getpixel((10, 20)) gives (129, 117, 127). Here R = 129, G = 117, B = 127.
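
Both conventions can be checked without any image file. The sketch below builds a small array by hand (the size and pixel position are made up for illustration, and the BGR array is produced by reversing the channels rather than by calling cv2.imread) and looks up the same pixel both ways:

```python
import numpy as np
from PIL import Image

# Hand-made 30x40 RGB image with one marked pixel at row 20, column 10
rgb = np.zeros((30, 40, 3), dtype=np.uint8)
rgb[20, 10] = (129, 117, 127)  # R, G, B

pil_img = Image.fromarray(rgb)
bgr = rgb[..., ::-1]  # same channel order cv2.imread would return (B, G, R)

row, col = 20, 10
pil_pixel = pil_img.getpixel((col, row))         # Pillow: (x, y) = (column, row)
cv_pixel = tuple(int(v) for v in bgr[row, col])  # NumPy: [row, column]

print(pil_pixel)       # (129, 117, 127)
print(cv_pixel[::-1])  # (129, 117, 127) after reversing BGR to RGB
```

For the pixel at (0, 0) the row/column swap makes no difference, so it cannot account for a mismatch at that position.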

Arkistarvh Kltzuonstev
  • `[0][0]` and `(0, 0)` should get the same pixel, since in both cases `x=0, y=0`. I have no problem getting correct values from my own JPG or PNG images (after converting BGR to RGB), but the image from the question gives different values. The difference is 1-2 points in every channel. Maybe it uses float numbers and they are rounded differently. – furas May 22 '19 at 11:03
  • @furas is right, [0][0] and (0, 0) should give the same result. I also tested your approach, but I still get different values. The difference is +/- 4 or +/- 6 at most for every channel. – singrium May 22 '19 at 11:06
  • I can reproduce the values from the given code. Additionally, `cv_img[0][0]` gives `[133, 145, 145]`, and `pil_img.getpixel((0, 0))` gives `(145, 145, 133)`, which is in accordance with the given answer. (Python 3.7, opencv-python 4.1.0.25, Pillow 6.0.0) – HansHirse May 22 '19 at 11:11
  • @HansHirse, so the problem is caused by the library versions?! – singrium May 22 '19 at 11:15
  • @singrium Maybe, I don't know. I just wanted to support the given answer. Opening the provided image in ImageJ 1.52a and GIMP 2.10.6 also gives the mentioned values for `x = 0`, `y = 0`. – HansHirse May 22 '19 at 11:24
  • I used OpenCV 3.4.3.18 and PIL 5.3.0 and still get different values. So if it is related to the libraries, it should be related to the libjpeg versions. – singrium May 22 '19 at 11:32
  • I converted pil with `np.array(pil)`. `pil.max()` gives me 210, but `cv.max()` gives me 209. Confused... `cv.shape` → (2448, 3264, 3); `pil.shape` → (2448, 3264, 3); `cv = cv2.cvtColor(cv, cv2.COLOR_BGR2RGB)`; `cv[1000][1000]` → array([132, 32, 16], dtype=uint8); `pil[1000][1000]` → array([132, 32, 16], dtype=uint8); `(cv==pil).all()` → False; `cv.min(), cv.max()` → (0, 209); `pil.min(), pil.max()` → (0, 210) – jl303 Jul 01 '19 at 12:28