
How do I know whether I have read a PNG image in YCbCr mode correctly? The two approaches below give me different pixel values, which is confusing.

import numpy as np
import scipy.misc

def convert_rgb_to_ycbcr(img):
    # img: H x W x 3 RGB array with values in 0-255
    y = 16. + (64.738 * img[:, :, 0] + 129.057 * img[:, :, 1] + 25.064 * img[:, :, 2]) / 255.
    cb = 128. + (-37.945 * img[:, :, 0] - 74.494 * img[:, :, 1] + 112.439 * img[:, :, 2]) / 255.
    cr = 128. + (112.439 * img[:, :, 0] - 94.154 * img[:, :, 1] - 18.285 * img[:, :, 2]) / 255.
    # Stack the three channels and move the channel axis last
    return np.array([y, cb, cr]).transpose([1, 2, 0])


# method 1 - read as YCbCr directly
img = scipy.misc.imread(path, mode='YCbCr').astype(np.float64)
print(img[0, :5, 0])
# returns [32. 45. 68. 78. 92.]

# method 2 - read as RGB and convert RGB to YCbCr
img = scipy.misc.imread(path, mode='RGB').astype(np.float64)
img = convert_rgb_to_ycbcr(img)
print(img[0, :5, 0])
# returns [44.0082902  55.04281961 75.1105098  83.57022745 95.44837255]

I would prefer method 1, since scipy already takes care of the conversion for me, but I wasn't able to find its source code. So I wrote the conversion function myself, and it produces different pixel values.

Steven Chen

1 Answer


scipy.misc.imread is deprecated in recent SciPy versions (and removed as of SciPy 1.2). Internally it relied on Image.convert from PIL to convert between modes.

Details:

https://pillow.readthedocs.io/en/3.1.x/reference/Image.html?highlight=convert#PIL.Image.Image.convert

https://pillow.readthedocs.io/en/3.1.x/handbook/concepts.html#concept-modes

https://github.com/scipy/scipy/blob/v0.18.0/scipy/misc/pilutil.py#L103-L155
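
Since imread is deprecated, the same mode conversion can be done with Pillow directly. This is a minimal sketch, assuming Pillow and NumPy are installed; read_ycbcr is a hypothetical helper name, and the small in-memory image only stands in for a real file:

```python
import numpy as np
from PIL import Image


def read_ycbcr(path):
    """Hypothetical helper: open a file with Pillow and return float YCbCr pixels."""
    with Image.open(path) as im:
        return np.asarray(im.convert("YCbCr"), dtype=np.float64)


# Tiny in-memory RGB image so the sketch runs without a file on disk.
rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [128, 128, 128]]], dtype=np.uint8)
ycbcr = np.asarray(Image.fromarray(rgb).convert("YCbCr"))
print(ycbcr[0, :, 0])  # luma (Y) of the four pixels
```

With a real file, read_ycbcr(path)[0, :5, 0] would correspond to the values printed by method 1 in the question.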

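I rewrote your convert_rgb_to_ycbcr(img) function with full-range (JPEG-style) coefficients, and it now gives the same result as reading in YCbCr mode. Your original version adds an offset of 16 to Y and scales to a limited range, which is a different convention from the JPEG one that Pillow's YCbCr mode follows.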

Implementation used: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-rdprfx/b550d1b5-f7d9-4a0c-9141-b3dca9d7f525?redirectedfrom=MSDN

Conversion formula from RGB to YCbCr
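
Written out from the matrix in the function below (with R, G, B in 0-255), the conversion reads roughly as follows. This is reconstructed from the code's coefficients, not copied from the linked page:

```latex
\begin{aligned}
Y   &= 0.299\,R + 0.587\,G + 0.114\,B \\
C_b &= -0.1687\,R - 0.3313\,G + 0.5\,B + 128 \\
C_r &= 0.5\,R - 0.4187\,G - 0.0813\,B + 128
\end{aligned}
```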


import scipy.misc # scipy 1.1.0
import numpy as np

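def convert_rgb_to_ycbcr(im):
    # Rows hold the Y, Cb, Cr weights for (R, G, B)
    xform = np.array([[.299, .587, .114], [-.1687, -.3313, .5], [.5, -.4187, -.0813]])
    ycbcr = im.dot(xform.T)
    # Center the two chroma channels at 128
    ycbcr[:,:,[1,2]] += 128
    # Casting to uint8 drops the fractional part
    return np.uint8(ycbcr)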


# method 1 - read as YCbCr directly
img = scipy.misc.imread('test.jpg', mode='YCbCr').astype(np.float64)
print(img[0, :5, 0])
# prints [165. 165. 165. 166. 167.]

# method 2 - read as RGB and convert RGB to YCbCr
img = scipy.misc.imread('test.jpg', mode='RGB').astype(np.float64)
img = convert_rgb_to_ycbcr(img)
print(img[0, :5, 0])
# prints [165 165 165 166 167]
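
To check the agreement on more than five pixels, the manual conversion can be compared against Pillow over a whole image. The sketch below is an illustration under some assumptions: it uses a random in-memory image instead of test.jpg, the name rgb_to_ycbcr_full_range is made up for this example, and it rounds and clips before casting (a small departure from the function above). Small differences can remain, since the two paths need not round identically.

```python
import numpy as np
from PIL import Image


def rgb_to_ycbcr_full_range(rgb):
    """Full-range (JPEG-style) RGB -> YCbCr using the matrix from the answer."""
    xform = np.array([[0.299, 0.587, 0.114],
                      [-0.1687, -0.3313, 0.5],
                      [0.5, -0.4187, -0.0813]])
    ycbcr = rgb.astype(np.float64).dot(xform.T)
    ycbcr[:, :, 1:] += 128.0
    # Round and clip before casting so values cannot wrap around in uint8.
    return np.clip(np.rint(ycbcr), 0, 255).astype(np.uint8)


# Random test image with a fixed seed, so the check is reproducible.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

# Use a signed type so the subtraction below cannot underflow.
manual = rgb_to_ycbcr_full_range(rgb).astype(np.int16)
pillow = np.asarray(Image.fromarray(rgb).convert("YCbCr")).astype(np.int16)

max_diff = int(np.abs(manual - pillow).max())
print("largest per-channel difference:", max_diff)
```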

Zabir Al Nazi
  • Thank you for the function. Is there a reference for this way of calculating it? Also, two additional questions: since scipy.misc.imread is now deprecated, what's the best way to read images? Lastly, imread has a flatten parameter that is supposed to flatten 3 channels to 1; how is that done? – Steven Chen May 03 '20 at 05:10
  • I'm not sure about that; how do you define best? You can use OpenCV's `imread` or Pillow's `open`. Or do you mean with `scipy`? For flatten, this may be useful: https://stackoverflow.com/questions/32314657/scipy-misc-imread-flatten-argument-converting-to-grey-scale – Zabir Al Nazi May 03 '20 at 05:19
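
On the flatten question, a rough sketch of what the grayscale step looks like with Pillow alone. It assumes that flatten=True corresponds to converting to Pillow's floating-point mode "F", which is worth confirming against the pilutil.py source linked in the answer; the weights are the ITU-R 601-2 luma weights that Pillow documents for its grayscale conversion.

```python
import numpy as np
from PIL import Image

rgb = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255], [200, 100, 50]]], dtype=np.uint8)

# Pillow's 32-bit float grayscale mode (assumed equivalent of flatten=True).
pil_gray = np.asarray(Image.fromarray(rgb).convert("F"))

# Same weighting done by hand: L = 0.299 R + 0.587 G + 0.114 B.
manual_gray = rgb.astype(np.float64).dot([0.299, 0.587, 0.114])

print(pil_gray[0])
print(manual_gray[0])
```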