What is the difference between opening a mask image with PIL and cv2?

Question

Say I am opening an image file that is the mask for a specific 3-channel RGB image.

Above is the mask image (msk.png) I'm trying to use in my segmentation model.

When I open this file with the PIL library in Python, using the following line of code:

img1 = Image.open('msk.png')

and then convert img1 to a NumPy array and print it, I get this array:

[[2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 0
  0 0 0 0 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
  3 3 3 3 3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 3 3 3 3 3
  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
  3 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
  4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4 4
... (truncated)
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]]

which seems good for my multiclass segmentation task (it is a grayscale image, and there are 24 classes to classify).

I expected the output array to be the same as the above array when the mask is read by cv2.imread.

However, when I read it into img2 like this:

img2 = cv2.imread('msk.png')

I get a 3-channel output, like the one below:

[[[  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
  [  0 252 124]
...(truncated)
  [182  38 155]
  [182  38 155]
  [182  38 155]
  [182  38 155]]]

Why are the two outputs different from each other? Furthermore, how can I read the image so that it looks exactly like img1, with an individual label for each pixel of the grayscale image?

Mark Setchell · Accepted Answer · 2023-03-22T14:50:43.273

Your image is (sensibly and understandably) a palette image because it has a limited number of colours (classes) and as such a palette/indexed image is an efficient way of storing it.

You can see that with exiftool:

exiftool U3nGE.png

ExifTool Version Number         : 12.50
File Name                       : U3nGE.png
Directory                       : .
File Size                       : 12 kB
File Modification Date/Time     : 2023:03:22 13:47:33+00:00
File Access Date/Time           : 2023:03:22 13:47:34+00:00
...
...
Image Width                     : 960
Image Height                    : 736
Bit Depth                       : 4
Color Type                      : Palette             <--- HERE
Compression                     : Deflate/Inflate
...
...
Image Size                      : 960x736
Megapixels                      : 0.707

OpenCV is for computer vision and no cameras produce palette images, so it doesn't give you access to the indices or the palette. It just looks each index up in the palette and returns the resulting colour (in BGR order). I think you are more or less obliged to use PIL/Pillow or another library.
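As a minimal sketch of that difference, the snippet below builds a tiny stand-in palette image in memory and reads it back with Pillow. The 2x3 `indices` array and the `demo` image are made up for illustration; the colours are the first five palette entries listed further down. `np.array(im)` returns the stored class indices, whereas `im.convert("RGB")` performs the palette lookup, which is roughly what OpenCV hands you, except in RGB rather than BGR order.

```python
import io

import numpy as np
from PIL import Image

# Made-up 2x3 mask of class indices, standing in for the real mask file.
indices = np.array([[2, 2, 4],
                    [0, 3, 4]], dtype=np.uint8)

# First five palette entries (RGB triples, flattened).
colours = [155, 38, 182, 14, 135, 204, 124, 252, 0, 255, 20, 147, 169, 169, 169]

# Build a palette ("P" mode) image; Image.new takes (width, height).
demo = Image.new("P", (indices.shape[1], indices.shape[0]))
demo.putdata(indices.flatten().tolist())
demo.putpalette(colours + [0] * (768 - len(colours)))  # pad to 256 entries

# Round-trip through an in-memory PNG, as if reading the file from disk.
buf = io.BytesIO()
demo.save(buf, format="PNG")
buf.seek(0)

im = Image.open(buf)
idx = np.array(im)                   # 2-D array of palette indices (class labels)
rgb = np.array(im.convert("RGB"))    # 3-channel colours after the palette lookup

print(im.mode, idx.shape, rgb.shape)
print(idx)
```

For the real mask, the same two calls apply to `Image.open('msk.png')`; the index array is the one suitable as segmentation labels.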

Note that you can find the 4 unique colours in your image like this:

import numpy as np
import cv2

im = cv2.imread('U3nGE.png')
colours, counts = np.unique(im.reshape(-1,3), axis=0, return_counts=True)

print(colours, counts)

which yields these 4 BGR colours:

array([[  0, 252, 124],
       [147,  20, 255],
       [169, 169, 169],
       [182,  38, 155]], dtype=uint8)

and their corresponding counts (or frequency of occurrence):

array([362286,  33747, 236590,  73937])

but you have lost the correlation between the colours and the class indices. If you know the number of pixels of each class I guess you could re-associate the colours with the classes.
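A rough sketch of that re-association, using the colours and counts printed above. The `class_counts` dictionary is an assumed input (the per-class pixel counts would have to come from somewhere else), and `match_by_count` is a hypothetical helper name. It only works when every class has a distinct pixel count.

```python
import numpy as np

# BGR colours and pixel counts as reported by np.unique above.
colours = np.array([[0, 252, 124],
                    [147, 20, 255],
                    [169, 169, 169],
                    [182, 38, 155]], dtype=np.uint8)
colour_counts = np.array([362286, 33747, 236590, 73937])

# Assumed input: pixel count per class index, known from another source.
class_counts = {0: 73937, 2: 362286, 3: 33747, 4: 236590}


def match_by_count(colours, colour_counts, class_counts):
    """Return {class_index: BGR tuple}, pairing classes and colours by pixel count."""
    if len(set(colour_counts.tolist())) != len(colour_counts):
        raise ValueError("two colours share a pixel count; cannot match by count")
    by_count = {int(n): tuple(int(v) for v in c)
                for c, n in zip(colours, colour_counts)}
    # A KeyError here means a class count has no colour with that many pixels.
    return {cls: by_count[n] for cls, n in class_counts.items()}


mapping = match_by_count(colours, colour_counts, class_counts)
print(mapping)
```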

If you use PIL/Pillow you can get the palette like this:

from PIL import Image
import numpy as np

im = Image.open('U3nGE.png')
palette = np.array(im.getpalette(),dtype=np.uint8).reshape((-1,3))

print(palette)

which yields numbers corresponding to what we saw with OpenCV above - except they are padded with greys and in RGB order unlike OpenCV's BGR order:

array([[155,  38, 182],       # palette entry 0
       [ 14, 135, 204],       # palette entry 1
       [124, 252,   0],       # palette entry 2
       [255,  20, 147],       # palette entry 3
       [169, 169, 169],       # palette entry 4
       [  5,   5,   5],       # grey padding
       [  6,   6,   6],       # grey padding
       [  7,   7,   7],
       [  8,   8,   8],
       [  9,   9,   9],
       [ 10,  10,  10],
       [ 11,  11,  11],
       [ 12,  12,  12],
       [ 13,  13,  13],
       [ 14,  14,  14],
       [ 15,  15,  15]], dtype=uint8)
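
If the mask has already been loaded through OpenCV, the palette above is enough to map colours back to indices. Here is a minimal sketch, assuming `bgr` is the array returned by `cv2.imread` and `palette_rgb` holds the palette rows in index order. The function name `bgr_to_indices` and the 2x2 test array are made up for illustration, and only NumPy is needed.

```python
import numpy as np


def bgr_to_indices(bgr, palette_rgb):
    """Map each BGR pixel to the index of the first matching palette entry."""
    palette_bgr = palette_rgb[:, ::-1]  # the palette is RGB, OpenCV arrays are BGR
    # Compare every pixel with every palette colour: result has shape (H, W, N).
    matches = (bgr[:, :, None, :] == palette_bgr[None, None, :, :]).all(axis=-1)
    if not matches.any(axis=-1).all():
        raise ValueError("some pixels do not match any palette colour")
    # argmax returns the first True along the palette axis.
    return matches.argmax(axis=-1).astype(np.uint8)


# The five meaningful palette entries shown above, in index order.
palette_rgb = np.array([[155, 38, 182],
                        [14, 135, 204],
                        [124, 252, 0],
                        [255, 20, 147],
                        [169, 169, 169]], dtype=np.uint8)

# Made-up 2x2 BGR array using the four colours found in the image.
bgr = np.array([[[0, 252, 124], [182, 38, 155]],
                [[169, 169, 169], [147, 20, 255]]], dtype=np.uint8)

labels = bgr_to_indices(bgr, palette_rgb)
print(labels.tolist())  # [[2, 0], [4, 3]]
```

When Pillow is available, converting the opened image straight to a NumPy array remains the simpler route, since it returns the indices without any lookup.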
