There's no bug here, but perhaps a misunderstanding of how the different modes in Pillow work.
As you found out, the image in question is a palettised image. By explicitly converting the image to RGBA mode, however, all the information from the original palette and transparency is "processed": when converting to a NumPy array, you will only see the colors taken from the palette, plus the extracted alpha channel.
If you open the image without any conversion, mode P (or maybe PA) is used automatically, and the extracted NumPy array has only one channel. Let's see the following example:
from matplotlib import pyplot as plt
import numpy as np
from PIL import Image
plt.figure(1, figsize=(10, 9))
# Read image with Pillow, explicit RGBA mode
image_pil = Image.open('grin-emoji-by-twitter.png').convert('RGBA')
plt.subplot(2, 2, 1), plt.imshow(image_pil), plt.title('Pillow image; explicit RGBA mode')
image_np = np.array(image_pil)
plt.subplot(2, 2, 2), plt.imshow(image_np), plt.title('NumPy array')
print("Image.open(...).convert('RGBA'):")
print(image_np.shape, image_np.dtype, '\n')
# Palette of Pillow image
print("Palette:")
print(image_pil.getpalette(), '\n')
# Image info of Pillow image
print("Information:")
print(image_pil.info, '\n')
# Read image with Pillow, no mode set, P mode is taken implicitly
image_pil = Image.open('grin-emoji-by-twitter.png')
plt.subplot(2, 2, 3), plt.imshow(image_pil), plt.title('Pillow image; implicit P mode')
image_np = np.array(image_pil)
plt.subplot(2, 2, 4), plt.imshow(image_np), plt.title('NumPy array')
print("Image.open(...):")
print(image_np.shape, image_np.dtype, '\n')
# Palette of Pillow image
print("Palette:")
print(image_pil.getpalette(), '\n')
# Image info of Pillow image
print("Information:")
print(image_pil.info)
plt.tight_layout()
plt.show()
That's the image output:

As you can see, the Pillow image is displayed the same way in both modes, but the extracted NumPy arrays differ (four-channel RGBA vs. single-channel palette indices).
Let's have a closer look at the print outputs:
Image.open(...).convert('RGBA'):
(512, 512, 4) uint8
Palette:
None
Information:
{}
Image.open(...):
(512, 512) uint8
Palette:
[71, 112, 76, 255, 202, 78, ... ]
Information:
{'transparency': b"\x00\x0e\x02\..."}
We again see the difference in the NumPy arrays. You can also see that the palette and transparency information is no longer stored as metadata for the image explicitly converted to RGBA, but is encoded into the pixel values themselves, whereas it is preserved when the image is loaded in mode P. Also, you can see that [71, 112, 76] (olive green) is the palette entry for all pixels with value 0, which is the background. (Why that color was chosen is another question.)
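To make the "encoded into the pixel values" part more concrete, here is a minimal sketch that rebuilds the RGBA array by hand from a mode P image. It does not use the emoji file: the 2x2 image, the third palette entry, and the alpha values are made up for illustration, and only the first two palette colors are borrowed from the output above.

```python
import numpy as np
from PIL import Image

# Synthetic 2x2 palettised image (made-up data, not the emoji file).
indices = np.array([[0, 1], [2, 1]], dtype=np.uint8)
image_p = Image.frombytes('P', (2, 2), indices.tobytes())

# Palette: first two entries as in the output above, third one invented;
# padded with zeros to the full 256 * 3 values.
palette_values = [71, 112, 76, 255, 202, 78, 10, 20, 30]
image_p.putpalette(palette_values + [0] * (768 - len(palette_values)))

# Per-index alpha values (invented): index 0 is fully transparent.
image_p.info['transparency'] = bytes([0, 255, 128]) + bytes([255] * 253)

# Single-channel array of palette indices, shape (2, 2).
index_array = np.array(image_p)

# Manual lookup: palette (256, 3) plus alpha (256,) gives an RGBA table.
palette = np.array(image_p.getpalette(), dtype=np.uint8).reshape(-1, 3)
alpha = np.frombuffer(image_p.info['transparency'], dtype=np.uint8)
rgba_table = np.column_stack([palette[:256], alpha])
manual_rgba = rgba_table[index_array]

# What Pillow produces when converting explicitly.
converted_rgba = np.array(image_p.convert('RGBA'))

print(index_array.shape, manual_rgba.shape, converted_rgba.shape)
print(np.array_equal(manual_rgba, converted_rgba))
```

Under these assumptions, the manual lookup and `convert('RGBA')` should give the same four-channel array, which is why the palette and transparency no longer need to be kept as metadata after the conversion.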
So, depending on what you want to achieve with the extracted NumPy array, load the image without conversion to keep mode P and work with palette indices, or convert to RGBA explicitly to get the actual color values.
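If you need both variants in the same script, a small helper along the following lines might be convenient. This is only a sketch: `image_to_array` and its `keep_palette_indices` parameter are hypothetical names, not part of Pillow, and the demo image is created in memory instead of being read from a file.

```python
import numpy as np
from PIL import Image


def image_to_array(image, keep_palette_indices=False):
    """Hypothetical helper: return palette indices or actual color values."""
    if image.mode == 'P' and keep_palette_indices:
        # Shape (H, W); the values are indices into the palette.
        return np.array(image)
    # Shape (H, W, 4); palette and transparency are resolved to RGBA.
    return np.array(image.convert('RGBA'))


# In-memory demo image in mode P (width 4, height 3), all pixels index 0.
demo = Image.new('P', (4, 3))
demo.putpalette([71, 112, 76] + [0] * 765)

index_array = image_to_array(demo, keep_palette_indices=True)
rgba_array = image_to_array(demo)
print(index_array.shape, rgba_array.shape)
```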
Hope that helps!
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.8.1
Matplotlib: 3.2.0rc1
NumPy: 1.18.1
Pillow: 7.0.0
----------------------------------------