How to convert a RGB image (3 channel) to grayscale (1 channel) and save it?

Question

Working with a deep learning project and I have a lot of images, that don't need to have colors. I saved them doing:

import matplotlib.pyplot as plt

plt.imsave('image.png', image, format='png', cmap='gray')

However later when I checked the shape of the image the result is:

import cv2
img_rgb = cv2.imread('image.png')
print(img_rgb.shape)
(196,256,3)

So even though the image I view is in grayscale, I still have 3 color channels. I realized I had to do some algebric operations in order to convert those 3 channels into 1 single channel.

I have tried the methods described on the thread "How can I convert an RGB image into grayscale in Python?" but I'm confused.

For example, when to do the conversion using:

from skimage import color
from skimage import io
img_gray = color.rgb2gray(io.imread('image.png'))
plt.imsave('image_gray.png', img_gray, format='png')

However when I load the new image and check its shape:

img_gr = cv2.imread('image_gray.png')
print(img_gr.shape)
(196,256,3)

I tried the other methods on that thread but the results are the same. My goal is to have images with a (196,256,1) shape, given how much less computationally intensive it will be for a Convolutional Neural Network.

Any help would be appreciated.

I have exactly the same problem. Did you find any solution to it? I used matplotlib and a manual method but both result in 3 channel grayscale image :| — Shilan, Nov 09 '20 at 15:10

jmsinusa · Answer 1 · 2018-10-10T19:41:56.937

Your first code block:

import matplotlib.pyplot as plt
plt.imsave('image.png', image, format='png', cmap='gray')

This is saving the image as RGB, because cmap='gray' is ignored when supplying RGB data to imsave (see pyplot docs).

You can convert your data into grayscale by taking the average of the three bands, either using color.rgb2gray as you have, or I tend to use numpy:

import numpy as np
from matplotlib import pyplot as plt
import cv2

img_rgb = np.random.rand(196,256,3)
print('RGB image shape:', img_rgb.shape)

img_gray = np.mean(img_rgb, axis=2)
print('Grayscale image shape:', img_gray.shape)

Output:

RGB image shape: (196, 256, 3)
Grayscale image shape: (196, 256)

img_gray is now the correct shape, however if you save it using plt.imsave, it will still write three bands, with R == G == B for each pixel. This is because, I believe, a PNG file requires three (or four) bands. Warning: I am not sure about this: I expect to be corrected.

plt.imsave('image_gray.png', img_gray, format='png')
new_img = cv2.imread('image_gray.png')
print('Loaded image shape:', new_img.shape)

Output:

Loaded image shape: (196, 256, 3)

One way to avoid this is to save the images as numpy files, or indeed to save a batch of images as numpy files:

np.save('np_image.npy', img_gray)
new_np = np.load('np_image.npy')
print('new_np shape:', new_np.shape)

Output:

new_np shape: (196, 256)

The other thing you could do is save the grayscale png (using imsave) but then only read in the first band:

finalimg = cv2.imread('image_gray.png',0)
print('finalimg image shape:', finalimg.shape)

Output:

finalimg image shape: (196, 256)

PNG does not require three bands. RBGA is very common, and grayscale or grayscale with alpha is also possible. — Håken Lid, Oct 10 '18 at 19:31
@HåkenLid I'd appreciate an example of how to save just one band to a PNG. — jmsinusa, Oct 10 '18 at 19:40
The official PNG spec supports it. https://www.w3.org/TR/2003/REC-PNG-20031110/ "Indexed-color, grayscale, and truecolor images are supported, plus an optional alpha channel. Sample depths range from 1 to 16 bits.". But it's entirely possible that there are graphics libraries that don't fully support all parts of the PNG standard. — Håken Lid, Oct 10 '18 at 19:46
Thank you this is very useful! I think a good course of action to take would then be: 1 - to load the images as they are (3 color channel) 2 - load the data and convert it to 1 color channel 3 - save it as numpy file 4 - feed that into the neural network The only problem I can anticipate with this is that I think the expected input will be (196,256,1) and the one I will have is a (196,256). Are these equivalent? — J. Devez, Oct 12 '18 at 23:20
You can use numpy.expand_dims(array, 2) on your input to add the extra expected dimension. — jmsinusa, Oct 13 '18 at 01:11

score 3 · Answer 2 · answered Oct 16 '18 at 21:42

3

As it turns out, Keras, the deep-learning library I'm using has its own method of converting images to a single color channel (grayscale) in its image pre-processing step.

When using the ImageDataGenerator class the flow_from_directory method takes the color_mode argument. Setting color_mode = "grayscale" will automatically convert the PNG into a single color channel!

https://keras.io/preprocessing/image/#imagedatagenerator-methods

Hope this helps someone in the future.

answered Oct 16 '18 at 21:42

J. Devez

329
2
6
15

The problem is I am not reading from disk! I am taking image from camera and I want to convert it immediately to grayscale without saving it on disk! – Shilan Nov 09 '20 at 15:12

score 0 · Answer 3 · answered Sep 20 '19 at 07:04

if you want to just add extra channels that have the same value as the graysacale , maybe to use a specific model that requires 3 channel input_shape .

lets say your pictures are 28 X 28 and so you have a shape of (28 , 28 , 1) def add_extra_channels_to_pic(pic):

if pic.shape == (28 , 28 , 1):
    pic = pic.reshape(28,28)
    pic =  np.array([pic , pic , pic])
    # to make the channel axis in the end
    pic = np.moveaxis(pic , 0 , -1) 
    return pic

score 0 · Answer 4 · answered Jun 01 '20 at 09:17

0

Try this method

import imageio
new_data = imageio.imread("file_path", as_gray =True)
imageio.imsave("file_path", new_data)

The optional argument "as_gray = True" in line 2 of the code does the actual conversion.

answered Jun 01 '20 at 09:17

Akinyemi Olayinka Atanda

1

How to convert a RGB image (3 channel) to grayscale (1 channel) and save it?

4 Answers4