I am working on a script with OpenCV (Python) that splits an image into sections so that I can run OCR on each section later. The script already splits the source image into all the boxes I need, but it also produces a number of plain white images.

Is there a way to check with OpenCV whether an image consists only of white pixels? I'm very new to this library, so any information on this would be helpful.

Thanks!

nathancy
jasonmerino

1 Answer

Method #1: np.mean

Calculate the mean of the image. For an 8-bit image, a mean of exactly 255 means that every pixel is white.

if np.mean(image) == 255:
    print('All white')
else:
    print('Not all white')
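
As a self-contained illustration of this check, here is a minimal sketch that runs it on synthetic arrays instead of a file loaded from disk. It assumes 8-bit (`uint8`) images where white is 255. The helper name `is_all_white` and the test arrays are made up for the example, and the element-wise `np.all` comparison is an equivalent alternative to the mean, not part of the method above.

```python
import numpy as np

def is_all_white(image, white=255):
    """Hypothetical helper: True if every element of the array equals `white`."""
    return bool(np.all(image == white))

# Synthetic 8-bit test images (assumption: white is 255 in every channel).
white = np.full((4, 4, 3), 255, dtype=np.uint8)
almost = white.copy()
almost[0, 0] = (254, 255, 255)  # one channel of one pixel is slightly off-white

# The mean-based check and the element-wise check agree on both images.
print(np.mean(white) == 255, is_all_white(white))
print(np.mean(almost) == 255, is_all_white(almost))
```

Both checks treat a single off-white value as "not all white", so slightly noisy scans may need a tolerance rather than an exact comparison.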

Method #2: cv2.countNonZero

You can use cv2.countNonZero to count the non-zero (white) array elements. The idea is to obtain a binary image and then check whether the number of white pixels equals the area of the image. If it does, the image consists entirely of white pixels. Here's a minimal example:


Input image #1 (invisible since background is white):

All white

Input image #2

Not all white

import cv2
import numpy as np

def all_white_pixels(image):
    '''Returns True if all white pixels or False if not all white'''
    H, W = image.shape[:2]
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

    pixels = cv2.countNonZero(thresh)
    return pixels == H * W

if __name__ == '__main__':
    image = cv2.imread('1.png')
    if all_white_pixels(image):
        print('All white')
    else:
        print('Not all white')
    cv2.imshow('image', image)
    cv2.waitKey()
nathancy
  • You get my vote just for posting an invisible image :-) – Mark Setchell Jan 15 '20 at 21:48
  • Unfortunately, your implementation regarding `cv2.threshold` is wrong. Set up an all gray image (any value > 0), and you'll always get `True`, which is due to the use of Otsu here. The correct implementation would be just `cv2.threshold(gray, 254, 255, cv2.THRESH_BINARY)[1]`, which is also way faster (even faster than `np.mean` or `np.all`)! Furthermore, Otsu limits `cv2.threshold` to single-channel images, as does `cv2.countNonZero`. So, getting rid of Otsu and switching to `np.count_nonzero` will allow this approach to also work on multi-channel images. – HansHirse Jan 16 '20 at 08:15
  • One more addition: Unfortunately, `np.count_nonzero` is significantly slower than `cv2.countNonZero`. So, for multi-channel image support maybe a list comprehension of `cv2.countNonZero` calls is faster. – HansHirse Jan 16 '20 at 08:33
  • @HansHirse interesting I didn't know that. Thanks for pointing that out – nathancy Jan 16 '20 at 21:03