
I want to extract the move boxes from a chess scoresheet.

[image: chess scoresheet]

I followed slide 15 of the paper and wrote the following code to implement the steps described there.

import cv2
import numpy as np

def preprocess(img):
    img_tmp = img

    img_tmp = cv2.cvtColor(img_tmp, cv2.COLOR_BGR2GRAY)

    otsu_threshold, img_tmp = cv2.threshold(img_tmp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    
    kernel = np.ones((5,5), np.uint8)
    img_tmp = cv2.erode(img_tmp, kernel, iterations = 5)
    img_tmp = cv2.dilate(img_tmp, kernel, iterations = 5)

    contours, hierarchy = cv2.findContours(image = img_tmp, mode = cv2.RETR_TREE, method = cv2.CHAIN_APPROX_NONE)
    
    return (img_tmp, contours)

I don't get the same results as described in the paper: the image processing does not produce an image containing only grid lines. Instead, I get a preprocessed scoresheet that still contains irrelevant information.

[image: preprocessed scoresheet]

Garbage in, garbage out: with this flawed input, I also get an image with the wrong contours.

[image: wrong contours]

I think the error lies in the binarization step: the foreground should be white and the background black, but I don't know which parameters to change in cv2.threshold to achieve that.

Edit: I used the images from the HCS dataset and get an improved image. The approach from @Bilal improves the image further. Still, the results are not as good as described in the paper. To improve them further, removing the handwritten text before the contour detection seems like a good idea. The paper says:

"Then we use two long, thin kernels (one horizontal and one vertical) with sizes relative to input image dimensions, and morphological operations (erosion followed by dilation) with those kernels to generate an image containing only grid lines."

I did use erosion followed by dilation, but the image still contains the moves, so there must be a mistake in my code.

The article could be an interesting starting point. But is there a simpler approach to the problem?

  • 1
    Before doing the contours, do `img_tmp = 255-img_tmp` – fmw42 Mar 05 '22 at 20:14
  • 1
In cv2.threshold use the flag cv2.THRESH_BINARY_INV instead of cv2.THRESH_BINARY. Feel free to read the docs. – Micka Mar 05 '22 at 20:23
  • 2
    *they* use a proper scan, not a snapshot of a half-folded sheet of paper. I _am_ surprised their otsu doesn't snag any of the pencil marks. -- you need locally adaptive... something, thresholding or at least removal of the uneven lighting. – Christoph Rackwitz Mar 05 '22 at 21:22
  • I suggest you look at [this answer](https://stackoverflow.com/a/54029190/7328782), it shows an algorithm, the path opening, that you could use to detect these lines that are not perfectly horizontal or vertical. – Cris Luengo Mar 13 '22 at 21:17

1 Answer


The challenges in your input image are:

  • Contrast Variance.
  • Shape Deformation.

Adaptive thresholding, as proposed by @ChristophRackwitz, handles the contrast variance.

Due to the shape deformation, some existing solutions may not apply to your case, so the "long horizontal/vertical kernels" and "quadrilaterals selected by size" mentioned in the paper are not a straightforward approach for this image.

Also, an "image containing only gridlines" is hard to obtain in your case because some handwritten text overlaps with the grid. Moreover, the outer corners of the table are not all contained in the image, so you cannot detect them and apply a Hough line transform, or manually draw lines between them, to recover the grid.

Contour Detection with Filtering might give you something like this:

[image: detected contours]

#!/usr/bin/python3
# -*- coding: utf-8 -*-
import cv2
import numpy as np

# Reference: https://stackoverflow.com/a/47057324
def lcc(image):
    """Keep only the largest connected component (label 0 is the background)."""
    image = image.astype('uint8')
    nb_components, output, stats, centroids = cv2.connectedComponentsWithStats(image, connectivity=8)
    sizes = stats[:, -1]

    # Find the label of the largest non-background component.
    max_label = 1
    max_size = sizes[1]
    for i in range(2, nb_components):
        if sizes[i] > max_size:
            max_label = i
            max_size = sizes[i]

    # Build a mask containing only that component.
    img2 = np.zeros(output.shape)
    img2[output == max_label] = 255
    return img2.astype('uint8')

def preprocess(img):
    img_tmp = img.copy()

    gray = cv2.cvtColor(img_tmp, cv2.COLOR_BGR2GRAY)
    # Adaptive thresholding copes with the uneven lighting across the sheet.
    th = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 151, 19)

    # Median blur removes the salt-and-pepper noise left by thresholding.
    fl = cv2.medianBlur(th, 7)

    kernel = np.ones((5, 5), np.uint8)
    er = cv2.erode(fl, kernel, iterations=5)
    dl = cv2.dilate(er, kernel, iterations=5)
    dl = 255 - dl  # invert so the grid becomes the white foreground
    lc = lcc(dl)   # keep only the largest component: the grid itself
    contours, hierarchy = cv2.findContours(image=lc, mode=cv2.RETR_TREE,
                                           method=cv2.CHAIN_APPROX_NONE)
    return (lc, contours)

img = cv2.imread('input.jpg')

res, c = preprocess(img)

cv2.drawContours(img, c, -1, (255, 0, 0), 3)

cv2.namedWindow("res", cv2.WINDOW_NORMAL)
cv2.imshow("res", img)
cv2.waitKey(0)
Bilal