3

I'm working on a project that lets users take photos of handwritten formulas and send them to my server. I want to keep only the symbols related to mathematics and remove the sheet's grid.

Sample photo:

(1) Original RGB photo

(2) Blurred grayscale photo

(3) After applying adaptive threshold

NOTE: I expect my algorithm to handle a sheet grid of any color.

Any code snippets will be greatly appreciated. Thanks in advance.

Xai Nano

3 Answers

5

Result

This is a challenging problem to generalize without knowing exactly what combination of paper, lines, and ink to expect, and what exactly the output will be used for. I thought I'd attempt it and maybe learn something.

I see two ways to approach this problem:

  1. The clever way: identify the grid, its color, orientation, and size to find the regions of the image it occupies, so that they can be ignored. There are major caveats here that would need to be addressed, e.g. the page may not be photographed flat and square (warp, distortion, and rotation have to be accounted for). There will also be lines that we don't want removed.

  2. The simple way: Apply general image manipulations, knowing little about the problem other than the assumptions that the pen is always darker than the grid, and the output is to be binary (black pen / white page).

I like the second one better because it is easier to implement and generalizes better.

We first notice that the "white" of the page is actually a non-uniform shade of grey (if we convert to grayscale). OpenCV's adaptive thresholding deals with this nicely. It almost gets us there.

The code below steps through the image in 50-pixel increments to address the non-uniformity of lighting; each window reaches up to 50 pixels on either side of the step position, so neighboring windows overlap. In each window, we subtract the median before applying a threshold. It is a simple solution, but maybe what you need. I haven't tested it on many images, and the threshold and the pre- and post-processing may need tweaking. It will not work if input images vary significantly, or if the grid is too dark relative to the ink.

import cv2
import numpy
import sys

BLOCK_SIZE = 50
THRESHOLD = 25


def preprocess(image):
    image = cv2.medianBlur(image, 3)
    image = cv2.GaussianBlur(image, (3, 3), 0)
    return 255 - image


def postprocess(image):
    image = cv2.medianBlur(image, 5)
    # image = cv2.medianBlur(image, 5)
    # kernel = numpy.ones((3,3), numpy.uint8)
    # image = cv2.morphologyEx(image, cv2.MORPH_OPEN, kernel)
    return image


def get_block_index(image_shape, yx, block_size):
    # Index arrays for a window reaching block_size pixels either side of yx,
    # clipped to the image bounds.
    y = numpy.arange(max(0, yx[0]-block_size), min(image_shape[0], yx[0]+block_size))
    x = numpy.arange(max(0, yx[1]-block_size), min(image_shape[1], yx[1]+block_size))
    return tuple(numpy.meshgrid(y, x))


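def adaptive_median_threshold(img_in):
    # The input is inverted (ink is bright). Pixels less than THRESHOLD above
    # the block median are set to white (255); the rest stay black (0).
    med = numpy.median(img_in)
    img_out = numpy.zeros_like(img_in)
    img_out[img_in - med < THRESHOLD] = 255
    return img_out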


def block_image_process(image, block_size):
    out_image = numpy.zeros_like(image)
    for row in range(0, image.shape[0], block_size):
        for col in range(0, image.shape[1], block_size):
            idx = (row, col)
            block_idx = get_block_index(image.shape, idx, block_size)
            out_image[block_idx] = adaptive_median_threshold(image[block_idx])

    return out_image


def process_image_file(filename):
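    image = cv2.imread(filename)
    if image is None:  # cv2.imread returns None when the file cannot be read
        raise IOError('Could not read image: ' + filename)
    image_in = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)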

    image_in = preprocess(image_in)
    image_out = block_image_process(image_in, BLOCK_SIZE)
    image_out = postprocess(image_out)

    cv2.imwrite('bin_' + filename, image_out)


if __name__ == "__main__":
    process_image_file(sys.argv[1])
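
To illustrate the rule used in `adaptive_median_threshold` above, here is a toy example on a single hand-made block. The intensity values (paper 40, grid 55, ink 200, after inversion) are made up for illustration and are not measured from the sample photo.

```python
import numpy as np

THRESHOLD = 25  # same value as in the code above

# Toy inverted-grayscale block (i.e. after `255 - image`), so darker marks
# on the page have HIGHER values here.
block = np.full((6, 6), 40, dtype=np.uint8)  # paper
block[2, :] = 55                             # a faint grid line
block[:, 4] = 200                            # a dark pen stroke

# Most pixels are paper, so the median lands on the paper level.
med = np.median(block)

# uint8 minus a float median gives a float array, so there is no wraparound.
binary = np.where(block - med < THRESHOLD, 255, 0).astype(np.uint8)
```

The grid sits only 15 levels above the median and is turned white, while the ink sits 160 levels above it and stays black. This only holds while the grid is fainter than `THRESHOLD` relative to the paper, which matches the caveat about grids that are too dark relative to the ink.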
scottt
  • If I run the code with the original file in the OP then I get the error "IndexError: index 336 is out of bounds for axis 0 with size 336". The line "out_image[block_idx]=...." causes the error. – granular bastard May 13 '23 at 16:33
  • It seems that newer versions of numpy expect `block_idx` to be a tuple. I've updated the answer so that `get_block_index` returns a tuple – scottt May 15 '23 at 15:08
1

OpenCV has a tutorial dealing with removing grid from an image:

"Extract horizontal and vertical lines by using morphological operations", OpenCV documentation, source : https://docs.opencv.org/master/dd/dd7/tutorial_morph_lines_detection.html

coppensg
  • 5
    Could you include the relevant parts of the linked resource to your answer? As is, your answer is very susceptible to link rot (i.e. if the linked resource changes or disappears, your answer is not helpful). – mech Jan 29 '18 at 00:05
  • The only problem I have is that fraction bar would also be removed (when extracting horizontal lines). – Xai Nano Jan 29 '18 at 13:11
  • 1
    Is there any way to preserve fraction bars? – Xai Nano Jan 29 '18 at 13:34
0

This is a pretty difficult task. I also had this problem and I discovered that the solution can't be 100% accurate. BTW, just a few days ago I saw this link. Maybe it could help.

lucians