How do I extract numbers from an image, row by row?

Question

After preprocessing an image of a Sudoku board (taken from the web) with OpenCV, I managed to get the following picture:

Looping through the contours and extracting each value with pytesseract and --psm 10 (single character) resulted in junk values.

So I would like to slice the image into rows and extract the values with --psm 6, hoping that works better.

The approach I took is simply NumPy-slicing each row and trying to extract its values. It doesn't work: after the first iteration I get SystemError: tile cannot extend outside image, although I'm sure the slicing occurs inside the image.

y = 1
for x in range(1, 9):
     cropped_row = mask[y*33-33:y*33-1][x*33-33:x*33-1]
     text = tess.image_to_string(np.array(cropped_row), config='--psm 6')
     y += 1
     print(text)

I would like some guidance on the correct approach to OCRing rows from the image.

I'm surprised you didn't fare better with `psm=10`. Did you include a border and set the dpi sensibly? — Mark Setchell, Sep 26 '20 at 12:17
If mask is a NumPy array, it should be mask[y*33-33:y*33-1, x*33-33:x*33-1]. You can also loop over x and y at the same time: for x, y in zip(range(1, 9), range(1, 9)) — Maciek Woźniak, Sep 26 '20 at 14:17
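
The last comment points at the likely cause: mask[a:b][c:d] applies both slices to the row axis, so from the second iteration on the result is empty, and an empty image is the kind of input that can trigger "tile cannot extend outside image". Here is a minimal sketch of the difference, assuming a 297 x 297 mask made of 33-pixel cells. The cell size comes from the question; the all-zero array and the helper name row_strips are made up for illustration.

```python
import numpy as np

CELL = 33  # cell size in pixels, taken from the question

# Stand-in for the preprocessed board: 9 x 9 cells of 33 px each (assumed size).
mask = np.zeros((9 * CELL, 9 * CELL), dtype=np.uint8)

# Chained slicing: the second [] slices the rows again, not the columns.
chained = mask[CELL:2 * CELL - 1][CELL:2 * CELL - 1]

# Tuple indexing: the first slice selects rows, the second selects columns.
cell = mask[CELL:2 * CELL - 1, CELL:2 * CELL - 1]


def row_strips(image, cell_size=CELL):
    """Yield one full-width strip per board row (hypothetical helper)."""
    for top in range(0, image.shape[0], cell_size):
        yield image[top:top + cell_size, :]


strips = list(row_strips(mask))
print(chained.shape, cell.shape, len(strips), strips[0].shape)
# (0, 297) (32, 32) 9 (33, 297)
```

Each strip could then be passed to tess.image_to_string(strip, config='--psm 6'). Note also that range(1, 9) in the question covers only eight of the nine rows; range(1, 10) would be needed to reach the last one.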

score 0 · Answer 1 · edited Sep 26 '20 at 19:04

I have tried this:

import pytesseract

# otsu: presumably the Otsu-thresholded board image (its creation is not shown here)
custom_oem_psm_config = r'--oem 3 --psm 6 -c tessedit_char_whitelist="0123456789"'
text = pytesseract.image_to_string(otsu, config=custom_oem_psm_config)
print(text)

Output:

If you want the exact positions of the numbers, slice out each number with NumPy, sort the slices from left to right and top to bottom, then pass each one to Tesseract.
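
The sorting step can be sketched in plain Python. This is only an illustration under assumptions: the boxes are (x, y, w, h) tuples such as cv2.boundingRect returns, the cell size is 33 pixels as in the question, and the function name reading_order and the sample boxes are made up.

```python
def reading_order(boxes, cell_size):
    """Sort (x, y, w, h) boxes top to bottom, then left to right.

    Boxes are grouped into rows by the row index of their vertical centre,
    so small vertical jitter inside a row does not change the order.
    """
    def key(box):
        x, y, w, h = box
        row = int((y + h / 2) // cell_size)
        return (row, x)

    return sorted(boxes, key=key)


# Made-up boxes on a grid with 33 px cells, deliberately out of order.
boxes = [(70, 36, 12, 18), (5, 4, 12, 18), (38, 2, 12, 18), (3, 37, 12, 18)]
ordered = reading_order(boxes, cell_size=33)
print(ordered)
# [(5, 4, 12, 18), (38, 2, 12, 18), (3, 37, 12, 18), (70, 36, 12, 18)]
```

Sorting on the raw (y, x) pair instead would put (38, 2, ...) ahead of (5, 4, ...) because it sits two pixels higher, which is why the sketch buckets boxes into rows first.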

score 0 · Answer 2 · answered Sep 26 '20 at 17:00

In the end I took a slightly different approach, as explained by natancy in this answer.

I focused on the grid lines and removed all values so that findContours() would locate all the grid cells.

Then I looped through all the contours and checked whether each one is a cell (by size) or some other contour. If it is a cell, a mask makes only the current cell visible, along with its value, via bitwise_and(original_image, mask). That way I get a blank image containing a single number, which I run through Tesseract. After some text cleaning I got my desired output.

Extraction of numbers:

import cv2
import numpy as np
import pytesseract as tess  # assumed alias, matching tess.image_to_string in the question

# processed, contours, grid_size_pixels and WHITE (presumably 255) are defined earlier in the project
list_of_clues = []
for contour in contours:
    extracted_value = ''

    # skip contours that aren't cell-sized
    area = cv2.contourArea(contour)
    if not 700 <= area <= 1000:
        continue

    # create a black mask and fill the current cell's contour with white
    mask = np.zeros(processed.shape, dtype=np.uint8)
    cv2.drawContours(mask, [contour], -1, WHITE, -1)

    # keep only the current cell, then paint everything outside it white (for tess)
    isolated_cell = cv2.bitwise_and(processed, mask)
    isolated_cell[mask == 0] = 255

    # extract text from isolated_cell
    text = tess.image_to_string(isolated_cell, config='--psm 10')

    # clean non-numbers (if several digits are returned, the last one wins)
    for ch in text:
        if ch.isdigit():
            extracted_value = ch

    # calculate cell coordinates only if extracted_value exists
    if extracted_value:
        # relevant for my project: grid coordinates of the extracted value
        x_pos, y_pos, wid, hei = cv2.boundingRect(contour)   # contour's bounding box
        x_coord = int(x_pos // (grid_size_pixels / 9))        # column index (0-8)
        y_coord = int(y_pos // (grid_size_pixels / 9))        # row index (0-8)
        list_of_clues.append(((x_coord, y_coord), int(extracted_value)))
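
To check the result row by row, the ((x_coord, y_coord), value) entries collected in list_of_clues can be laid out on a 9 x 9 board. A small sketch, assuming x_coord is the column and y_coord the row, both 0-based; the function name clues_to_board and the sample clues are made up for illustration.

```python
def clues_to_board(clues, size=9):
    """Place ((x, y), value) clues on a size x size board; 0 marks an empty cell."""
    board = [[0] * size for _ in range(size)]
    for (x_coord, y_coord), value in clues:
        board[y_coord][x_coord] = value
    return board


# Made-up clues in the same format as list_of_clues.
sample_clues = [((0, 0), 5), ((4, 0), 7), ((8, 8), 9)]
board = clues_to_board(sample_clues)

for row in board:
    print(' '.join(str(v) if v else '.' for v in row))
```

Printing the board this way makes a misread or misplaced digit easy to spot against the original image.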
