I have an image that is converted to a NumPy array:

np_image = np.array(Image.open(filename))

I am attempting OCR using pytesseract, but the OCR fails when the text is red or yellow, so I want all text to be black.

I am breaking the image down into snippets, as I know where the individual text elements appear.

As you can see from the code below, I have attempted to find coloured pixels and convert them to black, but it does not work:

snippet = np_image[top: bottom, field.left: field.right].copy()
for row in snippet:
    for pixel in row:
        test = [rgb for rgb in pixel[:-1]]
        test.sort()
        if test[0] > 190 and test[2] < 100:
            pixel = [0, 0, 0, 255]

text = pytesseract.image_to_string(snippet)

What should I do?

Psionman

1 Answer

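pixel = [0, 0, 0, 255] does not write to the image. It only rebinds the local variable pixel to a new list [0, 0, 0, 255]. Assuming np_image is a NumPy array, you need to write pixel[:] = [0, 0, 0, 255] so that the underlying array is modified. Note that snippet is created with .copy(), so the change lands in snippet (which is what is passed to pytesseract), not in np_image.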
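
To make the difference concrete, here is a minimal sketch on a small made-up RGBA array (the name image and its contents are illustrative only, not taken from the question). It compares rebinding the loop variable with assigning through a slice:

```python
import numpy as np

# Hypothetical 2x2 RGBA image, every channel set to 255 (opaque white).
image = np.full((2, 2, 4), 255, dtype=np.uint8)

# Rebinding: `pixel` is pointed at a new list, the array is untouched.
for row in image:
    for pixel in row:
        pixel = [0, 0, 0, 255]
unchanged = bool((image == 255).all())

# Slice assignment: `pixel` is a view into the array, so writing
# through `pixel[:]` changes the array itself.
for row in image:
    for pixel in row:
        pixel[:] = [0, 0, 0, 255]
changed = bool((image[..., :3] == 0).all())

print(unchanged, changed)
```

Both flags should come out true: the first loop leaves the array as it was, and the second one turns every pixel black.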

Moreover, sort orders the values in increasing order, so test[0] <= test[2] always holds. The condition therefore looks wrong: if test[0] > 190 is true, then test[2] < 100 cannot be true as well, so the condition can never be true.
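
If the intent was "the largest channel is above 190 and the smallest is below 100" (an assumption on my part, since that is what the swapped comparison would test), the loop can be replaced by a boolean mask. The function name blacken_coloured_pixels and the demo array below are hypothetical, and the thresholds are simply the ones from the question:

```python
import numpy as np

def blacken_coloured_pixels(snippet, high=190, low=100):
    """Return a copy of an RGBA array with strongly coloured pixels set to opaque black."""
    rgb = snippet[..., :3]  # ignore the alpha channel
    # A pixel counts as coloured when its brightest channel is high
    # and its darkest channel is low (e.g. pure red or pure yellow).
    mask = (rgb.max(axis=-1) > high) & (rgb.min(axis=-1) < low)
    result = snippet.copy()
    result[mask] = [0, 0, 0, 255]
    return result

# Demo row: red, yellow, white, black.
demo = np.array(
    [[[255, 0, 0, 255], [255, 255, 0, 255], [255, 255, 255, 255], [0, 0, 0, 255]]],
    dtype=np.uint8,
)
out = blacken_coloured_pixels(demo)
print(out[0].tolist())
```

With these thresholds the red and yellow pixels become black while the white background is left alone. The result can then be passed to pytesseract.image_to_string in place of the original snippet.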

Jérôme Richard
  • For anyone looking for an answer to a similar question, I now realise that this is the WRONG approach to improving accuracy in pytesseract OCR. [See this SO question and answers](https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy) – Psionman Jan 10 '22 at 12:34