I am currently trying to improve an OCR routine. The text I encounter is white on a varying background, so I'm thinking of changing the pure white text to black and everything else to white. Everything works fine until I need to invert the colours.
The invert method from PIL (ImageOps.invert) doesn't support this image mode, so I have to convert first, but I get bad results from it.
OSError: not supported for this image mode
My test image is this:
Which I can turn into:
But when I try to convert, invert, and convert back, the colours/grayscale come back.
So, currently, I can't find a way to get the result I want:
If I run the OCR on the image with white text, I only get "Lampent used BS] gL [LL =e". But it reads perfectly fine with black text.
What is another way I can invert my image? The only other approaches I found change one pixel at a time, with no good guidance for a beginner coder.
def readimg(image, write=False):
    import pytesseract
    from PIL import Image

    # opening an image from the source path
    if isinstance(image, str):
        img = Image.open(image)
        img = img.convert('RGBA')
    else:
        img = image
        img = img.convert('RGB')  # Worse results if not reconverted??
    img.show()
    # path where the tesseract module is installed
    pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
    # converts the image to result and saves it into result variable
    result = pytesseract.image_to_string(img)
    # Catch some garbage text (assumed to be the trailing form feed Tesseract appends)
    if "\x0c" in result:
        result = result[:-2]
    result = result.strip()  # Clean newlines
    # write text in a text file and save it to source path
    if write:
        with open('output.txt', mode='a') as file:
            file.write(result)
    print(result)
    return result

def improve_img(image):
    from PIL import Image, ImageOps

    if isinstance(image, str):  # if link, open it
        im = Image.open(image)
    else:
        im = image
    im = im.convert('RGBA')
    thresh = 254  # https://stackoverflow.com/questions/9506841/using-python-pil-to-turn-a-rgb-image-into-a-pure-black-and-white-image
    fn = lambda x: 255 if x > thresh else 0
    r = im.convert('L').point(fn, mode='1')
    #r = im.convert('RGB')
    #r = ImageOps.invert(r)
    #r = im.convert('L')
    #r.save("test.png")
    #r.show()
    return r

if __name__ == '__main__':
    test = improve_img('img/testtext1.png')
    readimg(test)
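
One possible way around the error, sketched under a few assumptions: since the threshold step already maps every pixel through a lookup function, the two output values can simply be swapped there, so no separate invert call is needed. Alternatively, converting to mode 'L' before calling ImageOps.invert should avoid the unsupported-mode problem, because 'L' is one of the modes that function accepts. The helper names `white_text_to_black` and `invert_via_grayscale` are hypothetical, and the default threshold of 254 is just the value used in the code above. I have not checked how much either variant improves the OCR output.

```python
from PIL import Image, ImageOps


def white_text_to_black(im, thresh=254):
    """Hypothetical helper: pixels brighter than `thresh` become black (0),
    everything else becomes white (255). Returns a mode 'L' image."""
    gray = im.convert('L')
    # Same idea as the threshold lambda above, with the two outputs swapped,
    # so the result is already "inverted" and ImageOps.invert is not needed.
    return gray.point(lambda x: 0 if x > thresh else 255)


def invert_via_grayscale(im):
    """Hypothetical helper: convert to 8-bit grayscale first, then invert.
    Mode 'L' is supported by ImageOps.invert."""
    return ImageOps.invert(im.convert('L'))
```

With these, something like `readimg(white_text_to_black(Image.open('img/testtext1.png')))` would pass a black-on-white image to the OCR step. Staying in mode 'L' rather than mode '1' is a deliberate choice in this sketch, since 'L' is accepted by more PIL operations.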