I am trying to do OCR on an image file in Python using Tesseract-OCR (via pytesseract). My environment is Python 3.5 (Anaconda) on a Windows machine.
Here is the code:
from PIL import Image
from pytesseract import image_to_string
out = image_to_string(Image.open('sample.png'))
The error I am getting is:
File "Anaconda3\lib\site-packages\pytesseract\pytesseract.py", line 167, in image_to_string
return f.read().strip()
File "Anaconda3\lib\encodings\cp1252.py", line 23, in decode
return codecs.charmap_decode(input, self.errors, decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 1583: character maps to <undefined>
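As far as I can tell, the same error can be reproduced without Tesseract at all. The sketch below is only my assumption about what is happening: the OCR output is written as UTF-8, and the file is then read back with the Windows default encoding (cp1252), which has no character mapped to byte 0x81. The sample text, the temporary file, and the read_text helper are made up for illustration and are not part of pytesseract.

```python
import os
import tempfile

# 'Á' (U+00C1) is encoded in UTF-8 as the two bytes C3 81.
# Byte 0x81 has no mapping in Python's cp1252 codec.
data = "Á sample".encode("utf-8")

# Write the raw bytes to a temporary file (stand-in for the OCR output file).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "wb") as f:
    f.write(data)


def read_text(path, encoding):
    """Hypothetical helper: read a text file with an explicit encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read().strip()


try:
    read_text(path, "cp1252")  # mimics the default encoding on my Windows machine
    cp1252_failed = False
except UnicodeDecodeError as exc:
    cp1252_failed = True
    print(exc)

# Reading the same bytes as UTF-8 succeeds.
utf8_text = read_text(path, "utf-8")
print(utf8_text)

os.remove(path)
```

If this assumption is right, it would also explain why the platform matters: the default text encoding comes from the system locale, so the same read can succeed on a machine whose default is UTF-8.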
I have tried the solution mentioned here, but the hack is not working.
The same code works when I run it on Mac OS.
I have also looked into the pytesseract issues, and there is an open issue for this.
Thanks