I recently installed Pytesseract and to make sure it works I ran the following test/code:

    from PIL import Image
    from pytesseract import image_to_string

    print(image_to_string(Image.open('test.tiff')))

I see Tesseract load up from CMD, and after it's done doing its thing it closes down. Afterwards, the Python shell prints out the contents of 'test.tiff'. Great, it works... or so I thought. The issue is that when I run the test again on another TIFF file, 'test2.tiff', I get the following error:

    Traceback (most recent call last):
      File "C:\Users\Freeware Sys\Desktop\OTF.py", line 22, in <module>
        print(image_to_string(Image.open('test2.tiff')))
      File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pytesseract\pytesseract.py", line 193, in image_to_string
        return run_and_get_output(image, 'txt', lang, config, nice)
      File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pytesseract\pytesseract.py", line 130, in run_and_get_output
        temp_name, img_extension = save_image(image)
      File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\pytesseract\pytesseract.py", line 86, in save_image
        image.save(input_file_name, format=img_extension, **image.info)
      File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\PIL\Image.py", line 1935, in save
        save_handler(self, fp, filename)
      File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\PIL\TiffImagePlugin.py", line 1535, in _save
        raise IOError("encoder error %d when writing image file" % s)
    OSError: encoder error -2 when writing image file

That's weird. So I tried adding the extra pytesseract quickstart code, because maybe pytesseract isn't calling Tesseract.

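    import pytesseract  # needed so the tesseract_cmd assignment below resolves
    from PIL import Image
    from pytesseract import image_to_string

    # Point pytesseract at the Tesseract executable explicitly
    pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'

    print(image_to_string(Image.open('test2.tiff')))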

It still doesn't work. The funny thing is, if I run Tesseract directly from CMD and push 'test2.tiff' through it, it does work. Does anyone know what is going on?

reimagepy

1 Answer

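Use the convert method when opening the image. The traceback shows the error is raised by Pillow while pytesseract saves a temporary copy of the image (in save_image), not by Tesseract itself, which would explain why running Tesseract directly from CMD works.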

    from PIL import Image
    from pytesseract import image_to_string

    # Convert to RGBA before handing the image to pytesseract
    image = Image.open('Book.tif').convert("RGBA")
    text = image_to_string(image, lang='eng')
    print(text)

Reference: https://stackoverflow.com/a/52115274/3728540
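
If you only want to convert the files that actually fail, something like the following might work. It is a rough sketch, not part of pytesseract: `ocr_with_fallback` and `read_text` are hypothetical helper names, and it assumes the failure surfaces as an `OSError`, as in the traceback above. It tries OCR on the image as opened and retries on a converted copy only if that error is raised.

```python
def ocr_with_fallback(image, ocr, fallback_mode="RGBA"):
    """Run ocr(image); if it raises OSError, retry on a converted copy.

    `image` is anything with a Pillow-style convert(mode) method and
    `ocr` is a callable such as pytesseract.image_to_string.
    """
    try:
        return ocr(image)
    except OSError:
        # Assumption: the OSError is the "encoder error" raised while the
        # image is being re-saved, so a converted copy may save cleanly.
        return ocr(image.convert(fallback_mode))


def read_text(path):
    """Hypothetical wrapper; needs Pillow, pytesseract and Tesseract installed."""
    from PIL import Image
    from pytesseract import image_to_string

    with Image.open(path) as image:
        return ocr_with_fallback(image, image_to_string)
```

With the libraries installed, `print(read_text('test2.tiff'))` would replace the one-liner from the question. Passing the OCR function in as an argument keeps the retry logic separate from pytesseract, so it can be checked without Tesseract present.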

shiv