6

I get this error: [WinError 2] The system cannot find the file specified, but only when I use pytesseract to do OCR. Here is my code snippet:

from PIL import Image
from pytesseract import *
image = Image.open('pranav.jpg')
print(image_to_string(image))

When I only use PIL, for example to change the size of the image, I do not get this error.

Anand S Kumar
Pranav V

6 Answers

11

You don't have to edit any pytesseract files. You can declare the path to your Tesseract installation inside your code like so:

import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'
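
If you are not sure whether Tesseract is already on your PATH, a small check before setting tesseract_cmd can help. The following is only a sketch: find_tesseract is a hypothetical helper (not part of pytesseract), and the candidate path is the install location used in this answer, which may differ on your machine.

```python
import os
import shutil


def find_tesseract(candidates, which=shutil.which):
    """Return a usable path to the Tesseract executable, or None.

    Hypothetical helper: first looks on PATH, then tries each candidate,
    with and without the Windows ".exe" suffix.
    """
    on_path = which("tesseract")
    if on_path:
        return on_path
    for candidate in candidates:
        for path in (candidate, candidate + ".exe"):
            if os.path.isfile(path):
                return path
    return None


if __name__ == "__main__":
    # Assumed install location, taken from the answer above.
    found = find_tesseract(["C:/Program Files (x86)/Tesseract-OCR/tesseract"])
    print("Tesseract executable:", found)
```

If the helper returns a path, you can assign it to pytesseract.pytesseract.tesseract_cmd. If it returns None, Tesseract is probably not installed, which would also explain the WinError 2.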
Aurora0001
elevated
4

I got the same error. You have to install tesseract from here: https://code.google.com/p/tesseract-ocr/downloads/detail?name=tesseract-ocr-setup-3.02.02.exe&

Then you have to edit the pytesseract.py file. In my case, this file is located in the folder:

C:\Users\USERNAME\AppData\Roaming\Python34\site-packages\pytesseract\pytesseract.py

Search for the following lines (for me it's line 60):

# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'tesseract'

and change it to the location of your tesseract.exe. In my case, the line looks like this:

# CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
tesseract_cmd = 'c:\\Program Files (x86)\\Tesseract-OCR\\tesseract'

Now your code should work.
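
If you do not know where pytesseract.py lives on your machine, Python can tell you. This is a rough sketch: locate_pytesseract_source is a hypothetical helper name, and it assumes the package layout described above (a pytesseract folder containing pytesseract.py).

```python
import importlib.util
import os


def locate_pytesseract_source(package="pytesseract"):
    """Return the path of pytesseract.py inside the installed package.

    Hypothetical helper. Returns None if the package is not installed, and
    falls back to the package's own origin file if pytesseract.py is not found.
    """
    spec = importlib.util.find_spec(package)
    if spec is None:
        return None
    for folder in spec.submodule_search_locations or []:
        candidate = os.path.join(folder, "pytesseract.py")
        if os.path.isfile(candidate):
            return candidate
    return spec.origin


if __name__ == "__main__":
    print(locate_pytesseract_source())
```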

Maecky
0

Add the Tesseract installation folder to the PATH environment variable.

At least that's how I fixed it.
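
If you cannot change the system settings, the same idea can be applied to the current Python process only. This is a minimal sketch: prepend_to_path is a hypothetical helper, the folder is an assumed install location, and the change lasts only while the script runs.

```python
import os


def prepend_to_path(directory, env=None):
    """Put `directory` at the front of PATH unless it is already listed.

    Hypothetical helper. `env` defaults to os.environ, so the change applies
    to this process and to programs it launches.
    """
    env = os.environ if env is None else env
    parts = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory not in parts:
        parts.insert(0, directory)
    env["PATH"] = os.pathsep.join(parts)
    return env["PATH"]


if __name__ == "__main__":
    # Assumed install folder; adjust to your own installation.
    prepend_to_path(r"C:\Program Files (x86)\Tesseract-OCR")
    print(os.environ["PATH"].split(os.pathsep)[0])
```

Call this before the first image_to_string call so that the tesseract command can be resolved.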

Snow
0
  1. You can download tesseract from here: https://github.com/UB-Mannheim/tesseract/wiki

    The latest installers can be downloaded here: tesseract-ocr-setup-3.05.01.exe and tesseract-ocr-setup-4.0.0-alpha.20180109.exe (experimental). There are also older versions available.

  2. Edit your pytesseract.py, e.g. C:\Users\USER\Anaconda3\Lib\site-packages\pytesseract.py:

    # CHANGE THIS IF TESSERACT IS NOT IN YOUR PATH, OR IS NAMED DIFFERENTLY
    tesseract_cmd = r'c:\Program Files (x86)\Tesseract-OCR\tesseract'

  3. Add the following statement to your code after import pytesseract:

    pytesseract.pytesseract.tesseract_cmd = r'c:\Program Files (x86)\Tesseract-OCR\tesseract'
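
One thing to watch with Windows paths like these: in a normal Python string literal, a backslash followed by t is a tab character, so a path ending in \tesseract can silently point somewhere else. The short illustration below uses a made-up path (C:\temp\tesseract) chosen only to show the effect.

```python
# A made-up path, used only to demonstrate the escaping problem.
plain = 'C:\temp\tesseract'      # both "\t" sequences become tab characters
raw = r'C:\temp\tesseract'       # raw string: backslashes are kept as written
escaped = 'C:\\temp\\tesseract'  # doubled backslashes: same result as raw
forward = 'C:/temp/tesseract'    # forward slashes are generally accepted on Windows too

print(repr(plain))
print(repr(raw))
print(raw == escaped)
```

Any of the last three spellings avoids the problem; the plain one does not.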

Yili Shih
0

Set tesseract_cmd, pytesseract.pytesseract.tesseract_cmd, TESSDATA_PREFIX and tessdata_dir_config as follows:

from PIL import Image
import pytesseract
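# Note: pytesseract itself only reads pytesseract.pytesseract.tesseract_cmd and the
# config argument; the plain tesseract_cmd and TESSDATA_PREFIX variables are ordinary
# Python variables in this script.
tesseract_cmd = 'D:\\Softwares\\Tesseract-OCR\\tesseract'
pytesseract.pytesseract.tesseract_cmd = 'D:\\Softwares\\Tesseract-OCR\\tesseract'
TESSDATA_PREFIX = 'D:\\Softwares\\Tesseract-OCR'
tessdata_dir_config = '--tessdata-dir "D:\\Softwares\\Tesseract-OCR\\tessdata"'
print(pytesseract.image_to_string(Image.open('D:\\ImageProcessing\\f2.jpg'), lang='eng', config=tessdata_dir_config))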
Sourabh Potnis
0

To get rid of the error completely, follow these steps:

  1. Download Tesseract (32-bit or 64-bit).
  2. Install it and take note of the installation path.
  3. Create an environment variable {tesseract = "path of installation/tesseract.exe"}.
  4. Restart the kernel.
  5. Use the following code:
import pytesseract
from PIL import Image

# Full path to the Tesseract executable (no space before the file name)
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract.exe'

value = Image.open("C:/Profile_tess.png")

text = pytesseract.image_to_string(value)
print("text present in images:", text)
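
The code above hard-codes the path rather than using the environment variable from step 3. If you would rather read that variable, something like the following sketch might work. tesseract_cmd_from_env is a hypothetical helper, and the variable name "tesseract" is simply the one suggested in step 3.

```python
import os


def tesseract_cmd_from_env(env=None, var_name="tesseract", default="tesseract"):
    """Return the Tesseract command stored in an environment variable.

    Hypothetical helper. Falls back to `default` (plain "tesseract", which
    relies on PATH) when the variable is missing or empty.
    """
    env = os.environ if env is None else env
    value = env.get(var_name, "").strip().strip('"')
    return value or default


if __name__ == "__main__":
    print(tesseract_cmd_from_env())
```

The returned value can then be assigned to pytesseract.pytesseract.tesseract_cmd in place of the hard-coded string.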
Das_Geek