
I am trying to use pytesseract for OCR (extracting text from an image). I installed pytesseract successfully with the command:

pip install pytesseract

When I try to install it again, pip reports:

Requirement already satisfied (use --upgrade to upgrade): pytesseract in ./site-packages

This means pytesseract is installed successfully. But when I try to import the package in my IPython notebook using:

import pytesseract

it throws an error:

ImportError: No module named pytesseract

Why is that happening?

Thomas K
ComplexData

1 Answer


To use Python-tesseract (which requires Python 2.5+ or Python 3.x), first install the Pillow (PIL) and pytesseract packages through pip:

pip install Pillow
pip install pytesseract
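
If pip says the package is installed but the import still fails, one possible cause is that the notebook runs a different Python interpreter than the one pip installed into. Here is a minimal sketch for checking that from inside the notebook. It assumes Python 3, and `check_modules` is a hypothetical helper name, not part of pytesseract:

```python
import importlib.util
import sys


def check_modules(names):
    """Return {module_name: bool} telling whether each top-level module
    can be found by the interpreter that is running this code."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Show which interpreter the notebook (or script) is actually using.
print("Interpreter:", sys.executable)

# Report whether the two packages are visible to that interpreter.
for name, found in check_modules(["pytesseract", "PIL"]).items():
    print(name, "found" if found else "NOT found")
```

If a module is reported as not found, running `python -m pip install pytesseract` with the interpreter printed above should install it into the matching environment.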

Then download and install the Tesseract OCR engine:

https://sourceforge.net/projects/tesseract-ocr-alt/?source=typ_redirect

As far as I know, the installer automatically adds it to your PATH variable.
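
Since I am not certain about the PATH step, here is a small sketch for verifying it. It uses only the standard library (Python 3.3+); `find_tesseract` is a hypothetical helper name, and the executable name `tesseract` is an assumption that may differ on your system:

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    # Not on PATH: pytesseract.pytesseract.tesseract_cmd must be set by hand.
    print("tesseract was not found on PATH")
else:
    print("tesseract found at", path)
```

If nothing is found, set `pytesseract.pytesseract.tesseract_cmd` to the full path of the executable.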

Then use it like this:

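import pytesseract
from PIL import Image

# Point pytesseract at the Tesseract executable (only needed if it is not on PATH).
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files (x86)\Tesseract-OCR\tesseract.exe'

# Open the image and print the recognized text.
img = Image.open('Capture.PNG')
print(pytesseract.image_to_string(img))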

I hope it helps :)

ajlaj25