12

I have installed tesseract in Google Colab using the command

!pip install tesseract

But when I run the command

text = pytesseract.image_to_string(Image.open('cropped_img.png'))

I get the following error:

TesseractNotFoundError: tesseract is not installed or it's not in your path

Krishna
Prosenjit
  • Possible duplicate of [Tesseract Not Found Error](https://stackoverflow.com/questions/50655738/tesseract-not-found-error) – Antwane Nov 05 '18 at 12:41

6 Answers

16

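Set pytesseract.pytesseract.tesseract_cmd to the full path of the tesseract executable (not the pytesseract script). On Colab, after installing the engine with apt, that is normally:

pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'

Run !which tesseract to confirm the path on your machine. This should solve the TesseractNotFoundError.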
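
If you are unsure where the executable lives, a minimal sketch like the one below looks it up on PATH and hands the result to pytesseract. The helper name find_tesseract is hypothetical and only for illustration; shutil.which and pytesseract.pytesseract.tesseract_cmd are the real pieces. It assumes the engine has already been installed with apt.

```python
import shutil


def find_tesseract(candidates=("tesseract",)):
    """Return the full path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


tesseract_path = find_tesseract()

if tesseract_path is None:
    # The engine itself is missing; pip cannot install it.
    print("tesseract not found on PATH; try: !apt install tesseract-ocr")
else:
    try:
        import pytesseract

        # Point the wrapper at the executable that was actually found.
        pytesseract.pytesseract.tesseract_cmd = tesseract_path
        print("pytesseract will use:", tesseract_path)
    except ImportError:
        print("tesseract found at", tesseract_path, "but pytesseract is not installed")
```

If this prints the "not found" message, setting tesseract_cmd by hand will not help until the engine is installed.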

Standerwahre
11

There could be a number of reasons for this, but normally it is because the Tesseract engine itself is not installed. pytesseract is only a Python wrapper that calls the tesseract executable, so installing it is only half of the solution.

You need to install both the tesseract package for Linux and the Python wrapper.

Install the engine first:

! apt install tesseract-ocr
! apt install libtesseract-dev

The above installs the engine that pytesseract depends on. The leading ! is important: it tells Colab to run the line as a shell command on the underlying operating system rather than as Python, and without it the install will not happen.
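
To confirm that the apt step really put the engine on the machine, you can ask the executable for its version. This is a rough sketch; the function name tesseract_version is made up for this example, and it relies only on the standard library and on the tesseract --version flag.

```python
import shutil
import subprocess


def tesseract_version():
    """Return the first line of `tesseract --version`, or None if the engine is missing."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run([exe, "--version"], capture_output=True, text=True)
    # Some releases print the version to stderr rather than stdout, so check both.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


version = tesseract_version()
print(version or "tesseract is not installed or not on PATH")
```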

The remainder of the process is relatively simple:

! pip install Pillow
! pip install pytesseract

This installs the Python binding.

After that, all you need to do is import:

import pytesseract
from PIL import ImageEnhance, ImageFilter, Image

Then call pytesseract.image_to_string on an opened image, as in the question.
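
Putting the pieces together, a hedged end-to-end sketch might look like this. The function names ocr_image and environment_ready are hypothetical, and cropped_img.png is simply the file name from the question; the script only runs OCR when the engine, the wrapper, and the image are all present.

```python
import os
import shutil


def environment_ready():
    """Return True if the tesseract executable, pytesseract, and Pillow are all available."""
    if shutil.which("tesseract") is None:
        return False
    try:
        import pytesseract  # noqa: F401
        import PIL  # noqa: F401
    except ImportError:
        return False
    return True


def ocr_image(path):
    """Return the text recognised in the image at `path`."""
    # Imported here so environment_ready() can run even when these packages are missing.
    import pytesseract
    from PIL import Image

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


image_path = "cropped_img.png"  # file name taken from the question

if environment_ready() and os.path.exists(image_path):
    print(ocr_image(image_path))
else:
    print("Install tesseract-ocr and pytesseract, and check the image path, before running OCR.")
```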

Hopefully this helps someone.

Srivats Shankar
6

You have to install the Tesseract engine before you can use the pytesseract wrapper. You can install the engine on Google Colab using:

!sudo apt install tesseract-ocr

You can find a sample at:

https://github.com/labdeeman7/document-ocr/blob/master/classification%20via%20NLP%20and%20information%20extraction.ipynb

Payam Khaninejad
alabi tosin
1

You'll need to install pytesseract rather than tesseract.

Here's an example:

https://colab.research.google.com/drive/1zduW1Hxv7Z_pwMFGjVauhs1dTlvZByCy

Bob Smith
  • I have installed pytesseract with the command "!pip install pytesseract" and am still getting the same error. Can you please provide an example of reading an image with pytesseract in the above notebook? – Prosenjit Aug 06 '18 at 18:53
0

!sudo apt install tesseract-ocr

!pip install pytesseract

Run these two commands in a Colab cell before using pytesseract. It worked for me.

Parmesh
0

First, run this in a cell:

!pip3 install pytesseract

Then restart the runtime and run this in another cell:

!apt install tesseract-ocr

It worked for me.

Esraa Abdelmaksoud