Questions tagged [pytesser]

PyTesser is an Optical Character Recognition module for Python. It takes as input an image or image file and outputs a string.

PyTesser uses the Tesseract OCR engine: it converts images to a format Tesseract accepts and calls the Tesseract executable as an external program. A Windows executable is provided along with the Python scripts. The scripts should work on other operating systems as well.

http://code.google.com/p/pytesser/
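
The mechanism described in the tag summary above (hand an image file to the Tesseract executable and read back the text it writes) can be sketched roughly as follows. This is a minimal illustration, not PyTesser's actual source: the function names `build_tesseract_command` and `image_file_to_string` are hypothetical, and the sketch assumes a `tesseract` binary on `PATH` and an image file already in a format Tesseract can read.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def build_tesseract_command(image_path, output_base, lang="eng", executable="tesseract"):
    """Return the argument list for `tesseract <image> <outputbase> -l <lang>`.

    Hypothetical helper for illustration; not part of PyTesser.
    """
    return [executable, str(image_path), str(output_base), "-l", lang]


def image_file_to_string(image_path, lang="eng", executable="tesseract"):
    """Run the Tesseract executable on an image file and return the recognized text.

    Hypothetical helper for illustration; not part of PyTesser.
    """
    # Fail early with a clear message if the executable cannot be located.
    if shutil.which(executable) is None:
        raise FileNotFoundError(f"{executable!r} was not found on PATH")

    with tempfile.TemporaryDirectory() as tmp:
        output_base = Path(tmp) / "ocr_output"
        cmd = build_tesseract_command(image_path, output_base, lang, executable)
        # check=True raises CalledProcessError if Tesseract exits with an error.
        subprocess.run(cmd, check=True, capture_output=True)
        # Tesseract appends ".txt" to the output base name.
        return (Path(tmp) / "ocr_output.txt").read_text(encoding="utf-8")
```

Checking for the executable before calling it is a design choice made for this sketch: a missing or unlocatable Tesseract binary otherwise surfaces as a bare "No such file or directory" error from the operating system, which says nothing about which file was missing.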

105 questions
14 votes, 7 answers

OSError: [Errno 2] No such file or directory using pytesser

My problem is this: I want to use pytesser to extract the contents of a picture. My operating system is Mac OS 10.11, and I have already installed PIL, pytesser, the tesseract-ocr engine, and supporting libraries such as libpng. But when I run my…
grant
13 votes, 1 answer

Increase Accuracy of text recognition through pytesseract & PIL

I am trying to extract text from an image. Because the quality and size of the image are poor, the results are inaccurate. I tried a few enhancements and other things with PIL, but that only worsens the quality of the image. Can someone suggest…
sprksh
12 votes, 4 answers

how to get character position in pytesseract

I am trying to get the position of each character in image files using the pytesseract library: import pytesseract from PIL import Image print pytesseract.image_to_string(Image.open('5.png')) Is there a library for getting the position of each character?
9 votes, 5 answers

Highly inconsistent OCR result for tesseract

This is the original screenshot. I cropped the image into 4 parts and cleared the background as far as I could, but tesseract only detects the last column here and ignores the rest. The output from the tesseract…
codefreaK
9 votes, 6 answers

Python: OSError: [Errno 2] No such file or directory

I am using the pytesseract library to extract text from an image. This works fine when I run the code on localhost, but it gives me the above error when I deploy on OpenShift. Below is the code I have written so far. try: import Image except ImportError: from…
Suraj Palwe
8 votes, 3 answers

Getting an error when using the image_to_osd method with pytesseract

Here's my code: import pytesseract import cv2 from PIL import Image pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe" def main(): original = cv2.imread('D_Testing.png', 0) # binary thresh it at…
Bob Stoops
8 votes, 5 answers

Image to text python

I am using Python 3.x and the following code to convert an image into text: from PIL import Image from pytesseract import image_to_string image = Image.open('image.png', mode='r') print(image_to_string(image)) I am getting the following…
muazfaiz
7 votes, 2 answers

Extract text from image using OCR in python

I want to extract text from a specific area of an image, such as the name and ID number from an identity card. The ID card I want to extract text from is in Chinese (a Chinese ID card). I have tried this code but it just extracts the…
Tehseen
7 votes, 2 answers

Reading text from image

Any suggestions on converting these images to text? I'm using pytesseract and it's working wonderfully in most cases except this. Ideally I'd read these numbers exactly. Worst case I can just try to use PIL to determine if the number to the left…
LampShade
7 votes, 1 answer

Importing pytesseract

I have been trying to use pytesseract for OCR (extracting text from an image). I have successfully installed pytesseract using the command pip install pytesseract. When I try to install it again, it clearly says: Requirement already satisfied…
ComplexData
7 votes, 5 answers

leptonica/allheaders.h file not found (gcc error) on install of tesseract-ocr

I am trying to run the following code on my Mac. import Image import pytesseract im = Image.open('test.png') print(pytesseract.image_to_string(im)) Following the question here (pytesseract-no such file or directory error), I need to install…
Jase Villam
6 votes, 6 answers

Error using Pytesser: [WinError 2] The system cannot find the file specified

I get this error: [WinError 2] The system cannot find the file specified, only when I use pytesser to do OCR. Here is my code snippet. from PIL import Image from pytesseract import * image = Image.open('pranav.jpg') print…
Pranav V
5 votes, 2 answers

pytesseract Output is not defined

I am trying to run tesseract from Python. This is my code: import cv2 import os import numpy as np import matplotlib.pyplot as plt import pytesseract import Image # def main(): jpgCounter = 0 for root, dirs, files in…
mbc
5 votes, 1 answer

Using multiple languages in Pytesser

I have started to use Pytesser, which works great with both English and Chinese, but is there a way to have both languages work at the same time? Would I have to make my own traineddata file? My code is: import Image from pytesser import * print…
Dave Lin
4 votes, 1 answer

Pytesseract Improve OCR Accuracy

I want to extract the text from an image in Python, and I have chosen pytesseract to do it. When I tried extracting the text from the image, the results weren't satisfactory. I also went through this and implemented all the techniques listed…
Sushil