
Is there a way to get the trust rate of an OCR output produced by pytesseract? By trust rate I mean the percentage of the OCR output that is correct.

Example:

text = pytesseract.image_to_string(editedImage) 

For this text string I would also like to show the trust rate, if possible.

Edit: I tried image_to_data, but I got an error:

print(pytesseract.image_to_data(Image.open('test.png')))



Traceback (most recent call last):
  File "/usr/lib/python3.4/tkinter/__init__.py", line 1536, in __call__
    return self.func(*args)
  File "/home/caner/Desktop/Met/OCR-METv3/venv/tkgui.py", line 192, in convert
    print(pytesseract.image_to_data(Image.open('test.png')))
  File "/home/caner/Desktop/Met/OCR-METv3/venv/lib/python3.4/site-packages/pytesseract/pytesseract.py", line 232, in image_to_data
    return run_and_get_output(image, 'tsv', lang, config, nice)
  File "/home/caner/Desktop/Met/OCR-METv3/venv/lib/python3.4/site-packages/pytesseract/pytesseract.py", line 142, in run_and_get_output
    with open(filename, 'rb') as output_file:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tess_2mxczh8n_out.tsv' 

1 Answer


My guess is that by trust rate you mean confidence. There is some information about this in the pytesseract module's repository.

Functions

  • image_to_string: Returns the result of a Tesseract OCR run on the image as a string.
  • image_to_boxes: Returns a result containing recognized characters and their box boundaries.
  • image_to_data: Returns a result containing box boundaries, confidences, and other information. Requires Tesseract 3.05+. For more information, please check the Tesseract TSV documentation.
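Because image_to_data requires Tesseract 3.05+, it may be worth checking which Tesseract version is installed. This is only a guess at what could be behind a missing .tsv output file, not a confirmed diagnosis. The sketch below parses the banner printed by `tesseract --version`; the function names are hypothetical helpers, not part of pytesseract.

```python
import re
import subprocess


def parse_tesseract_version(banner):
    """Extract (major, minor) from the banner of `tesseract --version`."""
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("could not find a version number in: %r" % banner)
    return int(match.group(1)), int(match.group(2))


def supports_tsv(version):
    # image_to_data needs Tesseract 3.05 or newer, i.e. (3, 5) and up.
    return version >= (3, 5)


def installed_tesseract_version():
    # Assumes the `tesseract` binary is on PATH. Some releases print the
    # banner to stderr, so both streams are merged to be safe.
    banner = subprocess.check_output(
        ["tesseract", "--version"], stderr=subprocess.STDOUT
    ).decode("utf-8", "replace")
    return parse_tesseract_version(banner)
```

Calling `supports_tsv(installed_tesseract_version())` on the machine that runs the OCR would then tell you whether the TSV output used by image_to_data is available.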

I think what you're looking for is the image_to_data function.
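As a rough sketch of how that could look: image_to_data reports a `conf` value per recognized word (0 to 100, with -1 on rows that are not words). Note that this is Tesseract's own estimate, not a measured correctness percentage. The helper names below are hypothetical, and averaging the word confidences is just one simple way to get a single number for the whole text.

```python
def word_confidences(data):
    """Pair each recognized word with its confidence (0-100).

    `data` is assumed to be the dict returned by
    pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT).
    """
    pairs = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)  # may arrive as a string or a number
        # Skip non-word rows (conf == -1) and empty text entries.
        if conf >= 0 and text.strip():
            pairs.append((text, conf))
    return pairs


def mean_confidence(data):
    """Average word confidence, or None if no words were recognized."""
    pairs = word_confidences(data)
    if not pairs:
        return None
    return sum(conf for _, conf in pairs) / len(pairs)


def ocr_with_confidence(image):
    # Imported here so the helpers above work without Tesseract installed.
    import pytesseract

    data = pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)
    text = " ".join(word for word, _ in word_confidences(data))
    return text, mean_confidence(data)
```

With the image from the question, `text, confidence = ocr_with_confidence(editedImage)` would give the recognized words together with their average confidence.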

neznidalibor
  • I have tried data = pytesseract.image_to_data(editedImage, lang="tur", config='', nice=0, output_type=pytesseract.Output.STRING), but I ran into a problem: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tess_3z2uj7ij_out.tsv' – caner karagüler Feb 18 '18 at 11:22
  • Have you tried this: [link](https://stackoverflow.com/questions/28741563/pytesseract-no-such-file-or-directory-error)? It could just be a missing library. – neznidalibor Feb 18 '18 at 15:26
  • I tried that solution, but it did not work for me. I edited my question to include the whole error message. @neznidalibor – caner karagüler Feb 18 '18 at 17:22
  • Hmmm, that's strange... Have you tried adding it to your PATH variable, as mentioned [here](https://stackoverflow.com/questions/36625207/python-oserror-errno-2-no-such-file-or-directory?noredirect=1&lq=1)? – neznidalibor Feb 18 '18 at 20:34
  • Yes, I tried it, but it did not work either. The problem is with the .tsv files, but I do not know where to find them or how to add them to the project folder. @neznidalibor – caner karagüler Feb 19 '18 at 13:46