Get character-level confidence in Tesseract

Question

I am using Pytesseract for OCR, but the documentation does not seem to offer an option for extracting the confidence of every character. I already have the word-level confidence, but I want to know at which character the confidence drops.

After some research, I learned that the Tesseract API has a function, tesserractExtractResult(), that can return the confidence of individual characters.

How can I use this function in Python?

Similar issue [here](https://stackoverflow.com/questions/48162645/how-to-get-character-wise-confidence-in-tesseract-using-command-line?rq=1) - also no answers. It seems to require source code modification as suggested [here](https://stackoverflow.com/questions/17393555/character-confidence-for-tesseract-3-02-using-config-file) for an older version. — FObersteiner, Aug 26 '19 at 16:30
I added an answer for this (but tesseract not pytesseract) - see https://stackoverflow.com/questions/48162645/how-to-get-character-wise-confidence-in-tesseract-using-command-line?rq=1 — jtlz2, Sep 03 '19 at 07:26
Would you accept a tesseract answer or must it be pytesseract? — jtlz2, Sep 03 '19 at 08:00

score 1 · Answer 1 · answered Aug 28 '19 at 22:07

Pytesseract calls Tesseract in the background as if it were launched from a terminal (see the pytesseract source code), so you have at your disposal only what the shell command can do, and as far as I know, it cannot report character confidence.
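For reference, here is a minimal sketch of what pytesseract does expose: per-word confidence through `image_to_data`. The helper names `word_confidences` and `ocr_word_confidences` are made up for illustration, and the sketch assumes pytesseract, Pillow, and the Tesseract binary are installed. Rows whose confidence is -1 are layout entries (blocks, paragraphs, lines) rather than words, so they are skipped.

```python
def word_confidences(data):
    """Pair each recognized word with its confidence.

    `data` is the dict returned by pytesseract.image_to_data(...,
    output_type=pytesseract.Output.DICT). Confidence values may arrive
    as strings, ints, or floats depending on the pytesseract version.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        conf = float(conf)
        # conf == -1 marks non-word rows (block, paragraph, line)
        if conf >= 0 and text.strip():
            words.append((text, conf))
    return words


def ocr_word_confidences(image_path):
    """Run Tesseract through pytesseract and return (word, confidence) pairs."""
    # Imported lazily so the pure helper above works without these packages.
    import pytesseract
    from PIL import Image

    data = pytesseract.image_to_data(
        Image.open(image_path), output_type=pytesseract.Output.DICT
    )
    return word_confidences(data)
```

Calling `ocr_word_confidences("scan.png")` (a hypothetical file name) would return one pair per word. Nothing finer than word level is available this way.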

I think pyocr should be able to do this, but the function call would need to be added (maybe in tesseract_raw.py?).

Also, as a side note: python-tesseract and pytess both seem to have at least a few lines of code referring to tesseractExtractResult, but their last commits were in 2015 and 2012, respectively.

Thanks for the help. I was able to get it using tesserocr, but all the characters had a confidence of approximately 98, while the word confidence was 32, so it is not much help here. — user3809411, Aug 29 '19 at 08:05
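A rough sketch of the tesserocr approach mentioned in the comment above, iterating the result at symbol level. This is not the commenter's actual code. The function names `symbol_confidences` and `low_confidence_symbols` and the default threshold of 60 are assumptions for illustration, and tesserocr must be installed for the OCR part to run.

```python
def symbol_confidences(image_path):
    """Return (character, confidence) pairs for every recognized symbol."""
    # Imported lazily so the filter below can be used without tesserocr.
    from tesserocr import PyTessBaseAPI, RIL, iterate_level

    results = []
    with PyTessBaseAPI() as api:
        api.SetImageFile(image_path)
        api.Recognize()
        level = RIL.SYMBOL  # iterate character by character
        for r in iterate_level(api.GetIterator(), level):
            results.append((r.GetUTF8Text(level), r.Confidence(level)))
    return results


def low_confidence_symbols(symbols, threshold=60.0):
    """Return (index, character, confidence) for symbols below the threshold."""
    return [
        (i, ch, conf)
        for i, (ch, conf) in enumerate(symbols)
        if conf < threshold
    ]
```

For example, `low_confidence_symbols(symbol_confidences("scan.png"))` (again a hypothetical file name) would list the positions where the character confidence falls below the threshold. As the comment notes, symbol confidences may all come out high even when the word confidence is low, so the two values should not be expected to agree.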
