Improving Tesseract OCR results in French

Asked May 19 '19 at 19:48

Active May 19 '19 at 20:48

I want to perform OCR on an image that I think is fairly clean and "easy" to OCR:

But the result using tesseract is quite bad:

print(pytesseract.image_to_string(Image.open('file-2.jpg'), lang='fra'))

Maintenant ie La QT vieux, lorsque
je parcours un cimetière, j'ai
l'impression de Dares CT
LT TTC

Why is that? Can I improve the result?

When I use an online OCR tool the result is perfect.

Sulli

You should invert the image so that it has dark text on a light background; converting it to grayscale or 1-bit black and white might help further. The more contrast the better. – user3169 May 20 '19 at 05:18
"When I use an online OCR tool the result is perfect." OK, but that's not really helpful unless you know what the OCR tool is doing. – user3169 May 20 '19 at 05:22
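The preprocessing suggested in the first comment could be sketched roughly as follows. This is only an illustration, not a confirmed fix: it assumes the source image has light text on a dark background (as the comment implies), the function names `binarize_inverted`, `preprocess_for_ocr` and `ocr_french` are hypothetical, and the threshold of 128 is an arbitrary starting point that would likely need tuning per image.

```python
def binarize_inverted(pixel, threshold=128):
    """Invert one 8-bit gray value, then snap it to pure black or white."""
    inverted = 255 - pixel  # light-on-dark becomes dark-on-light
    return 255 if inverted >= threshold else 0


def preprocess_for_ocr(path, threshold=128):
    """Load an image and return a high-contrast, dark-on-light version."""
    # Imported lazily so binarize_inverted can be used without Pillow installed.
    from PIL import Image, ImageOps

    gray = ImageOps.grayscale(Image.open(path))  # 8-bit grayscale ("L" mode)
    gray = ImageOps.autocontrast(gray)           # stretch the contrast range
    # Pillow builds a 256-entry lookup table from this function.
    return gray.point(lambda p: binarize_inverted(p, threshold))


def ocr_french(path):
    """Run Tesseract with the French model on the preprocessed image."""
    import pytesseract  # assumes Tesseract and its 'fra' data are installed

    return pytesseract.image_to_string(preprocess_for_ocr(path), lang="fra")
```

Calling `ocr_french('file-2.jpg')` would then replace the one-liner from the question. Whether this matches what the online tool does is unknown, so the output should be compared against the original result before adopting it.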
