
Why is Tesseract OCR engine using a global thresholding technique such as Otsu binarization? Aren't local thresholding techniques (e.g. Sauvola, Niblack, etc.) more effective in leaving out text from images?

πάντα ῥεῖ

2 Answers


Tesseract was used in the Google Books project and, AFAIK, they ran tests to find the best binarization method; Otsu turned out to be the most universal. If Otsu is not the best fit for your case, you can apply a different binarization algorithm before sending the image to Tesseract.
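For reference, Otsu's method picks the single global threshold that maximizes the between-class variance of the image histogram. Here is a minimal NumPy sketch of the idea (an illustration of the algorithm, not Tesseract's actual implementation):

```python
import numpy as np

def otsu_threshold(gray):
    """Return Otsu's threshold for a uint8 grayscale image.

    Otsu's method picks the threshold t that maximizes the
    between-class variance of the background/foreground split.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                 # normalized histogram
    omega = np.cumsum(p)                  # class-0 probability up to t
    mu = np.cumsum(p * np.arange(256))    # cumulative intensity mean
    mu_t = mu[-1]                         # global mean
    # Between-class variance for every candidate threshold t
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    sigma_b = np.nan_to_num(sigma_b)      # 0/0 at the histogram ends
    return int(np.argmax(sigma_b))

# Toy bimodal image: dark "text" square on a bright background
img = np.full((20, 20), 220, dtype=np.uint8)
img[5:15, 5:15] = 30
t = otsu_threshold(img)
binary = (img > t).astype(np.uint8) * 255  # white background, black text
```

On a cleanly bimodal image like this, the threshold lands between the two modes, which is exactly the case where a global method works well.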

user898678
  • See also the Tesseract discussion about flexible or better binarization, especially the tests: https://github.com/tesseract-ocr/tesseract/issues/3083#issuecomment-877663240 – user898678 Jul 20 '21 at 05:50

Basically, which thresholding algorithm to use depends on the input image. Tesseract uses the Otsu method because the images typically fed to it for text extraction are fairly homogeneous, and for such images Otsu is both efficient and good enough.

A global thresholding method is sufficient when the background shows no local variation relative to the foreground (text) intensity. Local thresholding becomes necessary when the intensity difference between background and foreground varies across the image, for example under uneven illumination or shadows.

So, while Tesseract uses the Otsu method (global thresholding) for binarization internally, you can pre-process the image with a local thresholding method to get better output from Tesseract.

vatsal gosar