I'm using Tesseract, but I don't know whether it ignores non-text areas and targets text only. Do I have to remove non-text areas as a preprocessing step to get better output?

rmtheis
chostDevil

1 Answer

Tesseract has a pretty good algorithm for detecting text, but it will occasionally produce false positives on non-text areas.

Ideally, you would pre-process the image before passing it to Tesseract. I worked on a similar task some time ago, so I suggest you take a look at the following material:
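As a rough illustration of what such pre-processing could look like (this is my own minimal sketch, not taken from the material referenced above), one common idea is to binarize the image, find the connected dark blobs, and paint over any blob whose bounding box is too large to be a character. The example below uses only the Python standard library on a tiny grayscale grid. The function names, the fixed threshold of 128, and the size limits are all hypothetical choices for illustration; on a real image you would probably use an image library and tune the limits to your font size.

```python
from collections import deque


def binarize(gray, threshold=128):
    """Return a mask where True marks dark (ink) pixels."""
    return [[px < threshold for px in row] for row in gray]


def connected_components(mask):
    """Yield each 4-connected dark blob as a list of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill starting from this unvisited dark pixel.
            blob, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield blob


def remove_large_blobs(gray, max_height=3, max_width=3, threshold=128, background=255):
    """Return a copy of the image with oversized dark blobs painted over.

    The size limits are assumptions for this toy example, not Tesseract settings.
    """
    cleaned = [row[:] for row in gray]
    for blob in connected_components(binarize(gray, threshold)):
        ys = [y for y, _ in blob]
        xs = [x for _, x in blob]
        height = max(ys) - min(ys) + 1
        width = max(xs) - min(xs) + 1
        if height > max_height or width > max_width:
            for y, x in blob:
                cleaned[y][x] = background
    return cleaned


# Toy 6x8 image: a small glyph-like stroke on the left, a 4x4 solid block on the right.
image = [
    [255, 255, 255, 255, 255, 255, 255, 255],
    [255,   0, 255, 255,   0,   0,   0,   0],
    [255,   0, 255, 255,   0,   0,   0,   0],
    [255, 255, 255, 255,   0,   0,   0,   0],
    [255, 255, 255, 255,   0,   0,   0,   0],
    [255, 255, 255, 255, 255, 255, 255, 255],
]
cleaned = remove_large_blobs(image)
```

Here the small stroke survives and the solid block is replaced by background, so only the text-like region would be handed to the OCR step. A pure size filter is crude: it will also erase very large headline glyphs and keep small non-text specks, so treat it as a starting point.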

Community
karlphillip