OpenCV + Tesseract: improving text detection (credits) in complex images such as scenery

Question

I want to improve text recognition accuracy on somewhat complex images.

I'm currently using the following sample code:

I'm trying to detect the text in this image:

https://i.ytimg.com/vi/WFobUoRn6Ek/maxresdefault.jpg

(Note: this is similar to the kind of footage I'm trying to handle.)

Result:

There are many issues, but the biggest problem for me is that small letters are easily omitted or misrecognized.

e.g. 'i' became 'l'
e.g. 'in' became 'm'
e.g. 'l' can disappear entirely

I think the problem is related to the image produced by erFilter. As this picture shows, some small parts are already missing at this point.

Please let me know if there is a good way to avoid losing such small parts. Perhaps some sort of preprocessing of the image?

Note: I already checked the following post, but my target footage is similar to the 'failure' case example in his paper.

if possible, mis-recognition can be reduced by: 1. gather all potential ambiguous things like i=l, O=0, etc. 2. in your current image, spot all those ambiguous things. 3. find from a dictionary and context: try to find the most likely version. For example it is unlikely that the word is "TRALNING" but more likely to be "TRAINING". Yes, that's a lot of work... In special tasks (like ANPR-redetection) it might be enough to just replace all ambiguous things by a single fixed thing. — Micka, Oct 11 '16 at 13:04
You can pre-process your image like in this post : https://stackoverflow.com/questions/47601528/how-to-split-noise-and-text-from-the-image-for-preprocessing-of-ocr — sixela, Dec 15 '17 at 10:16
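
The dictionary-based correction Micka describes could be sketched roughly as follows. This is only an illustration in plain Python, not part of the OpenCV or Tesseract APIs: the `CONFUSIONS` table, the `candidates` and `correct` helpers, and the tiny word set are all hypothetical and would need to be replaced with the confusions actually observed in the footage and a real dictionary.

```python
from itertools import islice, product

# Hypothetical confusion table: each character the OCR may output maps to the
# fragments it could really stand for (the character itself comes first).
CONFUSIONS = {
    "l": ["l", "i", "I"],
    "L": ["L", "I"],
    "I": ["I", "l"],
    "i": ["i", "l"],
    "0": ["0", "O"],
    "O": ["O", "0"],
    "m": ["m", "in"],
}


def candidates(word, limit=10000):
    """Yield spellings of `word` with ambiguous characters substituted.

    `limit` caps the number of combinations so long words stay cheap.
    """
    options = [CONFUSIONS.get(ch, [ch]) for ch in word]
    for combo in islice(product(*options), limit):
        yield "".join(combo)


def correct(word, dictionary):
    """Return the first candidate found in `dictionary` (case-insensitive).

    If no candidate is a known word, the OCR output is returned unchanged.
    """
    for cand in candidates(word):
        if cand.lower() in dictionary:
            return cand
    return word


if __name__ == "__main__":
    # Stand-in dictionary; a real one would be a full word list.
    words = {"training", "starting"}
    print(correct("TRALNING", words))  # TRAINING
    print(correct("Startmg", words))   # Starting
```

This only repairs substitutions; letters that were dropped completely (the 'l' case in the question) would need an edit-distance search or better preprocessing instead.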

0 Answers