How to recognize text for scene images

Question

I am trying to build a system (in C#) that can recognize text in scene images. Scene text recognition is a challenging task because of low resolution, complex backgrounds, non-uniform lighting, blur, and similar effects.

Any ideas for overcoming this problem would be appreciated.

What have you achieved so far, and which challenges are you currently facing? – Shai, Jan 08 '12 at 11:32
Thanks for your reply. I am just starting from the beginning, and right now I don't know which approach I should take... – vudh, Jan 08 '12 at 11:45
I'm sorry, I cannot attach images to my post yet. I need more reputation first... – vudh, Jan 11 '12 at 09:30

score 0 · Answer 1 · edited Jun 12 '12 at 13:27

I would like to suggest the following papers for an overview of all the techniques proposed in this field:

Jung, K., Kim, K.I., Jain, A.K., 2004. Text information extraction in images and video: A survey. Pattern Recognition 37(5), 977-997.
Liang, J., Doermann, D., Li, H., 2005. Camera-based analysis of text and documents: A survey. International Journal on Document Analysis and Recognition 7(2-3), 83-104.

Although the ultimate goal is to recognize the characters in the scene, finding the text regions and extracting the text from them is more difficult than the character recognition (OCR) step itself.

score 0 · Answer 2 · edited May 23 '17 at 12:29

I suggest that you begin by checking out some open-source text-recognition libraries. See, for example, this thread.

answered Jan 09 '12 at 09:08

nojka_kruva

score 0 · Answer 3 · edited May 23 '17 at 12:12

The Stroke Width Transform (SWT) can be used to extract text from natural images.

See this Stack Overflow page: Stroke Width Transform (SWT) implementation (Java, C#...)

Here's a helpful video: http://videolectures.net/cvpr2010_epshtein_dtns/

answered Jan 11 '12 at 04:23

Rethunk

Thank you for your help, Rethunk. But for now I am focusing only on text recognition, not on text detection as described in the paper. The first step of my problem is how to segment characters in scene images. I tried a binarization method, but it does not help when characters overlap. Any ideas for this case? Many thanks. – vudh Jan 11 '12 at 09:27
Without SWT or a similar algorithm, you will have a hard time distinguishing text from the background in most images unless the contrast is very high. Binarization works okay for black text on a white background. Look into local thresholding techniques. To avoid recreating the wide variety of known algorithms, review the algorithms in the textbook on vision by Gonzalez and Woods, and the survey of OCR techniques in the book Character Recognition Systems by Cheriet, Kharma, Liu, and Suen. There is no short answer to your question if you're trying to develop your own OCR library. – Rethunk Jan 14 '12 at 17:18
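
To illustrate the local thresholding suggestion in the comment above, here is a minimal sketch of one common variant: each pixel is compared against the mean of its neighborhood, computed with a summed-area table. It is not taken from any of the cited books or from a particular library. The function names, the `window` and `offset` parameters, and the synthetic test image are all assumptions made for illustration, and it is written in plain Python for brevity (the same loops should port to C# arrays directly).

```python
def integral_image(img):
    """Summed-area table with one extra zero row and column."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def local_mean_threshold(img, window=15, offset=10):
    """Mark a pixel as text (1) if it is darker than its local mean by more than `offset`.

    `window` and `offset` are illustrative tuning parameters, not standard values.
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    r = window // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        y0, y1 = max(0, y - r), min(h, y + r + 1)
        for x in range(w):
            x0, x1 = max(0, x - r), min(w, x + r + 1)
            area = (y1 - y0) * (x1 - x0)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            # Integer form of: pixel < local_mean - offset
            if img[y][x] * area < total - offset * area:
                out[y][x] = 1
    return out


def global_threshold(img, level=128):
    """Baseline: a single threshold for the whole image."""
    return [[1 if p < level else 0 for p in row] for row in img]


def make_gradient_image(height=20, width=40):
    """Synthetic image: brightness ramps left to right, with three darker vertical strokes."""
    img = [[60 + 4 * x for x in range(width)] for _ in range(height)]
    for x in (5, 20, 35):
        for y in range(5, 15):
            img[y][x] -= 50
    return img


if __name__ == "__main__":
    img = make_gradient_image()
    local = local_mean_threshold(img, window=7, offset=10)
    glob = global_threshold(img, level=128)
    print("local text pixels: ", sum(map(sum, local)))
    print("global text pixels:", sum(map(sum, glob)))
```

On this synthetic image the global threshold labels the whole dark side of the ramp as text and misses the stroke on the bright side, while the local version picks out only the three strokes. Real scene images are much harder, and a local threshold alone will not separate overlapping or touching characters.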
