4

I'm trying to create a piece of software that automates the PC by capturing a screenshot, then running OCR (Optical Character Recognition) on it to find a particular button to click (for example). I've got the mouse and keyboard control part working, but now I need an OCR engine to process the screenshot. What I've discovered is that Tesseract OCR does not seem to work very well with on-screen text: the text is either too small, or some characters appear to be joined together, for example K and X. How should I go about this?

P.S. This is for an automated test program.

Hao Wooi Lim
  • Could you just bump up the text size and tweak the font on the test machine? – Tom Ritter May 22 '09 at 03:23
  • What exactly do you want to test? If it's a simple test program you can query Windows using SendMessage and GetWndText to search for the buttons and controls you like. Why go to the hassle of OCR? – Paulo Santos May 22 '09 at 03:23
  • I can bump up the text size, but some of the fonts inside the application can't be enlarged without modifying code. – Hao Wooi Lim May 22 '09 at 03:27

2 Answers

0

I am not sure if this really fits the bill for you, but some of the better OCR that I have seen in automation is done by Tevron's CitraTest. It has a library of fonts included, and if a font set is not present, they will create a new one based on your submissions. Negative factors with this tool would be cost and the usual issues related to variable screen resolution.

Steven
0

Perhaps look at this question on image enhancement prior to OCR. Otherwise this question is pretty similar to "OCR for .NET".
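To make the enhancement idea concrete, here is a minimal, dependency-free sketch of two common pre-OCR steps: enlarging the image and forcing it to pure black and white. It treats a grayscale screenshot as a list of rows of 0-255 integers. The function names, the scale factor and the threshold of 128 are all illustrative assumptions, not anything prescribed by Tesseract. A real pipeline would more likely use an imaging library with a smoother interpolation, but the shape of the processing is the same.

```python
def upscale_nearest(pixels, factor):
    """Enlarge a grayscale image (list of rows of 0-255 ints) by an integer factor."""
    out = []
    for row in pixels:
        # Repeat every pixel horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def binarize(pixels, threshold=128):
    """Map every pixel to pure black (0) or pure white (255)."""
    return [[255 if value >= threshold else 0 for value in row] for row in pixels]


def enhance_for_ocr(pixels, factor=3, threshold=128):
    """Hypothetical helper: upscale first, then binarize."""
    return binarize(upscale_nearest(pixels, factor), threshold)


# Tiny 2x2 "screenshot" used only to show the transformation.
sample = [
    [250, 90],
    [40, 200],
]
enhanced = enhance_for_ocr(sample, factor=2)
for row in enhanced:
    print(row)
```

The enlarged, high-contrast result is what would then be handed to the OCR engine. Whether this is enough to separate touching characters such as K and X depends on the font and would need to be tried on real screenshots.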

If you are feeling really bold you can always whip up a simple Perceptron or Neural Network based approach :-)
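As a rough idea of what the perceptron route could look like, the sketch below trains a single perceptron to tell two hand-drawn 5x5 glyphs apart, using K and X because those are the characters mentioned in the question. The bitmaps, function names and training settings are made up for illustration. A usable recognizer would need many samples per character cut from real screenshots, plus one classifier per character (or a multi-class network).

```python
def glyph_to_vector(rows):
    """Flatten a glyph drawn with '#' and '.' into a list of 1s and 0s."""
    return [1 if ch == "#" else 0 for row in rows for ch in row]


def predict(weights, bias, vector):
    """Return 1 if the weighted sum is positive, otherwise 0."""
    total = bias + sum(w * x for w, x in zip(weights, vector))
    return 1 if total > 0 else 0


def train_perceptron(samples, epochs=20, rate=1):
    """Perceptron rule: adjust weights only when a sample is misclassified."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        mistakes = 0
        for vector, label in samples:
            error = label - predict(weights, bias, vector)
            if error:
                mistakes += 1
                weights = [w + rate * error * x for w, x in zip(weights, vector)]
                bias += rate * error
        if mistakes == 0:
            break  # every training sample is classified correctly
    return weights, bias


# Illustrative 5x5 bitmaps; label 1 means "K", label 0 means "X".
GLYPH_K = ["#..#.", "#.#..", "##...", "#.#..", "#..#."]
GLYPH_X = ["#...#", ".#.#.", "..#..", ".#.#.", "#...#"]

samples = [(glyph_to_vector(GLYPH_K), 1), (glyph_to_vector(GLYPH_X), 0)]
weights, bias = train_perceptron(samples)

print(predict(weights, bias, glyph_to_vector(GLYPH_K)))
print(predict(weights, bias, glyph_to_vector(GLYPH_X)))
```

With only two training samples this merely shows the mechanics. The appeal of the approach for an automated test program is that the fonts on screen are fixed and rendered identically each run, so a small training set per application may go further than it would for scanned documents.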

Matt Mitchell