
I want to detect the text region inside an image so that I can send only that region to Tesseract for OCR. Currently I send the whole image to Tesseract, and the results are not accurate. If I send only the region that contains text, the results are good. I'm running Tesseract on Android.

Can anyone tell me how to do this?

Thanks in advance!

Ajinkya S
  • Well, it is not easy. You might have to use an image processing library like `opencv` to achieve it. How about giving the user an option to `crop` the text area in the image themselves? – Abhishek V Nov 28 '13 at 05:27
  • I've actually had some good luck detecting text in images by using contrast differences, dilating the image, counting disconnected blobs in an area, and then segmenting. I wrote a blog post about it [here](http://tommd.github.io/posts/LprWithCV.html). – Thomas M. DuBuisson Nov 28 '13 at 06:29
  • Thanks @AbhishekV for your answer, but I want to crop it programmatically so that the user has less to do. The problem with OpenCV is that the device needs OpenCV Manager already installed, or the user has to install it before using an OpenCV app, because it provides the libraries to the app. Is it possible to bundle OpenCV Manager into our app so that the user doesn't need to install it separately? – Ajinkya S Nov 28 '13 at 06:29
  • @AjinkyaS I haven't worked with `openCV for android` yet. It would be better to ask a new question about `opencv manager` so that you can get a solution from experts in that field. – Abhishek V Nov 28 '13 at 06:55

1 Answer


If the Android version mirrors the original API, you can use the Tesseract API method `SetRectangle` to restrict recognition to a rectangle of the image, or `TesseractRect` to recognize a rectangle of an image directly.

http://code.google.com/p/tesseract-ocr/wiki/APIExample

nguyenq
  • Thanks @nguyenq, I tried it, but I have to specify attributes like left, top, width, and height to get the rectangle. I want to detect the text on a vehicle number plate. I have thresholded the image, which highlights the number plate. Using the Tesseract methods you suggested, how can I detect that rectangle? Have a look at the sample image and its thresholded version. Original image: http://i.stack.imgur.com/iLnaw.png Threshold image: http://i.stack.imgur.com/FeWyf.png – Ajinkya S Nov 29 '13 at 05:37
  • Sorry, I misunderstood your question. Text region detection is outside Tesseract's domain. Try opencv as suggested in http://stackoverflow.com/questions/11464397/image-preprocessing-for-text-recognition or [Scene Text Detection](http://docs.opencv.org/trunk/modules/objdetect/doc/erfilter.html?highlight=text). – nguyenq Nov 29 '13 at 17:45
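
For readers who cannot pull in OpenCV, here is a minimal plain-Java sketch of one way to get a rectangle out of an already thresholded image: flood-fill the white pixels and keep the bounding box of the largest connected blob. It is not Tesseract's or OpenCV's method, only an illustration. The class name `TextRegionFinder` and the method `largestRegionBox` are hypothetical, and the sketch assumes the plate is the largest white region after thresholding, which will not hold for every photo. The result is in the left, top, width, height form mentioned in the comments above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Hypothetical helper: bounding box of the largest white blob in a binary image. */
class TextRegionFinder {

    /**
     * @param binary binary[y][x] is non-zero for white (foreground) pixels
     * @return {left, top, width, height} of the largest 4-connected white
     *         region, or null if the image contains no white pixels
     */
    static int[] largestRegionBox(int[][] binary) {
        int h = binary.length;
        int w = (h == 0) ? 0 : binary[0].length;
        boolean[][] seen = new boolean[h][w];
        int[] dx = {1, -1, 0, 0};
        int[] dy = {0, 0, 1, -1};
        int[] best = null;
        int bestArea = 0;

        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                if (binary[y][x] == 0 || seen[y][x]) {
                    continue;
                }
                // Flood-fill one blob, tracking its extent and pixel count.
                int minX = x, maxX = x, minY = y, maxY = y, area = 0;
                Deque<int[]> stack = new ArrayDeque<>();
                stack.push(new int[] {x, y});
                seen[y][x] = true;
                while (!stack.isEmpty()) {
                    int[] p = stack.pop();
                    area++;
                    minX = Math.min(minX, p[0]);
                    maxX = Math.max(maxX, p[0]);
                    minY = Math.min(minY, p[1]);
                    maxY = Math.max(maxY, p[1]);
                    for (int d = 0; d < 4; d++) {
                        int nx = p[0] + dx[d];
                        int ny = p[1] + dy[d];
                        if (nx < 0 || ny < 0 || nx >= w || ny >= h) {
                            continue;
                        }
                        if (binary[ny][nx] == 0 || seen[ny][nx]) {
                            continue;
                        }
                        seen[ny][nx] = true;
                        stack.push(new int[] {nx, ny});
                    }
                }
                // Assumption: the largest blob is the text region of interest.
                if (area > bestArea) {
                    bestArea = area;
                    best = new int[] {minX, minY, maxX - minX + 1, maxY - minY + 1};
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[][] img = {
            {0, 0, 0, 0, 0, 0},
            {0, 1, 1, 1, 0, 0},
            {0, 1, 1, 1, 0, 1},
            {0, 0, 0, 0, 0, 0},
        };
        System.out.println(Arrays.toString(largestRegionBox(img))); // prints [1, 1, 3, 2]
    }
}
```

On a real photo you would likely dilate the thresholded image first so that nearby characters merge into one blob, and filter candidates by aspect ratio rather than area alone.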