12

This is an interesting topic. Basically, I have an image that contains some text. How do I extract the text from the image?

I have already tried many things, but everything I do is very tedious and usually does not work. I am simply wondering if there is a fairly easy way to do this.

I came across this library: http://sourceforge.net/projects/javaocr/. I have spent hours on it, but I cannot get it to take an Image and return the text in that image as a String.

Thank you all in advance!

Dylan Wheeler
  • You could also find this helpful: http://stackoverflow.com/questions/9480831/java-ocr-api-open-source-on-eclipse/9481603#9481603 – Nikolay May 03 '12 at 04:42

4 Answers

7

You need to look into Java OCR implementations. Take a look at this question: Java OCR

Community
Josh Diehl
4

Tess4J, a JNA wrapper around Tesseract engine, supports APIs that take BufferedImage, File, or image data as input, and return String as output.
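Since the question starts from a plain Image and the APIs mentioned above take a BufferedImage, here is a minimal sketch of that bridging step using only the JDK. The class name OcrInput and the method toBufferedImage are hypothetical helpers made up for illustration; they are not part of Tess4J or javaocr. The OCR call itself appears only as a comment, because it assumes Tess4J is on the classpath and that the Tesseract language data is installed and configured.

```java
import java.awt.Graphics2D;
import java.awt.Image;
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not part of Tess4J or javaocr. */
final class OcrInput {

    private OcrInput() {}

    /** Copies any java.awt.Image into an RGB BufferedImage that an OCR API can read. */
    static BufferedImage toBufferedImage(Image image) {
        int width = image.getWidth(null);
        int height = image.getHeight(null);
        if (width <= 0 || height <= 0) {
            // getWidth/getHeight return -1 while an asynchronously loaded Image is not ready.
            throw new IllegalArgumentException("Image is not fully loaded yet");
        }
        BufferedImage copy = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = copy.createGraphics();
        try {
            g.drawImage(image, 0, 0, null); // paint the source pixels onto the copy
        } finally {
            g.dispose();
        }
        return copy;
    }

    public static void main(String[] args) {
        // Stand-in for the image that contains text; in practice it might come from ImageIO.read(...).
        Image source = new BufferedImage(120, 40, BufferedImage.TYPE_INT_ARGB);
        BufferedImage input = toBufferedImage(source);
        System.out.println(input.getWidth() + "x" + input.getHeight());

        // Assumption: with Tess4J on the classpath and Tesseract language data set up,
        // the recognition step would look roughly like this:
        //   String text = new net.sourceforge.tess4j.Tesseract().doOCR(input);
    }
}
```

If the image is already on disk, this conversion may be unnecessary, since the answer notes that a File can be passed in directly.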

nguyenq
  • I know I'm commenting after 3 years, but your answer should be the right answer. 'javaOCR' has many problems, but this API works very well. – SlimenTN Jun 18 '15 at 08:47
2

You need an OCR (optical character recognition) library, or you will have to write your own. Check out this SO question.

Community
Pablo Santa Cruz
0

Try this character recognition library: http://sourceforge.net/projects/javaocr/

Jonathan