
I want to search for a word in an image (a scanned copy), retrieve values from the image, and highlight the location of the match. Is there an API or library available for processing images this way? I am using Swing to display the images.

jh314
User123
  • The search term to use is OCR, or [Optical Character Recognition](http://en.wikipedia.org/wiki/Optical_character_recognition). – Jesper May 20 '15 at 12:44
  • First you will have to process the image with an OCR engine to convert it to PDF or DOC. After that you can search the text in it. – Rahul May 20 '15 at 12:44

2 Answers


You need something to convert the pixels into characters. That something is a program that provides OCR.

Keep in mind that any OCR program gives only its best approximation of what it thinks each character is. While the technology has improved a lot, unusual fonts, image noise, and various other confounding factors can still produce wrong output (where the recognized character is not what you would have read it as). There are also scenarios where the input cannot be mapped to any character. Write your software defensively to handle both cases, and treat OCR output as unvalidated input.
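As a rough illustration of what "defensive" could mean here, the sketch below treats each OCR token as untrusted: it rejects missing or blank tokens and accepts a match only within a small edit distance of the search term. The class name `OcrTextCheck`, its method names, and the `maxEdits` tolerance are all hypothetical choices for this example, not part of any OCR library.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper: tolerant comparison of an OCR token with a search term.
final class OcrTextCheck {

    /** Trims and lower-cases a token; empty if OCR produced nothing usable. */
    static Optional<String> clean(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        String s = raw.trim().toLowerCase(Locale.ROOT);
        return s.isEmpty() ? Optional.empty() : Optional.of(s);
    }

    /** Levenshtein distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** True if the token is usable and within maxEdits of the wanted term. */
    static boolean matches(String rawOcrToken, String wanted, int maxEdits) {
        Optional<String> token = clean(rawOcrToken);
        Optional<String> target = clean(wanted);
        if (!token.isPresent() || !target.isPresent()) {
            return false;
        }
        return editDistance(token.get(), target.get()) <= maxEdits;
    }
}
```

With a tolerance of one edit, a misread such as `Inv0ice` would still match `invoice`, while a null or blank token is simply reported as no match. How much tolerance is appropriate depends on your data; a tolerance that is too large will produce false hits on short words.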

Edwin Buck

Check out "tesseract". It isn't Java, but it is open source and available for most platforms, and you can call the command-line program from Java via Runtime.exec() or ProcessBuilder.

https://code.google.com/p/tesseract-ocr/

Given images in the correct format, its recognition rate is even better than that of many commercial OCR software products.
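A minimal sketch of calling the command-line program from Java might look like the following. It assumes a `tesseract` binary on the PATH that is recent enough to accept `stdout` as the output base and the `tsv` output config, where each word row (level 5) carries `left`, `top`, `width`, `height` in columns 7 to 10 and the text in column 12. Older builds may not support this, so check your installed version. The class name `TesseractWordFinder` and its methods are hypothetical.

```java
import java.awt.Rectangle;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper around the tesseract command-line program.
final class TesseractWordFinder {

    /** Builds the command line: tesseract <image> stdout tsv */
    static List<String> command(String imagePath) {
        return Arrays.asList("tesseract", imagePath, "stdout", "tsv");
    }

    /** Returns the word's bounding box if this TSV row is a matching word, else null. */
    static Rectangle boxIfMatch(String tsvLine, String term) {
        String[] f = tsvLine.split("\t", -1);
        if (f.length < 12 || !f[0].equals("5")) {
            return null; // header row, or not a word-level row
        }
        if (!f[11].trim().equalsIgnoreCase(term)) {
            return null;
        }
        try {
            return new Rectangle(Integer.parseInt(f[6]), Integer.parseInt(f[7]),
                                 Integer.parseInt(f[8]), Integer.parseInt(f[9]));
        } catch (NumberFormatException e) {
            return null; // malformed row: treat as no match rather than fail
        }
    }

    /** Runs tesseract on the image and collects the boxes of all matching words. */
    static List<Rectangle> find(String imagePath, String term)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command(imagePath));
        pb.redirectError(ProcessBuilder.Redirect.INHERIT); // keep stderr out of the TSV stream
        Process process = pb.start();
        List<Rectangle> hits = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                Rectangle box = boxIfMatch(line, term);
                if (box != null) {
                    hits.add(box);
                }
            }
        }
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("tesseract exited with code " + exit);
        }
        return hits;
    }
}
```

The rectangles are in image pixel coordinates, so in a Swing component that paints the image unscaled they could be outlined in `paintComponent` with `Graphics.drawRect`. If the image is scaled for display, the boxes need the same scaling.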

Mirko Klemm