2

Is there a way to convert a scanned image containing text into readable text by writing code for it? Is that possible?

AAA
  • Java? PHP? Which one? Both? ...I like [OCRopus](http://code.google.com/p/ocropus/) because it's made by teh goog! – Matt Ball Oct 11 '10 at 03:41
  • @Tony and @Matt, I want to write it myself and use it in my project. – AAA Oct 11 '10 at 03:44

1 Answer

5

OCRTools is what I use for .NET.

For Java, I've used Aspire in the past; it's very good, though a little scary. I've heard a lot about Tesseract, so you might as well check that out.

In case the answer is confusing: what you are looking for is the API/SDK of an Optical Character Recognition (OCR) package. As worded, your question points to building an OCR engine from scratch, which requires understanding image processing (mainly object recognition).
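To make the "use an existing OCR engine" route concrete, here is a minimal sketch in Python that shells out to the Tesseract command-line tool. It assumes Tesseract is installed and on your PATH. The function names `build_tesseract_command` and `ocr_image` are made up for this illustration and are not part of any library.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Return the argument list for one Tesseract CLI call.

    Passing "stdout" as the output base makes Tesseract print the
    recognized text to standard output instead of writing a .txt file.
    """
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_image(image_path, lang="eng"):
    """Run Tesseract on a single image file and return the recognized text.

    Assumption: the `tesseract` executable is installed and on PATH.
    """
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode output as str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Something like `print(ocr_image("scan.png"))` would then print whatever text Tesseract recognizes, where `scan.png` is a placeholder file name. Recognition quality depends heavily on the scan, so noisy or skewed images may need cleaning up first.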

lalli
  • Curious about what's a little scary about Aspire. Are you able to share? – Kevin Day Oct 11 '10 at 06:37
  • Actually, I had TIFF images from a batch scan. Just reading them was very confusing: I had to convert them using another library (ImageMagick or something) and then, by trial and error, convert small parts of the image and remove the noise and so on... But that was version 1.something; now it's version 4... – lalli Oct 12 '10 at 03:41