1

I need a library that performs character recognition (OCR) on Cyrillic text. My only idea so far is to map Cyrillic letters to Latin ones, but that gives poor quality. Could someone tell me whether such a library exists, or suggest any other solution to this problem?

Thanks in advance.

Alex Abdugafarov
Oleksandr

2 Answers

4

As far as I know, there are no open-source OCR SDKs written natively in Java. There are Java APIs that wrap calls to native libraries. For example, Tesseract (http://groups.google.com/group/tesseract-ocr/), one of the most popular open-source OCR engines, has Java wrappers such as tesjeract (http://code.google.com/p/tesjeract/) and Tess4J (http://tess4j.sf.net/). That could work for you, but it is rather hard to set up and will require you to develop image preprocessing and do font training yourself.
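
If the wrappers turn out to be too painful to set up, one alternative is to call the Tesseract command-line tool from Java directly. Below is a minimal sketch of that idea, not a definitive implementation. It assumes the `tesseract` binary is on the PATH, that the Russian language data (`rus`) is installed, and that your Tesseract version accepts `stdout` as the output base. `TesseractCli`, `buildCommand`, and `recognize` are hypothetical names made up for this example; they are not part of Tesseract or Tess4J.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper (not part of Tesseract or Tess4J): shells out to the tesseract CLI.
final class TesseractCli {

    // Builds the command line: tesseract <image> stdout -l <lang>
    // "stdout" asks tesseract to print the recognized text instead of writing a file.
    static List<String> buildCommand(String imagePath, String lang) {
        return List.of("tesseract", imagePath, "stdout", "-l", lang);
    }

    // Runs tesseract and returns the recognized text, decoded as UTF-8.
    // Assumption: the tesseract binary is on the PATH and the language data is installed.
    static String recognize(String imagePath, String lang)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(imagePath, lang))
                // Discard diagnostics on stderr so the process cannot block on a full pipe.
                .redirectError(ProcessBuilder.Redirect.DISCARD)
                .start();
        String text = new String(
                process.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return text;
    }
}
```

Calling `TesseractCli.recognize("scan.png", "rus")` would then return the recognized text for a Russian scan (`scan.png` is a placeholder file name). You still need decent image preprocessing for good accuracy, as mentioned above.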

Another option is a cloud service. It requires the end-user application to have an internet connection, but it is independent of your programming language and resource limitations. Have a look at ABBYY Cloud OCR SDK, a cloud-based OCR SDK recently launched by ABBYY. It is in beta, so for now it is completely free to use, and it comes with ready-to-go Java code samples.

Nikolay
  • 1
    To make the picture complete, I would also mention Asprise. It is the only native Java OCR, but it is not open source and does not support Cyrillic. In fact, I have never heard anything good about its quality (only this: http://stackoverflow.com/a/3731291/137353 ), and I haven't seen it mentioned in any OCR accuracy comparisons. – Tomato Jan 10 '12 at 16:17
2

Though it is not in Java, when it comes to OCR I'd suggest the open-source Ocropus system: http://code.google.com/p/ocropus/

Also, this thread discusses Java OCR solutions: "Java OCR implementation".

Also, if you just want an ad hoc solution, you could try Google Docs OCR: http://googlesystem.blogspot.com/2009/09/google-docs-ocr.html

Community
bpgergo