
I've built an optical character recognition (OCR) system for Sinhala (a language of Sri Lanka), and it works to some extent. What I need to do now is post-process the output using dictionary data.

What would be the best approach for changing misspelled words into correct words? Can anyone give suggestions?

I have the dictionary data files in Unicode, and my OCR output is also a Unicode file. I am doing this in C++. I have tried string matching algorithms, with no success so far. I want to start with the approach most relevant to this problem. Can anyone help me, please?

Thanks in advance.

Kevin Bedell
shadee
  • Maybe you should ask this on [Programmers Stack Exchange](http://programmers.stackexchange.com/) and come back here when you have a specific question about implementation...? – Eitan T Jun 27 '12 at 17:01
  • This [page by Peter Norvig](http://norvig.com/spell-correct.html) has a very simple implementation of a spell-checker and links to similar efforts in other languages. This [page](http://stackoverflow.com/questions/698196/detecting-misspelled-words) has more links. – VSOverFlow Jun 28 '12 at 07:35
  • Thank you very much. I have looked at Peter Norvig's page, but it seems complicated to me, so I am going to follow the Levenshtein distance method mentioned on the page you linked. – shadee Jun 29 '12 at 17:52
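
For anyone following the same route, here is a minimal C++ sketch of the Levenshtein distance approach mentioned in the last comment. It is only an illustration under a few assumptions: the OCR output and the dictionary have already been decoded into UTF-32 (`std::u32string`), the dictionary fits in memory as a `std::vector`, and a linear scan over it is fast enough. The helper name `closest_word` and the `max_distance` cutoff are hypothetical, not taken from any library.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein (edit) distance between two strings of Unicode code points.
// Assumption: both inputs were decoded to UTF-32 beforehand, so one element
// is one code point (not one byte of UTF-8).
std::size_t levenshtein(const std::u32string& a, const std::u32string& b) {
    // Two rows of the dynamic-programming table are enough.
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;

    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical helper: returns the dictionary entry with the smallest edit
// distance to `word`, or `word` itself when no entry is within `max_distance`.
// Ties are resolved in favour of the entry that appears first.
std::u32string closest_word(const std::u32string& word,
                            const std::vector<std::u32string>& dictionary,
                            std::size_t max_distance) {
    std::u32string best = word;
    std::size_t best_distance = max_distance + 1;
    for (const auto& candidate : dictionary) {
        const std::size_t d = levenshtein(word, candidate);
        if (d < best_distance) {
            best_distance = d;
            best = candidate;
        }
    }
    return best;
}
```

Calling `closest_word(ocr_word, dictionary, 2)` for each OCR token would replace it with the nearest dictionary entry within two edits and leave it unchanged otherwise. Note that this compares code points, so in Sinhala a base letter and its vowel sign count as separate characters; whether that is the right unit for OCR errors is something to check against real output.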

0 Answers