I've built an optical character recognition (OCR) system for Sinhala (a language spoken in Sri Lanka), and it works to some extent. What I need to do now is post-process the OCR output using dictionary data.
What would be the best approach for replacing misspelled words with the correct dictionary words? Can anyone give suggestions?
My dictionary data files are in Unicode, and my OCR output is also a Unicode file. I am doing this in C++. I have tried string-matching algorithms with no success so far, and I want to start with the approach most relevant to this problem. Can anyone help me, please?
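
To make the setup concrete, here is a minimal sketch of the kind of baseline lookup I mean: Levenshtein edit distance computed over Unicode code points, then picking the closest dictionary entry. It assumes the text has already been decoded into `std::u32string` (so one element is one code point, not one UTF-8 byte). The names `editDistance`, `correctWord`, and the `maxDistance` cutoff are hypothetical and only for illustration; this is not necessarily the right approach, just a starting point for discussion.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein distance between two strings of Unicode code points.
// Assumption: input is already decoded to UTF-32, so comparisons are
// per code point rather than per UTF-8 byte.
std::size_t editDistance(const std::u32string& a, const std::u32string& b) {
    std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1,      // deletion
                               cur[j - 1] + 1,   // insertion
                               subst});          // substitution or match
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// Hypothetical helper: returns the closest dictionary word whose distance
// is at most maxDistance, or the input word unchanged if none qualifies.
std::u32string correctWord(const std::u32string& word,
                           const std::vector<std::u32string>& dictionary,
                           std::size_t maxDistance) {
    std::u32string best = word;
    std::size_t bestDist = maxDistance + 1;
    for (const auto& entry : dictionary) {
        std::size_t d = editDistance(word, entry);
        if (d < bestDist) {
            bestDist = d;
            best = entry;
        }
    }
    return best;
}
```

A call would look like `correctWord(ocrWord, dictionary, 2)`. This scans the whole dictionary for every word, so it is slow for large word lists, and it treats every code point substitution as equally likely, which is probably not true for OCR confusions between similar-looking Sinhala letters and vowel signs.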
Thanks in advance.