What algorithms can be used as a spell checker for Indic scripts?

Asked Jun 27 '12 at 16:31

Active Jun 17 '14 at 10:00

I've built an optical character recognition (OCR) system for Sinhala (a language spoken in Sri Lanka), and it works to some extent. What I need to do now is post-process the OCR output using dictionary data.

What would be the best approach for changing misspelled words into correct words? Can anyone give suggestions?

My dictionary data files are in Unicode, and my OCR output is also a Unicode file. I am doing this in C++. I have tried string matching algorithms with no success so far, and I want to start with the approach most relevant to this problem. Can anyone help me, please?

Thanks in advance.

edited Jun 28 '12 at 03:33

Kevin Bedell

asked Jun 27 '12 at 16:31

shadee

Maybe you should ask this on [Programmers Stack Exchange](http://programmers.stackexchange.com/) and come back here when you have a specific question about implementation...? – Eitan T Jun 27 '12 at 17:01
This [page by Peter Norvig](http://norvig.com/spell-correct.html) has a very simple implementation of a spell checker and links to similar efforts in other languages. This [page](http://stackoverflow.com/questions/698196/detecting-misspelled-words) has more links. – VSOverFlow Jun 28 '12 at 07:35
Thank you very much. I have looked at Peter Norvig's page, but it seems complicated to me, so I am going to follow the Levenshtein distance method mentioned in the page you linked. – shadee Jun 29 '12 at 17:52
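
For reference, the Levenshtein distance approach mentioned in the last comment could be sketched in C++ roughly as follows. This is only a minimal sketch under a few assumptions, not the asker's actual code: it assumes the OCR output and the dictionary have already been decoded from UTF-8 into `std::u32string` (one element per Unicode code point), and `closestWord` is a hypothetical helper name introduced here for illustration. Comparing code points instead of raw bytes matters for Sinhala, because each character occupies several bytes in UTF-8 and a byte-wise comparison would inflate the distance.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein (edit) distance between two strings of Unicode code points.
// Only two rows of the dynamic-programming table are kept in memory.
std::size_t levenshtein(const std::u32string& a, const std::u32string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) {
        prev[j] = j;  // turning an empty prefix of a into b[0..j) takes j insertions
    }
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;  // turning a[0..i) into an empty string takes i deletions
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const std::size_t cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j] + 1,          // deletion
                                curr[j - 1] + 1,      // insertion
                                prev[j - 1] + cost}); // substitution or match
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical helper (not from the original post): returns the dictionary
// entry with the smallest edit distance to `word`, or `word` unchanged if no
// entry is within `maxDistance`. On ties, the earliest entry wins.
std::u32string closestWord(const std::u32string& word,
                           const std::vector<std::u32string>& dictionary,
                           std::size_t maxDistance) {
    std::u32string best = word;
    std::size_t bestDistance = maxDistance + 1;
    for (const auto& candidate : dictionary) {
        const std::size_t d = levenshtein(word, candidate);
        if (d < bestDistance) {
            bestDistance = d;
            best = candidate;
        }
    }
    return best;
}
```

A call such as `closestWord(ocrWord, dictionary, 2)` would then replace an OCR word with its nearest dictionary entry when one lies within two edits. The linear scan is the simplest option and may be slow for a large dictionary. Note also that a single visible Sinhala character can consist of several code points (a consonant plus a vowel sign, for example), so a code-point distance is only an approximation of how many visible characters differ.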

0 Answers