2

I have a list of words in a text file. Given an input word, I want to get the words from that list that are similar to it. The program should work like a spell-checker API, except that the dictionary is limited to my list of words.

I can write the code myself if I get some pointers to a spell-checking algorithm or to regular expressions.

vallentin
Balkrishna Rawool
  • 1
    You may find this question has some useful tips to get you started http://stackoverflow.com/questions/346757/how-do-spell-checkers-work – user17753 Feb 29 '12 at 20:41

3 Answers

2

Take a look at Apache Commons Lang's StringUtils.getLevenshteinDistance. The Levenshtein algorithm gives the "edit distance" between two words: the minimum number of single-character insertions, deletions, and substitutions needed to turn one word into the other, so a smaller distance means more similar words. Their implementation is quite fast. I tested it against another implementation I found online and it was about 1/3 faster, if I remember correctly.
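If you would rather not pull in the library, here is a minimal sketch of the same idea applied to a word list. It is a plain textbook dynamic-programming version, not the Commons Lang source, and the class name `SimilarWords`, the method names, and the `maxDistance` cutoff are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper class; names are illustrative, not from any library.
final class SimilarWords {

    // Textbook Levenshtein distance, keeping only two rows of the DP table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // empty prefix of a -> first j chars of b: j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // first i chars of a -> empty prefix of b: i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Returns the words within maxDistance of input, closest first.
    // maxDistance is an assumed tuning knob; 1 or 2 is a common starting point.
    static List<String> similar(String input, List<String> words, int maxDistance) {
        List<String> matches = new ArrayList<>();
        for (String word : words) {
            if (levenshtein(input, word) <= maxDistance) {
                matches.add(word);
            }
        }
        // List.sort is stable, so ties keep their order from the word list.
        matches.sort(Comparator.comparingInt((String w) -> levenshtein(input, w)));
        return matches;
    }
}
```

You would load the text file into a `List<String>` once and call `SimilarWords.similar(input, words, 2)` per lookup. This scans the whole list on every call, which is probably fine for a modest word list but may need an index for a very large one.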

Paul
  • Thanks, this was useful. With some googling I could find the source for StringUtils.getLevenshteinDistance. And it did the trick. – Balkrishna Rawool Mar 01 '12 at 13:14
  • Glad it helped. Please accept the answer you found most helpful by clicking the checkmark next to it. – Paul Mar 01 '12 at 14:30
2

I highly recommend taking a look at Peter Norvig's article How to Write a Spelling Corrector. It's worth reading, and it isn't overly complex. If you scroll down the page, you can see links to Java implementations, which you can then customize to your own needs.
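The core trick in that article is to generate every string one edit away from the input and keep only the ones that appear in your dictionary. Below is a rough Java sketch of just that step, assuming lowercase a-z words. The class and method names are hypothetical, and it leaves out the word-frequency ranking the article uses, since a plain word list has no frequencies.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; assumes words use only lowercase a-z.
final class EditCandidates {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz";

    // All strings reachable from word by one delete, transpose, replace, or insert.
    // Replacing a letter with itself also yields the original word.
    static Set<String> edits1(String word) {
        Set<String> edits = new HashSet<>();
        for (int i = 0; i <= word.length(); i++) {
            String left = word.substring(0, i);
            String right = word.substring(i);
            if (!right.isEmpty()) {
                edits.add(left + right.substring(1)); // delete one character
            }
            if (right.length() > 1) {
                // swap two adjacent characters
                edits.add(left + right.charAt(1) + right.charAt(0) + right.substring(2));
            }
            for (char c : ALPHABET.toCharArray()) {
                if (!right.isEmpty()) {
                    edits.add(left + c + right.substring(1)); // replace one character
                }
                edits.add(left + c + right); // insert one character
            }
        }
        return edits;
    }

    // Keeps only the candidates that are real words, in alphabetical order.
    static Set<String> suggestions(String word, Set<String> dictionary) {
        Set<String> result = new TreeSet<>(edits1(word));
        result.retainAll(dictionary);
        return result;
    }
}
```

Each lookup is a set intersection rather than a scan of the whole dictionary, but the number of candidates grows with the length of the input word, and going to two edits multiplies it again.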

Murat Derya Özen
  • I looked at a couple of the Java implementations mentioned at the bottom of the page. I tried one of them and it was a bit slow with long strings. Thanks for the pointer, though. – Balkrishna Rawool Mar 01 '12 at 13:16
-1

http://en.wikipedia.org/wiki/Levenshtein_distance

user1096188