I want to extract similar words from a corpus. Similarity is based on the strings themselves: when the strings of two words are highly similar, the two words should be extracted as similar words. For example, suppose the corpus contains: Aras, bahro, arasis, adkpo, bah, aras sd, kio.
The similar words would then be:
1- aras, arasis, aras sd
2- bahro, bah
How can I solve this problem? Thanks.
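
To make the expected behaviour concrete, here is a minimal sketch of one possible approach, not necessarily the best one. It assumes Python, case-insensitive comparison, the standard library's difflib.SequenceMatcher ratio as the similarity measure, and an arbitrary threshold of 0.7. The names similarity and group_similar are made up for illustration. Words are linked whenever their ratio reaches the threshold, and linked words are merged into one group (single-link grouping with a small union-find).

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Return a string similarity score in [0, 1] (difflib's ratio)."""
    return SequenceMatcher(None, a, b).ratio()


def group_similar(words, threshold=0.7):
    """Group words whose pairwise similarity reaches the threshold.

    The threshold of 0.7 is an assumption chosen for this example,
    not a recommended value. Words with no similar partner are dropped.
    """
    words = [w.lower() for w in words]  # assumption: ignore case
    parent = list(range(len(words)))    # union-find forest over word indices

    def find(i):
        # Follow parent links to the root, shortening the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair that is similar enough (O(n^2) comparisons).
    for i, j in combinations(range(len(words)), 2):
        if similarity(words[i], words[j]) >= threshold:
            parent[find(j)] = find(i)

    # Collect the words under their root, keeping the input order.
    groups = {}
    for i, w in enumerate(words):
        groups.setdefault(find(i), []).append(w)
    return [g for g in groups.values() if len(g) > 1]


corpus = ["Aras", "bahro", "arasis", "adkpo", "bah", "aras sd", "kio"]
print(group_similar(corpus))
# [['aras', 'arasis', 'aras sd'], ['bahro', 'bah']]
```

Because every pair of words is compared, this only scales to fairly small vocabularies. A different measure (for example an edit distance) or a different threshold would produce different groups.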