I have a list of shorthand texts, all in English. Is there a machine learning algorithm that can be used to expand these abbreviations? For example, if the shorthand is 'txt', it could suggest 'text', 'context', 'textual', etc., each with a different penalty value.
In addition, when I choose the right word, I want the system to learn from it, so that the next time I input the same shorthand, my choice gets a higher rating.
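To make the kind of output I am after concrete, here is a minimal sketch. It assumes (my assumption, not a requirement) that a shorthand is formed by dropping letters from the full word, so a candidate matches when the shorthand is a subsequence of it, and the penalty is simply the number of dropped letters. The names `insertion_penalty`, `suggest`, and `VOCABULARY` are hypothetical, and the word list is a toy one.

```python
def insertion_penalty(shorthand, word):
    """Return how many letters must be inserted into `shorthand` to get `word`,
    or None if `shorthand` is not a subsequence of `word`."""
    it = iter(word)
    # `ch in it` advances the iterator until it finds ch, so this checks
    # that the shorthand letters occur in `word` in the same order.
    if all(ch in it for ch in shorthand):
        return len(word) - len(shorthand)
    return None


def suggest(shorthand, vocabulary):
    """Rank vocabulary words that contain `shorthand` as a subsequence,
    lowest penalty first (ties broken alphabetically)."""
    scored = []
    for word in vocabulary:
        penalty = insertion_penalty(shorthand, word)
        if penalty is not None:
            scored.append((penalty, word))
    return sorted(scored)


# Hypothetical toy vocabulary, for illustration only.
VOCABULARY = ["text", "context", "textual", "taxi", "next"]
print(suggest("txt", VOCABULARY))
# [(1, 'text'), (4, 'context'), (4, 'textual')]
```

A real solution would presumably weight the penalty by word frequency or context rather than by length alone; this only shows the shape of the result.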
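The learning part could be as simple as counting past choices. The sketch below is only an illustration of the behaviour I want, not a machine learning model: the class `ChoiceMemory` and its methods are hypothetical names, and it assumes the candidate list already comes from some other suggestion step.

```python
from collections import Counter, defaultdict


class ChoiceMemory:
    """Remember which expansion was picked for each shorthand."""

    def __init__(self):
        # shorthand -> Counter of chosen expansions
        self.counts = defaultdict(Counter)

    def record(self, shorthand, chosen_word):
        """Store one user choice."""
        self.counts[shorthand][chosen_word] += 1

    def rerank(self, shorthand, candidates):
        """Order candidates by how often they were chosen before, most often first.
        Ties keep their original order because sorted() is stable."""
        picked = self.counts.get(shorthand, Counter())
        return sorted(candidates, key=lambda w: -picked[w])


memory = ChoiceMemory()
memory.record("txt", "context")
print(memory.rerank("txt", ["text", "context", "textual"]))
# ['context', 'text', 'textual']
```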
Edit
Specifically, I have tried using the language model described here, but it only handles candidates up to two edits away from the input. The edit function, edits1, is below:
    def edits1(word):
        "All edits that are one edit away from `word`."
        letters = 'abcdefghijklmnopqrstuvwxyz'
        splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
        deletes = [L + R[1:] for L, R in splits if R]
        transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R)>1]
        replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
        inserts = [L + c + R for L, R in splits for c in letters]
        return set(deletes + transposes + replaces + inserts)
It basically takes a word and generates every string that is one edit away from it: deleting a letter, transposing two adjacent letters, replacing a letter, or inserting a letter (using the letters of the alphabet).
How do I extend this to more than two edits?
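The naive extension I can think of is to apply edits1 repeatedly, as sketched below. The helper names `edits_n` and `known_edits` are my own, and the vocabulary is a hypothetical toy list. The candidate set grows very quickly with each extra level (a word of length n already has 54n + 25 one-edit strings before duplicates are removed), so I doubt this scales to something like 'txt' -> 'context', which needs four insertions.

```python
def edits1(word):
    "All edits that are one edit away from `word`."
    letters = 'abcdefghijklmnopqrstuvwxyz'
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def edits_n(word, n):
    """All strings reachable from `word` in at most `n` edits."""
    seen = {word}
    frontier = {word}
    for _ in range(n):
        # Expand only the strings found in the previous round,
        # and drop anything that was already generated.
        frontier = {e for w in frontier for e in edits1(w)} - seen
        seen |= frontier
    return seen


def known_edits(word, n, vocabulary):
    """Keep only the generated strings that are real words."""
    return edits_n(word, n) & set(vocabulary)


# Hypothetical toy vocabulary, for illustration only.
print(sorted(known_edits("txt", 2, ["text", "texts", "context"])))
# ['text', 'texts']
```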