I'm trying to sort a list by edit distance using the levenshtein distance.
def suggest(dic, word, distance, maxSugestions=5):
list = []
for i in range(1, 200):
for word1 in sorted(dic):
if distance(word1, word) == i:
list.append(word1)
if len(list) == maxSugestions:
return list
This is my current function that receives a list of string (this list has about 43 thousand strings), a word that I want to compare, a function that returns the edit distance between two strings and an integer of maxSugestions that the list should have.
This is the current distance funtion:
def levDistance(str1, str2):
matrix = [[0 for x in range(len(str2) + 1)] for x in range(len(str1) + 1)]
for i in range(len(str1) + 1):
for j in range(len(str2) + 1):
if i == 0:
matrix[i][j] = j
elif j == 0:
matrix[i][j] = i
elif str1[i-1] == str2[j-1]:
matrix[i][j] = matrix[i-1][j-1]
else:
matrix[i][j] = 1 + min(matrix[i][j-1], matrix[i-1][j], matrix[i-1][j-1])
return matrix[len(str1)][len(str2)]
The current suggest() function works, however I need it to be optimized since it is taking too long and I don't know how I should do that. Any help is grateful. Thank you