Hello! I'm trying to see which words are the most similar by measuring common letters.
Example: Which word is the most similar to 'frogs'? It could be 'forks', as it shares 4 letters with 'frogs'. At the moment, I'm simply looping through a .txt file of 5-letter words (Wordle) and appending each one to an array with a 'score' attached. (score = number of common letters)
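array = []

def loadArray(word):
    # Score every word in the file by how many letters of `word` it contains
    with open('file.txt', 'r') as file:
        for line in file:
            elem = line.strip()  # drop the trailing newline
            score = 0
            for char in word:
                if char in elem:
                    score += 1
            array.append(f'{elem}_{score}')

loadArray('(word that you want)')
# Array might be: ['Ducks_4', 'Crabs_2', 'Chest_5']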
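To make the scoring rule concrete, here is a minimal standalone sketch of the same count with the file reading left out. The helper name `common_letters` is made up for this illustration and isn't part of my project:

```python
def common_letters(target, candidate):
    """Count the letters of `target` that also occur in `candidate`."""
    score = 0
    for char in target:
        if char in candidate:
            score += 1
    return score


print(common_letters('frogs', 'forks'))  # 4: f, r, o and s are shared
print(common_letters('frogs', 'afros'))  # 4
```

Note that a letter repeated in the target word is counted once per occurrence, which is how my loop above behaves too.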
After the array is loaded, I sort it using this code:
def lastChar(entry):
    # The score is the final character, e.g. '4' in 'Ducks_4'
    return entry[-1]

array.sort(key=lastChar, reverse=True)
Then I just take the first element from that array, it being the one with the highest score. When I run this code with 'frogs', I get 'afros' with a score of 4. (I actually get 'frogs', with a score of 5, but that doesn't count!) This is absolutely not user friendly, though, as it loops through 10,000+ words each time.
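One variation I've been wondering about, sketched here under the assumption that the word list is already in memory: keep the score as a number and track the best word in a single pass, instead of building `word_score` strings and sorting them. The names `most_similar` and `sample_words` are hypothetical, and I haven't measured whether this is actually faster in my real project:

```python
def common_letters(target, candidate):
    # Same scoring rule as above: letters of `target` found in `candidate`
    return sum(1 for char in target if char in candidate)


def most_similar(target, words):
    """Return (word, score) for the best match, skipping `target` itself."""
    best_word, best_score = None, -1
    for word in words:
        if word == target:
            continue  # the word itself doesn't count
        score = common_letters(target, word)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


# Tiny stand-in for the real word file
sample_words = ['ducks', 'crabs', 'forks', 'frogs']
print(most_similar('frogs', sample_words))  # ('forks', 4)
```

This still visits every word once, so it only removes the sort and the string formatting, not the loop over the whole list.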
Note: This is a simplified form (it only takes a blink to run); what I'm doing in my project takes ~5-10 seconds. (Python is super slow)
Is there a way to make this more efficient? Or is there a different way altogether that's better?
Thanks in advance - Wastive