How would I make looping through elements more performance friendly? (Python)

Question

Hello! I'm trying to see which words are the most similar by measuring common letters.

Example: Which word is the most similar to 'frogs'? It could be 'forks', as it shares 4 letters with 'frogs'. At the moment, I'm simply looping through a .txt file of 5-letter words (from Wordle) and appending each one to an array with a 'score' attached. (score = number of common letters)

array = []
def loadArray(word):
  # 'word' instead of 'input', so the built-in input() is not shadowed
  with open('file.txt', 'r') as f:
    for line in f:
      elem = line.strip()  # drop the trailing newline
      score = 0
      for char in word:
        if char in elem:
          score += 1
      array.append(f'{elem}_{score}')
loadArray('(word that you want)')
# Array might be: ['Ducks_4', 'Crabs_2', 'Chest_5']

After this is done, I sort the array using this code:

def lastChar(entry):
  # The score is the single digit at the end of each 'word_score' string
  return entry[-1]
array.sort(key=lastChar, reverse=True)

Then I just take the first element of that array, which is the one with the highest score. When I run this code with 'frogs', I get 'afros' with a score of 4. (I actually get 'frogs', with a score of 5, but that doesn't count!) Again, this is absolutely not user friendly, as it loops through 10,000+ words each time.
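For reference, here is a minimal sketch of the approach described above, written so that only the best match is kept instead of sorting the whole list. The scoring rule is meant to match the loop above (count each character of the query that appears in the candidate). The names `load_words`, `common_letter_score` and `most_similar` are made up for illustration, and the short word list stands in for the real file.

```python
def load_words(path):
    """Read one word per line, dropping newlines and blank lines."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]

def common_letter_score(query, candidate_letters):
    # Same rule as the original loop: count each character of the query
    # that appears anywhere in the candidate.
    return sum(ch in candidate_letters for ch in query)

def most_similar(query, words):
    # Skip the query word itself, then keep only the highest-scoring word.
    candidates = [w for w in words if w != query]
    return max(candidates, key=lambda w: common_letter_score(query, set(w)))

# Stand-in for load_words('file.txt'); load the file once and reuse the list.
words = ["forks", "ducks", "crabs", "frogs", "chest"]
best = most_similar("frogs", words)
best_score = common_letter_score("frogs", set(best))
print(best, best_score)
```

`max` does a single pass over the words, so the full sort and the string-encoded score are not needed when only the top result matters.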

Note: This is a simplified form (it only takes a blink to run); what I'm doing in my project takes ~5-10 seconds. (Python is super slow)

Is there a way to make this more efficient? Or is there a different approach altogether that's better?

Thanks in advance - Wastive

Instead of rolling your own, consider using `Levenshtein.ratio` or `difflib.SequenceMatcher`. https://stackoverflow.com/questions/6690739/high-performance-fuzzy-string-comparison-in-python-use-levenshtein-or-difflib - GordonAitchJay, Mar 16 '22 at 06:47
`array.append((elem, score))`. Or use a dictionary: `d[elem] = score` - Barmar, Mar 16 '22 at 07:34
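A rough sketch of what the two comments above might look like in practice, using only the standard library (`Levenshtein` is a third-party package and is left out here). The word list and the variable names are placeholders. Note that `SequenceMatcher.ratio()` takes letter order into account, so it will not always rank words the same way as a plain common-letter count.

```python
from difflib import SequenceMatcher

words = ["forks", "ducks", "crabs", "chest"]  # placeholder word list
query = "frogs"

# Barmar's suggestion: store (word, score) tuples instead of 'word_score'
# strings, so the score can be sorted on directly as a number.
scored = [(w, sum(ch in w for ch in query)) for w in words]
scored.sort(key=lambda pair: pair[1], reverse=True)
print(scored[0])

# GordonAitchJay's suggestion: let difflib compute a similarity ratio
# (a float between 0 and 1) and pick the word with the highest one.
ratios = {w: SequenceMatcher(None, query, w).ratio() for w in words}
closest = max(ratios, key=ratios.get)
print(closest)
```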
