I'm trying to create a code to find the optimal solution for scrabble. And i need help with generating all of the word combinations with a set of letters. (To clarify: i do not need help validating if the words are in a dictionary or not. I just want to generate all of the different combinations). Anyway, in scrabble there is a blank square/wild card which can be any of the valid letters, in my python code it is represented by '_'. Since the wild cards value is zero its important to keep track of what words originates from a wild card, therefore all wildcard words in my code is uppercase. It is also imported to keep in mind that the function does not return a list with any words that have a uppercase in it that can be also written in lowercase with the same set of letters, since i only want to keep highest scoring word combinations. Also keep in mind that it is possible to get multiple wild cards in scrabble (Though unlikely).
This is my fastest version of the code so far. But still in a worst case scenario with 5 letters and 2 wild cards it takes over a minute to run. Most of this time is spend removing the uppercase duplicates, so i have focused my optimization effort on that part. So far I have added numba, binary search (to speed up the processes of checking if the lowercase word already exists) and list comprehension but i still only got none to minor improvement. I think the only possible way to speed up my code is to use a another algorithm.
import numpy as np
import itertools, numba, time
def find_all_word_combinations(letters, wild_card='_', valid_letters='abcdefghijklmnopqrstuvwxyzæøå'):
valid_letters = valid_letters.upper()
return remove_uppercase_duplicates([
''.join(permutation) for replaced_words in replace_wild_card_with_letters(letters, wild_card, valid_letters)
for length in range(len(replaced_words))
for combination in itertools.combinations(replaced_words, length + 1)
for permutation in itertools.permutations(combination)])
def replace_wild_card_with_letters(word, wild_card='_', valid_letters='abcdefghijklmnopqrstuvwxyzæøå'):
if wild_card not in word:
return [word]
valid_letters = valid_letters.upper()
where = [i for i, x in enumerate(word) if wild_card == x]
blanks_filled = [''.join(perm) for perm in itertools.combinations_with_replacement(valid_letters, word.count(wild_card))]
result = []
for letters in blanks_filled:
new_word = list(word)
for index, letter in enumerate(letters):
new_word[where[index]] = letter
result.append(''.join(new_word))
return result
@numba.njit(fastmath=True)
def binary_search_check(arr:list, item:str)->bool:
left = 0
right = len(arr) - 1
while left <= right:
mid = (left + right) // 2
if arr[mid] == item:
return True
elif arr[mid] > item:
right = mid - 1
else:
left = mid + 1
return False
@numba.njit(fastmath=True)
def remove_uppercase_duplicates(words):
words = np.sort(words)
return [word for word in words if not binary_search_check(words, word.lower())]
start = time.perf_counter()
print(len(find_all_word_combinations('abcde__')))
end = time.perf_counter()
print(f'Finished in {end-start} seconds.')
Edit: The code i first provided is wrong, the remove_uppercase_duplicates function does not function properly. The function is supposed to delete all duplicates and remove all uppercase words that have a lower case duplicate. Since someone has already posted a better answer i'm not going to fix the original code (i am also to lazy ¯\(ツ)/¯). Feel free to fix my error and post it. I also fixed some grammar.