0

I'm trying to write an algorithm that by given to it a bunch of letters is giving you all the words that can be constructed of the letters, for instance, given 'car' should return a list contains [arc,car,a, etc...] and out of it returns the best scrabble word. The problem is in finding that list which contains all the words. I've got a giant txt file dictionary, line delimited and I've tried this so far:

def find_optimal(bunch_of_letters: str):
words_to_check = []
c1 = Counter(bunch_of_letters.lower())

for word in load_words():
    c2 = Counter(word.lower())
    if c2 & c1 == c2:
        words_to_check.append(word)

max_word = max_word_value(words_to_check)
return max_word,calc_word_value(max_word)

max_word_value - returns the word with the maximum value of the list given

calc_word_value - returns the word's score in scrabble.

load_words - return a list of the dictionary.

I'm currently using counters to do the trick but, the problem is that I'm currently on about 2.5 seconds per search and I don't know how to optimize this, any thoughts?

S.Haviv
  • 311
  • 2
  • 14

2 Answers2

0

Try this:

def find_optimal(bunch_of_letters):

    bunch_of_letters = ''.join(sorted(bunch_of_letters))

    words_to_check = [word for word in load_words() if ''.join(sorted(word)) in bunch_of_letters]

    max_word = max_word_value(words_to_check)

    return max_word, calc_word_value(max_word)

I've just used (or at least tried to use) a list comprehension. Essentially, words_to_check will (hopefully!) be a list of all of the words which are in your text file.

On a side note, if you don't want to use a gigantic text file for the words, check out enchant!

Adi219
  • 4,712
  • 2
  • 20
  • 43
  • @ElvirMuslic Actually, this **does** help the OP as it provides a different way to achieve what they want (which is more optimised (which is what they're asking for)). – Adi219 Apr 22 '18 at 11:01
  • @S.Haviv It would compile when you have the required dataset whicht the OP has (list comprehensions use variables which aren't predefined) – Adi219 Apr 22 '18 at 11:03
  • Yeah but you might mean for 'word' when you wrote 'words'? common for word in load_words() if ''.join(sorted(**words**)) – S.Haviv Apr 22 '18 at 11:04
  • @Adi219 We'll have to wait for OP's feedback on what exactly he's looking for as his code does not represent the minimal code for reproduction of the issue. The example wasn't quite clear however from the text above this is what I concluded (my answer) check it out let me know what you think! – innicoder Apr 22 '18 at 11:22
0
from itertools import permutations

theword = 'car' # or we can use input('Type in a word: ')

mylist = [permutations(theword, i)for i in range(1, len(theword)+1)]


for generator in mylist:
    for word in generator:
        print(''.join(word)) 
    # instead of .join just print (word) for tuple

Output:

c

a

r

ca

cr

... ar rc ra car cra acr arc rca rac

This will give us all the possible combinations (i.e. permutations) of a word. If you're looking to see if the generated word is an actual word in the English dictionary we can use This Answer

import enchant
d = enchant.Dict("en_US")
for word in mylist:
    print(d.check(word), word)

Conclusion:

If want to generate all the combinations of the word. We use this code:

from itertools import combinations, permutations, product

word = 'word' # or we can use input('Type in a word: ')

solution = permutations(word, 4)

for i in solution:
    print(''.join(i)) # just print(i) if you want a tuple
innicoder
  • 2,612
  • 3
  • 14
  • 29
  • 1
    The permutations function isn't an option, if the string length is above 10 it takes forever. – S.Haviv Apr 22 '18 at 11:13
  • 1
    This will not work in OP's case. Notice that in OP's example, the word car was given and "a" was a possible suggestion. – ninesalt Apr 22 '18 at 11:13
  • If you run the code @ninesalt you'll see that it works, just instead of `'word'` replace it with car. There you go, adjusted for the OP's code – innicoder Apr 22 '18 at 11:17
  • @S.Haviv you're quite right sir, do you have any suggestions? – innicoder Apr 22 '18 at 11:23
  • Don't use `permutations` as this algorithm **definitely isn't** the most efficient way – Adi219 Apr 22 '18 at 11:34
  • I don't agree on that@Adi219, I can probably come up with an algorithm on my own, however, I'm 200% sure that someone has given a lot of thought into making it efficient, especially as it's a part of Python modules that are pretty much built-in. Let me know if you have a solution, I'd love to see it work. Note that I've edited my answer to make it work on longer words (similar to a generator) instead of just having a big list and then working with it. – innicoder Apr 22 '18 at 11:41
  • @ElvirMuslic Sorry (my previous comment was ambiguous), I meant that using `permutations` isn't the most efficient way **for this task** (not that `permutations` itself isn't efficient ) – Adi219 Apr 22 '18 at 11:43
  • @Adi219 Oh, quite so. I'd love to see someone actually solve this.@S.Haviv please adjust your question to a single specific point, I'll then upvote it and more StackOverFlowers will come and help you out. However neither I nor Adi can run your code to reproduce the problem, I myself am just guessing that this was your issue, you haven't confirmed it. – innicoder Apr 22 '18 at 11:47
  • 1
    @ElvirMuslic I don't believe the OP has an actual runtime issue, but rather an optimisation issue (2.5 seconds per query). The main problem we face is that the OP is using a custom class (`Counter`), but they haven't shown its definition. I believe that it's likely that `Counter` is where the optimisation issue lies, but I can't ascertain that as the OP hasn't posted its definition – Adi219 Apr 22 '18 at 12:01
  • I agree we'll have to wait for feedback I'll ask this question in a half an hour if the OP doesn't respond just because I'm curious about the issue I'm facing currently ( if the len of the word is > 10). – innicoder Apr 22 '18 at 12:07