0

I want to take a set of 9 characters which is stored in an array and get all the combinations of strings that can be made with those characters. The result is then compared to a text file and if a word matches then that word should be output back to the user.

At the moment I'm using the itertools.combinations function to try and get all the combinations - the problem I am having is that the program starts with all the 9 letter combinations and if none of these combinations matches a word from the text file then the program tries 8 letter combinations and so forth until a word is found or there are no words that match.

lengthofword = 9
    foundwords = False
    count = 0
    while foundwords == False:
        wordcombos = [''.join(i) for i in itertools.combinations(letters,lengthofword)]
        #print(wordcombos)
        #print(len(wordcombos))
        count = 0
        for i in range(0,len(wordcombos)):
            if count == len(wordcombos)-1:
                print("gone through each word")
                if lengthofword > 0:
                    lengthofword = lengthofword -1
                elif lengthofword == 0:
                    print("there are no words with those letters")
                    foundwords = True
            elif wordcombos[i] in gamewords:
                print("the best anwser is " +wordcombos[i])
                foundwords = True
            elif wordcombos[i] not in gamewords:
                count = count + 1
                print(count)

What should happen is that the 9 length combinations of the characters provided are compared to the text file - the problem is that itertools is not showing all the combinations as some letters that are passed in are not used when the lengthofword is decreased to see if there are any smaller words that match. For example if the letters RQLCWUOUI are passed in the word "curl" should be present under the 4 letter word combinations but is not there.

V-R
  • 1,309
  • 16
  • 32
  • The approach of generating all combinations and checking each against the file looks rather inefficient, I suggest iterating through the file and checking for each word whether it can be made from the 9 letters. Or is there a reason why it has to be done this way? – Pete Oct 17 '19 at 10:45
  • possibly answered here > https://stackoverflow.com/questions/464864/how-to-get-all-possible-combinations-of-a-list-s-elements – Dusan Gligoric Oct 17 '19 at 10:46
  • @DusanGligoric I had a look at that but I don't think it works - my current solution does what that post says, however, some combinations are missing. – Amrik Ahluwalia Oct 17 '19 at 10:48
  • @Pete that way sounds more efficient how would you go about programming if a word can be made up of certain letters? – Amrik Ahluwalia Oct 17 '19 at 10:49
  • @AmrikAhluwalia created a solution that only allows word that can be composed of certain letters. Noticed that the permutation solutions miss words with repeated letters. Are repeated letters allowed (i.e. basically permutations with replacement)? – DarrylG Oct 17 '19 at 11:49

3 Answers3

0

Combinations gives unique combination not taking into account order, I. E. Abc with length 2 becomes ab ac and bc but not cb. Permutations does what you want

Pete
  • 79
  • 7
0

To generate all permutations with all possible lengths, I suggest going with itertools.permutations, and itertools.chain to concatenate iterators: you don't want to generate all these permutations in memory as there are way too much possibilities.

import itertools

letters = ["R", "Q", "L", "C", "W", "U", "O", "U", "I"]

allCombinations = itertools.chain(*(("".join(l) for l in itertools.permutations(letters, length)) for length in range(1, len(letters))))

Then you can iterate through all of them by using a simple for loop:

for word in allCombinations:
    # do stuff

This can yield duplicates if your initial set of letters contains duplicates.

Note that as said in some comments, this is a very inefficient way of finding words in a text as the number of look up to do is huge.

Corentin Pane
  • 4,794
  • 1
  • 12
  • 29
0

An alternative method that does not use combinations

Should be faster and more efficient since it does not have to generate the hundreds of thousands of permutations of 9 letters

def find_matches(words, letter_list):
  " finds words in text that only use letters from letter_list "

  letters = set(list(letter_list))    # make word list into a set of letters
  # Find words that only use letter from letter_list and are less than 9 characters in length
  return [word for word in words if all(len(word) <= 9 and letter in letters for letter in word)]

words = ['RQ', 'LUCI', 'RQL', 'OUILC', 'HELLO', 'RQLCWUOW','RQLCWUOUIRQLCWUOUI']

letters = "RQLCWUOUI"

print(find_matches(words, letters))

Output (matches found)

['RQ', 'LUCI', 'RQL', 'OUILC', 'RQLCWUOW']

Reasons for Non-Matches

'HELLO' - some letters not from list
'RQLCWUOUIRQLCWUOUI' more than 9 letters
DarrylG
  • 16,732
  • 2
  • 17
  • 23
  • what happens if the text you are searching through is in an Array format? – Amrik Ahluwalia Oct 17 '19 at 14:17
  • @AmrikAhluwalia are you say rather than having "test text" you would have ['t', 'e', 's', 't', ' ', 't', 'e', 'x', 't']? Is there still a delimiter between each word such as space or comma? Is so you can first have text = ''.join(arrayText), where arrayText is the array of text. – DarrylG Oct 17 '19 at 14:21
  • @DarryIG at the moment the letters that the words need to be made of are just stored as a string and all the words that are correct words are stored in an array such as ["A, Am, for all the words ...] – Amrik Ahluwalia Oct 17 '19 at 14:29
  • @AmrikAhluwalia are you saying for my example: letter_list = "RQLCWUOUI" (text string) and text = ['RQ,LUCI,RQL,OUILC,HELLO, RQLCWUOW,RQLCWUOUIRQLCWUOUI''] (comma delimited) – DarrylG Oct 17 '19 at 14:38
  • @AmrikAhluwalia--made mods to match input based upon your last comment – DarrylG Oct 17 '19 at 14:51
  • @DarryIG sorry for the poor description the text_list is an array of words such that it would look like text_array = ["word1", "word2" , "etc" ] rather than one long string that has commas in it – Amrik Ahluwalia Oct 17 '19 at 17:36
  • @AmrikAhluwalia--no problem, check the new answer. – DarrylG Oct 17 '19 at 17:48
  • @DarryIG thanks for your help its really appreciated – Amrik Ahluwalia Oct 17 '19 at 19:50