2

given letters: example of letters

letters = 'hutfb' 

I am given a file with a list of words.

I need to write a recursive function that allows me to check all possibilities the letters can make. If the possibility is in the list of words from the file, I need to print that specific word.

so for letters given

they can create the words:

  • a
  • cat
  • ac
  • act
  • cab

and so on and on

each combination the letters make I need to check the file to see if its a valid word. if it is I need to print them.

I don't know how start to write this function.

thecode
  • 23
  • 3
  • 1
    Your first step is to clarify the requirements. Can a letter be repeated in the word? For example, could `tat` be acceptable for your given example? – Rory Daulton May 31 '18 at 22:58
  • no it can not unless the letters = t t a c – thecode May 31 '18 at 23:44
  • Seems like you are looking subsets of an array. Take a look at https://stackoverflow.com/questions/26332412/python-recursive-function-to-display-all-subsets-of-given-set – Teoman shipahi Jun 01 '18 at 01:20
  • @Teomanshipahi: The OP is looking at all permutations of all non-empty subsets of an array. – Rory Daulton Jun 01 '18 at 08:11

4 Answers4

1

Unfortunately I cannot help just now with a recursive function, but given that a higher count of letters/characters can easily explode into billions of potential combinations if not filtered during creation I have a quirky alternative by iterating over the known words. Those have to be in memory anyway.

[EDIT] Removed the sorting as it does not really provide any benefit, fixed an issue where I falsely set to true on iteration

# Some letters, separated by space
letters = 'c a t b'
# letters = 't t a c b'

# # Assuming a word per line, this is the code to read it
# with open("words_on_file.txt", "r") as words:
#     words_to_look_for = [x.strip() for x in words]
#     print(words_to_look_for)

# Alternative for quick test
known_words = [
    'cat',
    'bat',
    'a',
    'cab',
    'superman',
    'ac',
    'act',
    'grumpycat',
    'zoo',
    'tab'
]

# Create a list of chars by splitting
list_letters = letters.split(" ")

for word in known_words:
    # Create a list of chars
    list_word = list(word)
    if len(list_word) > len(list_letters):
        # We cannot have longer words than we have count of letters
        # print(word, "too long, skipping")
        continue

    # Now iterate over the chars in the word and see if we have
    # enough chars/letters
    temp_letters = list_letters[:]

    # This was originally False as default, but if we iterate over each
    # letter of the word and succeed we have a match
    found = True
    for char in list_word:
        # print(char)
        if char in temp_letters:
            # Remove char so it cannot match again
            # list.remove() takes only the first found
            temp_letters.remove(char)
        else:
            # print(char, "not available")
            found = False
            break

    if found is True:
        print(word)

You could copy&paste a product function from the itertools documentation and use the code provided by ExtinctSpecie, it has no further dependencies, however I found without tweaking it returns all potential options including duplications of characters which I did not immediately understand.

def product(*args, repeat=1):
    # product('ABCD', 'xy') --> Ax Ay Bx By Cx Cy Dx Dy
    # product(range(2), repeat=3) --> 000 001 010 011 100 101 110 111
    pools = [tuple(pool) for pool in args] * repeat
    result = [[]]
    for pool in pools:
        result = [x+[y] for x in result for y in pool]
    for prod in result:
        yield tuple(prod)
d parolin
  • 194
  • 6
  • As you ultimately only want to print the possible words from letters that also show up in a list of actually/correct/known words I chose this path.To be such a match I iterate over each word that could be valid and then check if all of these characters are contained in the initial letters, to make it easier sorted. Still think this will yield the result asked for. You will not be able to know all possible words from the letters provided without having such a "dictionary". – d parolin Jun 01 '18 at 15:52
  • Actually I just retested and it does not work exact enough, e.g. my code will accept 'tabb' although there is only one 'b' and I assume an issue surrounding the removal of characters. Will try to provide an update + make sure you can see all permutations in case you need them. – d parolin Jun 01 '18 at 17:47
  • Thanks for the clarification, but how can you know it is a potential word without either using the known-words file in the first place or use another "algorithm" that will guess it could be a word. The function that will work with the letters would have to have some sort of knowledge what could be a word before it selectively can check against your word file. If this is important, I am sure it can be done, but it will definitely be much more complicated than iterating over even a million known words. – d parolin Jun 01 '18 at 21:19
  • the file has a list of valid words. so if the combination is in the file then that means its a valid word. – thecode Jun 01 '18 at 21:21
  • Could you elaborate if you have memory limitations (like must be running on embedded MicroPython with 512K memory or something similar ? As I cannot comment yet on @cdlane answer, I would like to understand at what size this will break. Given that answer it should actually only have a line (word) at a time in memory. If you tried and have an error message let us know. – d parolin Jun 01 '18 at 21:25
  • Sorry that the solution does not seem to work for you, given your comment I still think my and @cdlane answer will solve that problem accurately, if knowing the words in your file that the letters provided can make up is the real desired result you are aiming for. – d parolin Jun 01 '18 at 21:27
  • what does ur product function do? – thecode Jun 02 '18 at 02:28
  • I just copied it from the python documentation as stated in my answer, it resembles what itertools.product will do and could be used more in @ExtinctSpecie code not mine. It will produce every possible combination of a given series of chars, but that includes all duplications too, so it will not work as you expect without modification – d parolin Jun 02 '18 at 10:55
1

I agree with @dparolin about processing the words file to see if words conform to the letters, not generating possible words and seeing if they are in the file. This allows us not to read the file into memory, as we only need to examine one word at a time. And it can be done with a recursive test:

letters = 'catbt'

def is_match(letters, word):

    if not word:
        return True

    if not letters:
        return False

    letter = letters.pop()

    if letter in word:
        word.remove(letter)

    return is_match(letters, word)

with open('words.txt') as words:
    for word in words:
        word = word.strip()

        if is_match(list(letters), list(word)):
            print(word)

EXAMPLE USAGE

% python3 test.py
act
at
bat
cab
cat
tab
tact
%

And we should be able to work with lots of letters without issue.

cdlane
  • 40,441
  • 5
  • 32
  • 81
  • this works if the file is little but if it is too long text file, you get an error. – thecode Jun 01 '18 at 20:51
  • @thecode, I tested this with a file containing over 200,000 words, one per line. What format and size file are you working with? – cdlane Jun 01 '18 at 21:52
  • @thecode, I retested this code with a file containing over 300,000 words and the alphabet tripled as `letters` and it ran fine. The recursion depth shouldn't be any more than the shorter of `letters` or `word`, which would have to be close to 1,000 characters to cause a `RecursionError`. Did you modify the code? What are you using for `letters`? Is there some sort of deviant entry in the file tripping the code? – cdlane Jun 02 '18 at 04:59
0
import itertools
str = "c a t b"
letters = list(str.replace(" ",""))
words_to_look_for = []

for index, letter in enumerate(letters):
    keywords = [''.join(i) for i in itertools.product(letters, repeat = index+1)]
    words_to_look_for.extend(keywords)

print(words_to_look_for)

https://stackoverflow.com/questions/7074051/....

0

As noted above, this will not be feasible for any more than the number of letters countable on your two hands. There's just too many possibilities to check. But if you were to try this, here's how the code would look.

letters = ['a', 'b', 'c']

def powerset(letters):
    output = [set()]
    for x in letters:
        output.extend([y.union({x}) for y in output])
    return output

for subset in powerset(letters):
    for potential_word in map(''.join, itertools.permutations(list(subset))):
        # Check if potential_word is a word

This won't try words with duplicate letters (that would be another layer of insanity), but it will try all possible potential words that could be formed by a subset of the letters you give in any order.

[edit] Just realized you requested a recursive solution. Dunno if that's required or not, but the powerset function could be changed to be recursive. I think that would make it uglier and harder to comprehend, though.

BowlingHawk95
  • 1,518
  • 10
  • 15