I am writing a Python script that generates all possible permutations of a given string of letters and looks them up in an English dictionary, in order to find the words that can be formed from those letters.

Example input: roaispnba

Example output: soap brain

This is the code I have so far:

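import sys
import itertools

# The letters are read from the second command-line argument.
list_of_letters = sys.argv[2].lower()

def iterator(list_of_letters):
    # Try every ordering of the full set of letters.
    for word in itertools.permutations(list_of_letters):
        output = ''.join(word)
        # The dictionary file is re-read for each permutation.
        with open('words_alpha.txt') as file:
            for line in file:
                if output in line:
                    print(line.strip())

iterator(list_of_letters)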

When I give it a short string (e.g. niarb), it searches the dictionary and comes up with the word "brain". However, it is not able to separate the two words "soap" and "brain" when I give it the whole string.

Please note that what I need is:

  1. Get all possible combinations of the random string
  2. See if any of the combinations results in an English word, or maybe two or three

So, two questions:

  1. How can I make this code find the two separate words?
  2. This code probably looks horrible and inefficient to programmers. Any advice on doing such a task in a better/neater way?

Thaaanks!

Note:

Here is a similar question on SO but in C#: Find words in wordlist from random string of characters

Update:

Here is the link for the dictionary used (Github): https://github.com/dwyl/english-words/blob/master/words_alpha.txt

and here is a sample:

braies
brayette
braying
brail
brailed
brailing
braille
brailled
brailler
brailles
braillewriter
brailling
braillist
brails
brain
brainache
braincap
braincase
brainchild

1 Answer

I think the best way to go is to scan the dictionary using recursion:

import random
from collections import Counter

# Load the word list once, skipping blank lines.
with open('dictionary.txt') as f:
    dictionary = [line.strip() for line in f if line.strip()]

s = "roaispnba"

def get_words(letters, word_list):
    """Randomly pick words that can be built from the remaining letters."""
    available = Counter(letters)
    # Counter subtraction drops non-positive counts, so an empty result
    # means every letter of the word is still available.
    possibilities = [w for w in dictionary if not Counter(w) - available]
    if not possibilities:
        return word_list
    word = random.choice(possibilities)
    word_list.append(word)
    # Remove the letters used by the chosen word and recurse on the rest.
    remaining = available - Counter(word)
    return get_words(''.join(remaining.elements()), word_list)

final_words = get_words(s, [])
print(final_words)
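
Because the word is picked at random, each run returns only one of many possible decompositions, and it may leave letters unused. If you would rather list every combination, here is a minimal sketch of a backtracking search. It assumes that a valid result must use every letter exactly once (as in "soap brain") and that very short words should be skipped; the names find_word_sets and min_len are made up for this illustration.

```python
from collections import Counter

def find_word_sets(letters, words, min_len=3):
    """Yield each tuple of words that together use all of the letters."""
    pool = Counter(letters)
    # Keep only words that are long enough and fit inside the letters.
    candidates = sorted(
        w for w in set(words)
        if len(w) >= min_len and not Counter(w) - pool
    )
    counts = [Counter(w) for w in candidates]

    def search(start, remaining, chosen):
        if not remaining:  # every letter has been consumed
            yield tuple(chosen)
            return
        # Only look at candidates from `start` onward, so each
        # combination is produced once, in sorted order.
        for i in range(start, len(candidates)):
            if not counts[i] - remaining:  # word still fits
                chosen.append(candidates[i])
                yield from search(i, remaining - counts[i], chosen)
                chosen.pop()

    yield from search(0, pool, [])

# Toy word list for illustration only.
words = ["soap", "brain", "bar", "cat", "so"]
print(list(find_word_sets("roaispnba", words)))  # [('brain', 'soap')]
```

With the real word list you would pass the loaded dictionary as words. Raising min_len is one way to cut down the number of results, since the full list contains many very short entries.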
Ajax1234
  • Thank you very much! May I ask for further clarification? I am a little confused about the difference between dictionary and word_list. Also, I tried to copy/paste it but there are no results (although the dictionary contains the words mentioned). Is it because I still have to call the function? If so, what arguments am I supposed to pass? Thanks again! – He-is-learning Oct 13 '17 at 19:17
  • Thanks! The code always returns [] instead of actual words. Shall I try to change the [] in final_words = get_words(s, [])? – He-is-learning Oct 13 '17 at 19:25
  • @He-is-learning Do you mean that the code is not returning any words? If so, could you post a sample of your dictionary in your above post? – Ajax1234 Oct 13 '17 at 19:27
  • @He-is-learning Actually, never mind, I found a small bug in the code. Can you run it again? – Ajax1234 Oct 13 '17 at 19:28
  • I re-ran it but still the same thing. It only returns the brackets. I posted a Github link for the dictionary together with a sample in the questions – He-is-learning Oct 13 '17 at 19:33
  • @He-is-learning I made several alterations and now the code works. However, its output is slightly different from yours, because it is randomly choosing which words to use. If you do not want to do that, you will have to determine what additional rules you wish to use, such as word length, starting characters, etc. to weed through the many possibilities. – Ajax1234 Oct 13 '17 at 19:50