All permutations are showing, not just those that are English

Question

I'm trying to find a simple way to solve an anagram and display those anagrams that are English words on the return page. Currently this shows the permutations on the solver page and somewhat works but I'd like to show those that are actual words only.

Any advice is greatly appreciated.

@app.route('/', methods=['GET', 'POST'])
def index():
    if request.method == 'GET':
        return render_template('main.html')
    else:
        myLetters = request.form['letters']
        myLetters = ''.join(sorted(myLetters))
        myLetters = myLetters.strip()
        myWords = []
        myLetterList = list(myLetters)
        lettersLength = len(myLetterList)
        myWords = [''.join(result) for result in permutations(myLetters)]

        with open("/usr/share/dict/words") as defaultWords:
            for word in myWords:
                if word not in defaultWords:
                    myWords.remove(word)

        return render_template('solver.html', myLetters = myLetters, myWords = myWords)

Before displaying, just check each word against a standard English dictionary, and only display it if a match is found. — ZdaR, Apr 21 '15 at 09:27
In case it's not totally clear from Kos's answer, after you've done `if word not in defaultWords:` with your first word, you're at the end of the `defaultWords` file, and the file pointer isn't automatically re-positioned to the start of the file. So subsequent `in` tests will fail because you're basically testing if the word's in an empty line. BTW, it's not a good practice to `remove` from a list you're iterating over. It works ok here, but in general it's better to iterate over a copy of the list (eg myWords[:]). — PM 2Ring, Apr 21 '15 at 10:58

score 1 · Answer 1 · edited May 23 '17 at 11:57

Here's the problem:

if word not in defaultWords:

Using the in operator for a file has an unexpected result.

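File objects don't define __contains__, but they are iterators over their lines, so if word in file falls back to iterating over the lines. That consumes the file up to the first match (or to the end if there is none), and a line only matches if the word includes the trailing newline: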

In [1]: f = open('/usr/share/dict/words')

In [2]: 'black\n' in f
Out[2]: True

In [3]: 'black\n' in f
Out[3]: False

In [4]: f.seek(0)

In [5]: 'black\n' in f
Out[5]: True
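
The session above depends on the system word list. A minimal self-contained sketch of the same behaviour, assuming an in-memory io.StringIO can stand in for the open file (the three words are made up for the demo):

```python
import io

# Stand-in for an open dictionary file; the contents are invented for the demo.
f = io.StringIO("apple\nblack\ncherry\n")

first = "black\n" in f    # iterates from the start and stops at the match
second = "black\n" in f   # resumes after the match, hits the end, finds nothing
f.seek(0)                 # rewind to the beginning
third = "black\n" in f    # found again after rewinding
f.seek(0)
bare = "black" in f       # no trailing newline, so no line compares equal

print(first, second, third, bare)
```

The second test fails only because the first one already consumed the lines up to and including the match.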

Instead, make a set of all the words in the file (using strip to clear extra whitespace):

with open('/usr/share/dict/words') as f:
    words = set(line.strip() for line in f)

and use words for lookup.

Edit: once you have the set you might be tempted to do something like:

for word in myWords:
    if word not in words:
        myWords.remove(word)

but editing the list while iterating over it is a bad idea. Instead you can iterate over a copy:

for word in list(myWords):
    if word not in words:
        myWords.remove(word)

and voila, it works. But hey, words is a set now, so why bother with a loop? You can use set.intersection and simply say:

return words.intersection(myWords)
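
As a rough illustration of the three approaches above, here is a small sketch. The two-word set stands in for the dictionary and the variable names are hypothetical, not taken from the question:

```python
from itertools import permutations

# Tiny stand-in for the dictionary set built from the words file.
words = {"act", "cat"}
candidates = ["".join(p) for p in permutations("act")]

# Removing while iterating: each removal shifts the list, so the
# element right after a removed one is never examined.
buggy = list(candidates)
for word in buggy:
    if word not in words:
        buggy.remove(word)

# Iterating over a copy: every element is examined exactly once.
safe = list(candidates)
for word in list(safe):
    if word not in words:
        safe.remove(word)

# Set intersection: no explicit loop at all.
via_set = words.intersection(candidates)

print(buggy)    # "tac" survives even though it is not in words
print(safe)
print(via_set)
```

Note that intersection returns a set, so the original order is lost and duplicate permutations (which occur when the input has repeated letters) collapse into one entry.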

Exercise: how to avoid keeping the whole list of permutations myWords in memory at once?
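
One possible answer to the exercise, sketched under the assumption that the dictionary is already loaded into a set. The function name english_anagrams and the sample words are hypothetical:

```python
from itertools import permutations


def english_anagrams(letters, words):
    """Return the permutations of `letters` that appear in `words`."""
    # permutations() is lazy and the comprehension consumes it one tuple
    # at a time, so only the matching strings are ever stored.
    return {
        candidate
        for candidate in map("".join, permutations(letters))
        if candidate in words
    }


sample_words = {"act", "cat", "dog"}
found = english_anagrams("tac", sample_words)
print(found)
```

This only bounds memory. The loop still visits every permutation, so the running time grows factorially with the number of letters.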

answered Apr 21 '15 at 09:33 by Kos · edited May 23 '17 at 11:57 by Community

Thanks @Kos, I'm still seeing all perms with your suggestions in place. If you have any further advice I'd appreciate it.
`with open("/usr/share/dict/words") as defaultWords: words = set(line.strip() for line in defaultWords) for word in words: if word not in words: myWords.remove(word) defaultWords.seek(0)` Sorry for the formatting, I'm not understanding the comment format rules yet. – mnickey Apr 21 '15 at 22:06
@mnickey That's another problem :-) I've expanded my answer to cover this – Kos Apr 22 '15 at 08:42
@mnickey: You _can't_ format Python code in comments properly, so it's best restricted to single-line snippets. You can put code `into` `multiple` `segments`, but that doesn't really help much. It's best to just _mention_ the code in the comment and add the code itself to the bottom of your question (maybe separated from the original material using `
`). – PM 2Ring Apr 23 '15 at 11:40

score 0 · Answer 2 · answered Apr 22 '15 at 23:24

Thanks again @Kos. I resolved this a bit differently. While it's not super pretty, it works. I had to change /usr/share/dict/words to a file included in the package during deployment, but other than that it works. You can see it in action at anagrams.mnickey.com, or look at the repo at github.com/mnickey/anagrams

# Set up the control dictionary to read against:
# sorted letters -> list of words made of exactly those letters
from collections import defaultdict
from flask import Flask, render_template, request

app = Flask(__name__)  # assumed; the original snippet uses app without defining it

words = defaultdict(list)
with open("dictionary") as f:
    for word in f:
        word = word.strip()
        words[''.join(sorted(word))].append(word)

@app.route('/', methods=['GET', 'POST'])
@app.route('/anagrams/', methods=['GET', 'POST'])
def index():
    if request.method == 'GET':
        return render_template('main.html')
    else:
        # this is the original set of letters that I want anagrams for
        myLetters = request.form['letters']
        # some cleanup on those letters
        myLetters = ''.join(sorted(myLetters))
        # then assign those letters to 'word'
        word = myLetters.strip().lower()

        # check the letter set against the control group;
        # .get() avoids adding an empty entry to the defaultdict on a miss
        myWords = words.get(''.join(sorted(word)), [])
        return render_template('solver.html', myLetters=myLetters, myWords=myWords)
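
The lookup idea in this answer can be tried without Flask or a dictionary file. The following is a minimal sketch with a made-up word list; build_index and lookup are hypothetical helper names, not part of the deployed app:

```python
from collections import defaultdict


def build_index(word_list):
    """Map each sorted-letter signature to the words that share it."""
    index = defaultdict(list)
    for raw in word_list:
        word = raw.strip().lower()
        if word:  # skip blank lines
            index["".join(sorted(word))].append(word)
    return index


def lookup(index, letters):
    """Return the indexed words that use exactly the given letters."""
    key = "".join(sorted(letters.strip().lower()))
    # .get() leaves the defaultdict unchanged when the key is missing.
    return index.get(key, [])


index = build_index(["act", "cat", "dog", "god", "tac\n"])
print(lookup(index, " TCA "))
print(lookup(index, "xyz"))
```

Because every anagram of a word has the same sorted-letter signature, a single dictionary lookup replaces the whole permutation step, and the cost per request no longer depends on the factorial of the input length.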
