breadth first in python - computational max?

Question

I'm trying to implement the following "game" in python:

Given a start word, find the goal word with modifications allowed modification per step: remove or add any letter and permute

The task is to find the way from start to goal with the least steps.

My approach was to add/remove a letter, permute the resulting letters and look up each permutant in a dictionary.

This quickly results in long run times (for a 9-letter word and the first step approx 60sec).

Here is my code.

import time
from itertools import permutations
startword = 'croissant'

nodes = list()

nodes.append(startword)

dicts = set([line.rstrip('\n') for line in open('wordList.txt')])
alpha = set([chr(i) for i in range(ord('a'),ord('z')+1)])


def step(nodes):
    nnodes = list()
    for word in nodes:
        for s in word:
            new_word = word.replace(s, '', 1)
            perms = [''.join(p) for p in permutations(new_word)]
            for per in perms:
                if per in dicts:
                    nnodes.append(per)
       for s in alpha:
            new_word = word + s
            perms = [''.join(p) for p in permutations(new_word)]
            for per in perms:
                if per in dicts:
                    nnodes.append(per)
return set(nnodes)


btime = time.time()
step(nodes)
print time.time() - btime

How can I improve the performance/logic? We are specifically asked to use a breadth first search.

Does this code work? If yes, and you just want to ask how it is possible to _improve_ it, you're welcome to ask a question on [codereview.se]. — ForceBru, Nov 05 '16 at 16:33
yes, it does. my question was more about the concept I use.... — Leon, Nov 05 '16 at 16:42

CodeSurgeon · Accepted Answer · 2016-11-05T17:54:48.983

Interesting question! There is a way that you can dramatically improve the performance. Instead of checking each new word by traversing through the wordlist and checking if it is present, you can alphabetize the characters for each word in the wordlist and compare those directly to the alphabetized shortened/lengthened word of interest.

For example using croissant, I can alphabetize the characters to generate acinorsst instead. Then, I check to see if acinorsst with a character removed or added is in a defaultdict of lists where the keys are alphabetized strings and retrieve the corresponding value (which is the list of actual words corresponding to that alphabetical ordering).

It is much less confusing to explain in code, so I have posted it below:

from collections import defaultdict
import itertools as its
import time

def get_alpha_dicts():
    wa_dict = {}#word: alphabetized dict
    aw_dict = defaultdict(list)#alphabetized: corresponding WORDS dict
    for line in open('wordList.txt'):
        word = line.rstrip('\n')
        alpha_word = ''.join(sorted(word))
        wa_dict[word] = alpha_word
        aw_dict[alpha_word].append(word)
    return wa_dict, aw_dict    

def step(nodes, aw_dict):
    alpha = set([chr(i) for i in range(ord('a'),ord('z')+1)])
    nnodes = list()
    for word in nodes:
        alpha_word = ''.join(sorted(word))
        #remove a char from word
        short_words = set([''.join(w) for w in its.combinations(alpha_word, len(start_word)-1)])
        for short_word in short_words:
            nnodes.extend(aw_dict[short_word])
        #add a char to word
        long_words = [''.join(sorted(alpha_word + s)) for s in alpha]
        for long_word in long_words:
            nnodes.extend(aw_dict[long_word])
    return set(nnodes)

if __name__ == "__main__":
    start_word = 'croissant'
    nodes = list()
    nodes.append(start_word)
    wa_dict, aw_dict = get_alpha_dicts()
    btime = time.time()
    print step(nodes, aw_dict)
    print time.time() - btime

Hope it helps!

In that case, accept the answer when you get a chance. That way it does not remain in the list of unanswered questions on the site. — CodeSurgeon, Nov 05 '16 at 19:00
Any ideas on how to best "track" the way the algorithm takes to a given word, just logging the intermediate steps in the search tree? — Leon, Nov 05 '16 at 22:37
Creating a simple tree structure would probably be the most straightforward way. [This](http://stackoverflow.com/a/3010038/2588654) answer has a pretty succint way to walk through a tree recursively. In your case, you could make a Node class with two properties, `word` and `children`. — CodeSurgeon, Nov 05 '16 at 22:56

breadth first in python - computational max?

1 Answers1