4

Recently I had an interview and I was asked to write a algorithm to find the minimum number of 1 letter changes to get from a particular of word to a given word , i.e. Cat->Cot->Cog->Dog

I dont want the solution of the problem just guide me through How I can use BFS in this algorithm ?

Max
  • 9,100
  • 25
  • 72
  • 109
  • I think you're looking for a better algorithm than BFS. – irrelephant Aug 04 '12 at 20:51
  • A standard BFS implementation will be fine. Just keep in mind for a BFS: "[Priority] Queue" .. otherwise, what have you tried? (See the standard [wikipedia articles](http://en.wikipedia.org/wiki/Breadth-first_search) for overviews .. which explain a BFS implementation rather simply.) –  Aug 04 '12 at 20:57
  • 2
    Are you sure there are not other requirements? Otherwise you can just scan the word, flipping characters as required. – akappa Aug 04 '12 at 21:10
  • This is confusing ? Are the two words of equal length ? What's wrong with akappa's solution ? – Razvan Aug 04 '12 at 21:44
  • Do the intermediate "words" need to be actual words from a given dictionary? Or would "Cat->Dat->Dag->Dog" be acceptable, even though "Dat" is not a word? – j_random_hacker Aug 05 '12 at 01:12
  • duplicate http://stackoverflow.com/questions/1521958/shortest-path-to-transform-one-word-into-another/12103304#12103304 – Rusty Rob Aug 24 '12 at 04:55

4 Answers4

6

according to this scrabble list, the shortest path between cat and dog is: ['CAT', 'COT', 'COG', 'DOG']

from urllib import urlopen

def get_words():
    try:
        html = open('three_letter_words.txt').read()
    except IOError:
        html = urlopen('http://www.yak.net/kablooey/scrabble/3letterwords.html').read()
        with open('three_letter_words.txt', 'w') as f:
            f.write(html)

    b = html.find('<PRE>') #ignore the html before the <pre>
    while True:
        a = html.find("<B>", b) + 3
        b = html.find("</B>", a)
        word = html[a: b]
        if word == "ZZZ":
            break
        assert(len(word) == 3)
        yield word

words = list(get_words())

def get_template(word):
    c1, c2, c3 = word[0], word[1], word[2]
    t1 = 1, c1, c2
    t2 = 2, c1, c3
    t3 = 3, c2, c3
    return t1, t2, t3

d = {}
for word in words:
    template = get_template(word)
    for ti in template:
        d[ti] = d.get(ti, []) + [word] #add the word to the set of words with that template

for ti in get_template('COG'):
    print d[ti]
#['COB', 'COD', 'COG', 'COL', 'CON', 'COO', 'COO', 'COP', 'COR', 'COS', 'COT', 'COW', 'COX', 'COY', 'COZ']
#['CIG', 'COG']
# ['BOG', 'COG', 'DOG', 'FOG', 'HOG', 'JOG', 'LOG', 'MOG', 'NOG', 'TOG', 'WOG']

import networkx
G = networkx.Graph()

for word_list in d.values():
    for word1 in word_list:
        for word2 in word_list:
            if word1 != word2:
                G.add_edge(word1, word2)

print G['COG']
#{'COP': {}, 'COS': {}, 'COR': {}, 'CIG': {}, 'COT': {}, 'COW': {}, 'COY': {}, 'COX': {}, 'COZ': {}, 'DOG': {}, 'CON': {}, 'COB': {}, 'COD': {}, 'COL': {}, 'COO': {}, 'LOG': {}, 'TOG': {}, 'JOG': {}, 'BOG': {}, 'HOG': {}, 'FOG': {}, 'WOG': {}, 'NOG': {}, 'MOG': {}}

print networkx.shortest_path(G, 'CAT', 'DOG')
['CAT', 'OCA', 'DOC', 'DOG']

As a bonus we can get the farthest:

print max(networkx.all_pairs_shortest_path(G, 'CAT')['CAT'].values(), key=len)
#['CAT', 'CAP', 'YAP', 'YUP', 'YUK']
Rusty Rob
  • 16,489
  • 8
  • 100
  • 116
  • Thanks Dude!!! I had to install python to run it but it was worth it .. awesome algo :D – Max Aug 05 '12 at 09:28
  • 5
    The Homework tag means you should give guidelines, and NOT complete answers, as written in the tag description: `This lets potential answerers know that they SHOULD guide the student in solving the problem, and SHOULD NOT simply show the complete answer.` Please avoid doing so in the future – amit Aug 05 '12 at 10:47
  • Does CAT->OCA or OCA->DOC constitute a single letter change? I was under the assumption that this was a "word ladder" and there was no swapping of order, i.e. Hamming distance and such. – Maple Mar 18 '13 at 21:22
  • Maple, you're right. for CAT, t1 = c1, c2 = C, A. for OCA, t3 = c2, c3 = C, A. t1 = t3, so it matches them. If you want to avoid t1 matching t3, add an index to each template: t1 = 1, c1, c2 t2 = 2, c1, c3 t3 = 3, c2, c3 – Rusty Rob Mar 18 '13 at 21:43
  • we then get ['CAT', 'COT', 'COG', 'DOG'] – Rusty Rob Mar 18 '13 at 21:43
5

At first sight I thaught about Levenshtein distance but you need to use BFS. So I think that you should start from building tree. Given word should be root and then next nodes are words with changed first letter. Next next nodes have changed second letter. When you build the graph you use BFS and when you found new word store the path length. At the end of algorithm choose minimal distance.

janisz
  • 6,292
  • 4
  • 37
  • 70
0
  1. Begin with just the starting word in your path set.

  2. If the ending word of any path in your path set is the desired word, stop, that path is the desired path.

  3. Replace each path in your path set with every possible path that starts with that path but is one word longer.

  4. Go to step 2.

David Schwartz
  • 179,497
  • 17
  • 214
  • 278
0

If we start to build a directed acyclic graph from the destination word to the source word, in a breadth-wise fashion, and we do a dictionary look-up to verify if we have seen the word earlier in the tree while adding the word, then the first occurrence of the source word,should give the shortest path in the reverse direction from the 'target word' to the 'source word'.

From this we can print the path from the 'source' to the 'target'

Arvind
  • 466
  • 3
  • 9