3

I wrote the following algorithm to complete a word ladder. You can find the full program with the jQuery visualization and the json dictionary on my github here: https://github.com/NateBrady23/algorithms/tree/master/word_ladder

So I have very little knowledge of what the names of the algorithms I'm using are, and the names of the data structures I'm building out other than I'm filling some arrays with each level of Nodes.

I start at a given input word and end word. I eliminate all words in the dictionary that I can't use (the input word and words not the same length as the input word). I build a Node object that holds the jump level it's on, a value, and a prev so I can traverse backward from my endpoint if I find a path.

Starting at level 0, my input word, I find all words that could be created from that word by transforming one letter and save it into a bucket of level 1 wordNodes. Each of those Node objects prev property point back to the level 0 node it was found from. I then delete all of those words from the dictionary. I then go through all the level 1 wordNodes and for each one, find the transformed word from the modified dictionary and store it in level 2 and point back to the word that created it and so on until I find the target word or no new levels can be created (no path to my word).

So given that probably poorly worded summary, is there a name for this type of structure I've built (like a binary search tree?) and the algorithm I used to get there. Also, is there a way to significantly improve the code below without using an entirely different structure/algorithm.

<style>
    span {
        margin: 10px;
    }

    .highlight {
        background-color: yellow;
        font-weight: bold;
    }
    span {
        display: inline-block;
    }
</style>
<div id="path">

</div>

<script src='js/jquery.min.js'></script>
<script src='js/dictionary.js'></script>
<script>
// var dict = JSON.parse('{"aa" : true, "aah" : true, "aahed" : true}');

    // Build our dictionary array
    // Create a Node object so we can traverse back up the tree
    function Node(value, prev, level) {
        this.prev = prev;
        this.level = level;
        this.value = value;
    }

    function getRegex(word) {
        var temp = [];

        for(var i = 0; i < word.length; i++){
            temp.push( word.substring(0,i) + '.' + word.substring(i+1) );
        }  
        var re_str = '^(' + temp.join('|') + ')$';
        return new RegExp(re_str, 'i');
    }

    function oneOffs(wordNode) {
        var list = [];
        var regex = getRegex(wordNode.value);

        if ($('#' + (wordNode.level+1)).length == 0) {
            $('#path').append('<div id="' + (wordNode.level+1) + '"></div>');
            $('#' + (wordNode.level+1)).append('<h1>Level: ' + (wordNode.level+1));
        }
        for (var word in dict) {
                // If we found a match
                if (dict.hasOwnProperty(word) && regex.test(word)) {
                    list.push(new Node(word, wordNode, wordNode.level + 1));
                    $('#' + (wordNode.level+1)).append('<span id="' + word + '">' + word +'</span>');
                    delete dict[word];
                }
        }
        return list;
    }

    // Get our start and end words
    inputWord = 'boggle';
    endWord = 'shaggy';

    $('#path').append('<div id="start"><h1>Transform Case</h1><span><b>Starting Word:</b> ' + inputWord + '</span><span><b>Ending Word:</b> ' + endWord + '</span></div>');
    $('#path').append('<div id="0"><h1>Level: 0</h1><span id="' + inputWord + '">' + inputWord + '</span></div>');

    // Have to remove our inputWord from the dictionary
    delete dict[inputWord];

    // Remove all words in dictionary not of proper length
    for (var word in dict) {
        if (dict.hasOwnProperty(word)) {
            if (word.length != inputWord.length) {
                delete dict[word];
            }
        }
    }


    // Create input and end nodes
    inputNode = new Node(inputWord, null, 0);
    endNode = new Node(endWord, null, 0);

    // Create our initial list of level 1 nodes
    list = oneOffs(inputNode);

    for (var node in list) {
        if (list[node].value === endNode.value) {
            endNode.level = list[node].level;
            endNode.prev = list[node].prev;
        }
    }

    // Build our tree
    while (list.length && !endNode.level) {
        newList = [];
        for (var i = 0; i < list.length; i++) {
            newNodeList = oneOffs(list[i]);
            for (var node in newNodeList) {
                if (newNodeList[node].value === endNode.value) {
                    endNode.level = newNodeList[node].level;
                    endNode.prev = newNodeList[node].prev;
                    break;
                }
            }
            if (newNodeList.length) {
                newList = newList.concat(newNodeList);
            }
        }
        list = newList;
    }

    // if we found the path, let's traverse back up and print out all the values
    if (endNode.level) {
        curr = endNode;
        $('#' + curr.value).addClass('highlight');
        console.log(curr.value);
        while(curr.prev) {
            $('#' + curr.prev.value).addClass('highlight');
            console.log(curr.prev.value);
            curr = curr.prev;
        }
    } else {
        console.log('No possible path');
    }


</script>
Nate
  • 6,384
  • 3
  • 25
  • 30
  • What would an example of input word, output word and path be? Seems you are trying to do something similar (or equal?) to this: http://stackoverflow.com/questions/1521958/shortest-path-to-transform-one-word-into-another – juvian Sep 11 '15 at 18:13
  • Yes I am doing exactly that. It's the word ladder problem. I just want to know what the algorithm/structure I used is called and if it can be optimized without changing the type of algorithm/structure I've built. – Nate Sep 11 '15 at 18:21
  • 1
    Well, you are using a similar approach to BFS algorithm, start at initial node (your input word), add all its neighbours (in your case 1 letter difference words), after that, add all the neighbours of your level 1 words, and keep doing that until you either find the word or you don´t have more nodes to process – juvian Sep 11 '15 at 18:27
  • There are probably better ways with other algorithms or structures, but as long as you have enough memory this one is quite efficient. You might have some implementation optimizations though on how you tranform/detect the words – juvian Sep 11 '15 at 18:29
  • [This comment](http://stackoverflow.com/a/1522384/1243641) claims that such a breadth-first search is optimal. I don't know for sure that such is true. But that comment also offers a very readable proof that this problem is NP-hard. – Scott Sauyet Sep 11 '15 at 18:43
  • @ScottSauyet yeah, bfs is probably the optimal choice, but there could be several implementation optimizations. For instance, I don´t know how he picks the level 1 nodes, but instead of iterating over all nodes to check if the difference is 1, you could sort words and retrieve the possible combinations in letters log words time, being words the amount of words and letters the amount of letters in the input word. This is just an example, not necessarily applies to this case – juvian Sep 11 '15 at 19:04
  • 1
    Doing a two-sided BFS might help a lot if there are many searches. – David Eisenstat Sep 11 '15 at 19:29
  • 2
    @Nate might have some bugs, but here is my implementation using bfs but both from start word and end word. That way when you reach a word from both sides, that's the shortest path. Here is my code: http://jsfiddle.net/juvian/y357xudt/3/ and here is a better explanation: http://yucoding.blogspot.com/2013/08/leetcode-question-127-word-ladder.html – juvian Sep 12 '15 at 01:03

1 Answers1

3

The comments have noted that this is a breadth-first search, but I don't know of any specific name for this algorithm. And of course your data structure is some sort of tree.

As to optimizations, I haven't looked at the code very carefully, but one thing leaps out at me. To move from one step to another, you construct a complex regex to match each word at a particular level and then loop through the entire dictionary to see if any word in it matches this regex. It would certainly be more efficient to build the actual possible words and check whether each one is in the dictionary. That is, if you have foo, check your dictionary for aoo, boo, coo, ... zoo, fao, fbo, fco, ... fzo, foa, fob, foc, ... foz. Since you're already removing visited words from the dictionary, you won't get multiple new hits for foo in this list.

That should significantly reduce the number of checks you have to make, and it will change from a regex test to a simple property lookup. I'm guessing it would be a substantial boost to performance.

Scott Sauyet
  • 49,207
  • 4
  • 49
  • 103
  • Creating a new list for every possible combination of foobar and then checking it against the dictionary would be faster? Wouldn't that create 30.9 million lookups to just create the first level? – Nate Sep 11 '15 at 19:38
  • 1
    No, 26 * (length of word). 26 * 6 for `foobar`. It's not 26 ^ 6, since we're only changing one letter at a time. That's 156 look-ups instead of `(dictionary.length)` regex tests. – Scott Sauyet Sep 11 '15 at 19:57
  • It certainly might be worth testing. Your "boggle -> shaggy" looks like a good test candidate for comparison. – Scott Sauyet Sep 11 '15 at 19:59
  • This was orders of magnitude faster. Awesome! – Nate Sep 11 '15 at 20:35
  • @Nate I did say this in my comments, though Scott has explained better ;). Another possible optimization would be to use heuristics. You can calculate all hamming distances of your words with the final word, and when you move a level, first process the ones that are "closer" to your end word. This might work better for cases where you have a valid path – juvian Sep 11 '15 at 20:50
  • @juvian thanks, I guess I needed an explanation in simpler terms. I'm fairly new to the language of discussing algorithms so 'heuristics' and 'hamming' are terms I have to look up and wrap my head around. – Nate Sep 11 '15 at 20:53
  • @Nate are you looking for the shortest path or any path? – juvian Sep 11 '15 at 21:01
  • @juvian shortest path – Nate Sep 11 '15 at 21:12
  • @Nate: Glad it worked. I expected it, but as always, it really helps to actually try these things. :-) – Scott Sauyet Sep 12 '15 at 15:46
  • @Nate, the suggestion from juvian for using Hamming distances is a really good one, if the goal is to most quickly find any solution. Since your goal is to find the shortest path, there is probably nothing algorithmically better than what you're doing, although there might well be optimizations available to your particular implementation. The most likely one I see is the suggestion to try from both ends to meet in the middle. – Scott Sauyet Sep 12 '15 at 21:00