2

I try to find the longest anagram in Javascript. For this, I have an array with 10 letters and a dictionary that contains every words.

I would like that the program test every combination possible.

We started from 10 (the array length of letters) and we check if it's an anagram If not, we remove the char at the very end, and we check, if not, we shift the removed char by one to the left... When the entire combinations with 9 letters is tested, we test for 8, 7, 6, 5, 4, 3, 2 letters.

var wordFound = '' // The longest word found
var copyArr = [] // I don't manipulate the lettersChosen array, so I save a copy in copyArr
var savedWord = [] // A copy of copyArr but i'm not sure about this
var lengthLetters = 0 // The length of the numbers left
var lettersChosen = ['A', 'S', 'V', 'T', 'S', 'E', 'A', 'M', 'N'] //This the the array of letters

function isAnagram(stringA, stringB) {
    stringA = stringA.toLowerCase().replace(/[\W_]+/g, "");
    stringB = stringB.toLowerCase().replace(/[\W_]+/g, "");
 
    const stringASorted = stringA.split("").sort().join("");
    const stringBSorted = stringB.split("").sort().join("");

    return stringASorted === stringBSorted;
}

function checkForEachWord(arr) {
    strLetters = ''

    for (i in arr)
        strLetters = strLetters + arr[i]
    for (var i in file) 
        if (isAnagram(strLetters, file[i])) {
            wordFound = file[i]
            return true
        }
    return false
}

function getOneOfTheLongestWord() {
    lettersChosen.forEach(letter => {
        copyArr.push(letter) // I copy the array
    })
    var index = 1 // The index of the letter to remove
    var countLetter = 1 // How much letters I have to remove
    var position = copyArr.length - index // The actual position to remove
    var savedArray = [] // The copy of CopyArr but i'm not sure about that
    var iteration = 0 // The total of combination possible
    var test = checkForEachWord(copyArr) // I try with 10 letters
    if (test == true)
        return true // I found the longest word

    while (test == false) {
        copyArr.splice(position, 1) // I remove the char at current position
        index++ // Change letter to remove
        if (index > copyArr.length + 1) { // If I hit the first character, then restart from the end
            index = 1
            countLetter++ // Remove one more letter
        }
        console.log(copyArr + ' | ' + position)
        position = copyArr.length - index // Get the position based on the actual size of the array letters
        test = checkForEachWord(copyArr) // Test the anagram
        copyArr = [] // Reset array
        lettersChosen.forEach(letter => { // Recreate the array
            copyArr.push(letter)
        })
    }
    
    return true // Word found
}

getOneOfTheLongestWord()

My code is not optimal there is so many way to improve it.

Actually my output is good with 9 letters.

copyArr | position
A,S,V,T,S,E,A,M | 8
A,S,V,T,S,E,M,N | 6
A,S,V,T,S,A,M,N | 5
A,S,V,T,E,A,M,N | 4
A,S,V,S,E,A,M,N | 3
A,S,T,S,E,A,M,N | 2
A,V,T,S,E,A,M,N | 1
S,V,T,S,E,A,M,N | 0

But not with 8 letters, I don't see how I can use my countLetter to test all combinations...

Thank you very much.

Cobra braisé
  • 109
  • 2
  • 9
  • If you have a dictionary with words where the letters are sorted, then it is very easy to find anagrams. E.g. dictionary {'ACER': ['ACRE', 'CARE', 'RACE']} and input 'CREA', just sort input => 'ACER' and look up in dictionary. (btw combinaison = combination) – maraca Feb 04 '21 at 14:47
  • I would like to find the longest anagram, so I have to test all combinations while I found one. I don't really understand what you propose – Cobra braisé Feb 04 '21 at 15:13
  • Can anagrams be made out of any subset of the letters? – btilly Feb 04 '21 at 18:41
  • 1
    Maybe you can give some concrete examples - such as inputs and expected outputs to demo what is the problem or `area` that you would like to `improve`... – Daniel Hao Feb 04 '21 at 19:46
  • Are you only looking for single word anagrams such as `NIGHT` / `THING`? Or are you going for multiword ones, such as `GEORGE BUSH` / `HE BUGS GORE`? – Scott Sauyet Feb 04 '21 at 19:48
  • @DanielHao I would like to have all the combinations to find which word is the biggest anagram compared to letters given. A simplified example : ['A', 'S', 'V', 'T'] We have : ASV AST AVT SVT (3 letters) AS AV AT SV ST VT (2 letters) – Cobra braisé Feb 04 '21 at 20:19
  • @ScottSauyet Only words valid on scrabble, so simple word. – Cobra braisé Feb 04 '21 at 20:19
  • @btilly We can only use one time the letters given and we have to find the biggest word. – Cobra braisé Feb 04 '21 at 20:24

1 Answers1

3

Short answer, put the sorted versions of dictionary words into a trie, then do an A* search.

Longer answer because you probably haven't encountered those things.

A trie is a data structure which at each point gives you a lookup by character of the next level of the trie. You can just use a blank object as a trie. Here is some simple code to add a word to one.

function add_to_trie (trie, word) {
    let letters = word.split('').sort();
    for (let i in letters) {
        let letter = letters[i];
        if (! trie[letter]) {
            trie[letter] = {};
        }
        trie = trie[letter];
    }
    trie['final'] = word;
}

An A* search simply means that we have a priority queue that gives us the best option to look at next. Rather than implement my own priority queue I will simply use an existing one at flatqueue. It returns the lowest priority possible. So I'll use as a priority one that puts the longest possible word first, and if there is a tie then goes with whatever word we are farthest along on. Here is an implementation.

import FlatQueue from "flatqueue";

function longest_word_from (trie, letters) {
    let sorted_letters = letters.sort();
    let queue = new FlatQueue();
    // Entries will be [position, current_length, this_trie]
    // We prioritize the longest word length first, then the
    // number of characters.  Since we get the minimum first,
    // we make priorities negative numbers.
    queue.push([0, 0, trie], - (letters.length ** 2));

    while (0 < queue.length) {
        let entry = queue.pop();

        let position = entry[0];
        let word_length = entry[1];
        let this_trie = entry[2];

        if (position == letters.length) {
            if ('final' in this_trie) {
                return this_trie['final'];
            }
        }
        else {
            if (letters[position] in this_trie) {
                queue.push(
                    [
                        position + 1, // Advance the position
                        word_length + 1, // We added a letter
                        this_trie[letters[position]] // And the sub-trie after that letter
                    ],
                    - letters.length * (
                        letters.length + position - word_length
                      ) - word_length - 1
                );
            }

            queue.push(
                [
                    position + 1, // Advance the position
                    word_length,  // We didn't add a a letter
                    this_trie // And stayed at the same position.
                ],

                - letters.length * (
                    letters.length + position - word_length - 1
                  ) - word_length
            );
        }
    }
    return null;
}

If the import doesn't work for you, you can simply replace that line with the code from index.js. Simply remove the leading export default and the rest will work.

And with that, here is sample code that demonstrates it in action.

let file = ['foo', 'bar', 'baz', 'floop'];
let letters = 'fleaopo'.split('')

let this_trie = {};
for (var i in file) {
    add_to_trie(this_trie, file[i]);
}

console.log(longest_word_from(this_trie, letters));

If you have a long dictionary, loading the dictionary into the trie is most of your time. But once you've done that you can call it over and over again with different letters, and get answers quite quickly.

btilly
  • 43,296
  • 3
  • 59
  • 88
  • Thank you very much for your answer. It works very well, I found yesterday another algorithm but it was too long (because the dictionary file is really big), yours is faster. – Cobra braisé Feb 08 '21 at 14:49
  • 1
    @Cobrabraisé If you need it to be even faster, you can sort letters by frequency instead of by alphabetical order. That was one of the tricks that I used at https://stackoverflow.com/a/64309012/585411. – btilly Feb 08 '21 at 16:09
  • I tested your code with my dictionary and it was fast enough. But thank you for sharing this, I will keep that under my arm if I need it later – Cobra braisé Feb 08 '21 at 19:34