
I have tried to use this answer to find all permutations of size K=6 of an array of strings, but the array I'm permuting is far too large (~13,000 elements, though I can guarantee most of those will be duplicates), which means I'm getting:

....
re-permuting with 6925/12972
node:internal/console/constructor:257
   value: function(streamSymbol, string) {
                   ^
RangeError: Maximum call stack size exceeded  
    at console.value (node:internal/console/constructor:257:20)  
    at console.log (node:internal/console/constructor:359:26)  
    at permute (/home/path/to/code/permutation.js:22:17)  
    at permute (/home/path/to/code/permutation.js:23:9)  
    .....
re-permuting with 6924/12972
....
re-permuting with 6918/12972

And then it dies. I guessed that it's the recursion that's the problem.

I know that there are at most ~300 unique elements in my input (which is how I know that many of the array elements must be duplicates), but I don't know whether that's 10,000 instances of one element with the rest individually unique, or at least K copies of each unique element. Because of that I can't just feed them into a Set, and there may be fewer than K copies of one element, so I can't just create a new input by duplicating the unique ones K times.
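For example (a tiny made-up input, just to illustrate the problem), a Set collapses exactly the multiplicity information I need:

```javascript
// A Set collapses duplicates, so multiplicity is lost.
const input = ["a", "a", "b"];
const deduped = [...new Set(input)];
console.log(deduped); // ["a", "b"] - the second "a" is gone,
// so the valid 2-permutation ["a", "a"] could never be generated from it
```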

Here is my slightly modified (for readability and logging only) version of the code from the above-linked answer (on second look, this algorithm is far from optimal):

let permArr = [];
let usedChars = [];
let originalLength;
/**
 * Subsets of permutations of an array
 * @param {any[]} input array of elements to permute
 * @param {number} subsetLength size of the subsets to return
 * @returns {Array[]} an array of arrays
 */
function permute(input, subsetLength = input.length) {
    let index;
    let ch;
    originalLength ??= input.length;
    for (index = 0; index < input.length; index++) {
        ch = input.splice(index, 1)[0];
        usedChars.push(ch);
        if (input.length == 0) {
            let toAdd = usedChars.slice(0, subsetLength);
            // resizing the returned array to size k
            if (!permArr.includes(toAdd)) permArr.push(toAdd);
        }
        console.log(`re-permuting with ${input.length}/${originalLength}`)
        permute(input, subsetLength);
        input.splice(index, 0, ch);
        usedChars.pop();
    }
    return permArr;
}

And I found this answer, but I do not follow it at all; this other answer is similar but still uses recursion.

How can I make this non-recursive and/or more performant so it can handle much larger arrays? I'm using Node.js, and I'm not averse to using different data types.

AncientSwordRage
  • You're on the wrong path. Whatever you think you can accomplish by finding all permutations of 13000 elements, you can't. You need to find another way. https://www.wolframalpha.com/input/?i=13000%21 – Matt Timmermans Jan 22 '22 at 12:32
  • @matt it's more like https://www.wolframalpha.com/input/?i=13000%21%2F%28%2813000-6%29%21%29 but your point stands. My array will have some duplication, so I may be able to optimise or use an ArrayBuffer – AncientSwordRage Jan 22 '22 at 12:46
  • "Find all the permutations and then use the first k elements of each permutation" is so obviously inefficient that it's astounding that it gathers attention other than on the Daily WTF. However, as noted above, even if you did this efficiently, looping through all 6-permutations of an array of 13000 elements is not going to complete within your lifetime, unless there are a *lot* of duplicates. And if there are duplicates, you need an algorithm which takes duplicates into account. But "some duplication" doesn't sound like it's good enough to make this practical. – rici Jan 22 '22 at 16:03
  • @rici on second look, this algorithm is awful. Should be easy to improve on then? – AncientSwordRage Jan 22 '22 at 16:18
  • There are other answers to the same question :-) But none deal with duplicates. – rici Jan 22 '22 at 16:34
  • @rici I saw it, but the `document.write` really threw me off. I am fairly confident I can deduplicate my list to ~300 items without effecting the validity of the permutations. – AncientSwordRage Jan 22 '22 at 16:38
  • Ahhh I just realized I can't completely deduplicate the set – AncientSwordRage Jan 22 '22 at 16:45
  • I guess you need to specify your problem more precisely, if you want a precise answer :-) – rici Jan 22 '22 at 16:50
  • @rici great idea – AncientSwordRage Jan 22 '22 at 16:55
  • Meanwhile, I added [an answer](https://stackoverflow.com/a/70815230/1566221) to the linked question, which does correctly handle duplicates with reasonable efficiency. (Although there are algorithms which are more efficient in the case that there are a huge number of duplicates.) The maximum recursion depth in my answer (like several of the others) is the subset length, not the data length. So the *recursion depth* is quite practical for permutations of length 6. – rici Jan 22 '22 at 17:22

1 Answer


I don't know whether that's 10,000 instances of one element with the rest individually unique, or at least K copies of each unique element. Because of that I can't just feed them into a Set, and there may be fewer than K copies of one element, so I can't just create a new input by duplicating the unique ones K times

So just group and count them. Seems simple enough:

function subsetPermutations(arr, size) {
    // Count how many copies of each distinct element are available
    const counts = {};
    for (const el of arr) {
        counts[el] = (counts[el] ?? 0) + 1;
    }
    const unique = Object.keys(counts);
    // Single shared buffer: mutated in place and yielded repeatedly
    const result = Array.from({length: size});
    function* recurse(depth) {
        if (depth == size) {
            yield result;
        } else {
            for (const el of unique) {
                if (counts[el]) {       // an unused copy remains
                    result[depth] = el;
                    counts[el]--;       // take one copy
                    yield* recurse(depth + 1);
                    counts[el]++;       // put it back (backtrack)
                }
            }
        }
    }
    return recurse(0);
}

for (const perm of subsetPermutations(["a", "b", "b", "c", "c", "c"], 3)) {
    console.log(perm.join('-'));
}

I have tried to use this answer to find all permutations of size K=6, but the array I'm permuting is way too large (~13,000 elements), however I know that there's at most ~300 unique

That's still roughly 300^6 ≈ 7.3×10^14 permutations, which is far too many to put into an array. The function above is designed as an iterator so that you can work on the current result in a loop before it gets mutated in the next iteration, to avoid any allocation overhead, but it will still take too long to generate all of them.
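For instance, you can pull just a handful of permutations and stop, without ever materialising the rest (the function from above is repeated here so the snippet runs standalone; note the spread copy, since the yielded array is reused):

```javascript
// subsetPermutations repeated from above so this snippet is self-contained
function subsetPermutations(arr, size) {
    const counts = {};
    for (const el of arr) counts[el] = (counts[el] ?? 0) + 1;
    const unique = Object.keys(counts);
    const result = Array.from({length: size});
    function* recurse(depth) {
        if (depth == size) yield result;
        else for (const el of unique) {
            if (counts[el]) {
                result[depth] = el;
                counts[el]--;
                yield* recurse(depth + 1);
                counts[el]++;
            }
        }
    }
    return recurse(0);
}

// Take only the first three 2-permutations, then stop early -
// the generator never does the remaining work.
const kept = [];
for (const perm of subsetPermutations(["a", "b", "b", "c"], 2)) {
    kept.push([...perm]); // copy: the yielded array is mutated next iteration
    if (kept.length === 3) break;
}
console.log(kept); // [["a","b"],["a","c"],["b","a"]]
```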

How can I make this without recursion/more performant so it can handle much larger arrays? I'm using NodeJs, and I'm not averse to using different data types.

You can use a Map instead of the object for counts, but I doubt it'll be much faster for just 300 different elements.
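For what it's worth, the counting step with a Map would look like this (a sketch; `countElements` is just an illustrative name):

```javascript
// Sketch: the same grouping step with a Map instead of a plain object.
// Probably no big win for ~300 distinct keys, but Map avoids coercing
// keys to strings and prototype-key pitfalls.
function countElements(arr) {
    const counts = new Map();
    for (const el of arr) {
        counts.set(el, (counts.get(el) ?? 0) + 1);
    }
    return counts;
}

const counts = countElements(["a", "b", "b", "c", "c", "c"]);
console.log(counts.get("c")); // 3
```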

Avoiding recursion is unnecessary since it only goes 6 levels deep; there won't be a stack overflow, unlike with your inefficient solution. But for performance, you might still try this approach of dynamically generating nested loops.
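If you do want to go non-recursive, here is one possible sketch that replaces the call stack with an explicit index stack (my own naming and approach, not from the linked answer; unlike the generator above, this one yields copies):

```javascript
// Sketch of an iterative variant: an explicit index stack replaces recursion.
function* subsetPermutationsIter(arr, size) {
    const counts = {};
    for (const el of arr) counts[el] = (counts[el] ?? 0) + 1;
    const unique = Object.keys(counts);
    const idx = new Array(size).fill(-1); // idx[d] = choice made at depth d
    const result = new Array(size);
    let d = 0;
    while (d >= 0) {
        if (idx[d] >= 0) counts[unique[idx[d]]]++; // undo previous choice here
        let i = idx[d] + 1;
        while (i < unique.length && counts[unique[i]] === 0) i++; // next usable
        if (i === unique.length) { // depth exhausted: backtrack
            idx[d] = -1;
            d--;
            continue;
        }
        idx[d] = i;
        counts[unique[i]]--;
        result[d] = unique[i];
        if (d + 1 === size) {
            yield result.slice(); // full permutation; yield a copy
        } else {
            d++; // descend; idx[d] was reset to -1 when last backtracked
        }
    }
}

for (const perm of subsetPermutationsIter(["a", "b", "b"], 2)) {
    console.log(perm.join('-'));
}
```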

Bergi
  • I think you meant "but it *will* still take too long." – rici Jan 25 '22 at 03:43
  • @rici indeed. thanks! – Bergi Jan 25 '22 at 09:24
  • would making the counts part like `counts[el] = Math.min((counts[el] ?? 0) + 1, size);` be a possible optimization? – AncientSwordRage Jan 30 '22 at 13:19
  • @AncientSwordRage Not really, no - even if `counts[el] > size`, the recursion never goes deeper than `size`. What you possibly could do is split the `unique` elements into those with `counts[el] >= size` and those with `counts[el] < size`, then skip the count checking and counting up and down for those where you know that it won't ever reach `0`: `… else { for (const el of uniqueLimited) if (counts[el]) { counts[el]--; …recurse(…)…; counts[el]++; } for (const el of uniqueUnlimited) { …recurse(…)…; } }`. Notice that will mess with the order of results though. – Bergi Jan 30 '22 at 15:27
  • @AncientSwordRage An optimisation that actually might be worth doing (for inputs where the number of unique elements is close to the output `size`) is not to do the `if (counts[el])` check at every depth, and instead just remove it from the `unique` set when reaching `0` and add it back before incrementing. However, mutating `unique` while iterating it (multiple times even) is complicated; to get a theoretical complexity improvement you'd need to implement your own skip list that keeps the order and iteration position in face of removal/reinsertion - I doubt there's a speedup in practice. – Bergi Jan 30 '22 at 15:39
  • Would making `unique` a map with efficient adding/setting help with removing some keys from it; the values could be the counts? – AncientSwordRage Jan 31 '22 at 22:37
  • @AncientSwordRage You mean making it a `Map`? No, that won't work, as reinserting a key (after you're done with the recursive call) would cause it to be iterated twice (and repeatedly forever) in the `for … of` loop, which you absolutely need to avoid. Really, for `size = 6` there's not much to be gained. – Bergi Jan 31 '22 at 23:02
  • I'll stop trying to squeeze more out of it then thanks again for this, I've definitely learnt a thing or two from it. – AncientSwordRage Jan 31 '22 at 23:05