Introduction
What you want to do is called: cartesian product
Let's do some naming before going further. I will name your input lists L_i
where 1 <= i <= n
. I will also name S_i
the size of the input list L_i
.
We could ask the question: what is the size of the output ?
If there is only one list L_1
, there will be S_1
output lists, each one containing exactly one element of L_1
.
If there are two lists {L_1, L_2}
. For each element of L_1
I can append S_2
different elements of L_2
. As there are S_1
elements of L_1
it makes S_1*S_2
different output lists.
We can continue the reasoning to n
lists and prove that the number of output lists will be: S_1*...*S_n
.
How does that help us ? Because we can now create a mapping between a number i
and an output list.
Given i
a number 0<=i<S_1*...*S_n
, the output list contains:
element of L_1 at index i/(S_2*S_3*...*S_n) MOD S_1
element of L_2 at index i/( S_3*...*S_n) MOD S_2
...
element of L_n at index i MOD S_n
Implementation example
I don't know VB.net so I chose C# which uses the same .net platform.
I decided to use a yield return
function so that we don't allocate more memory than needed. If you just need to print the outputs it will only consume a single ulong
of memory instead of allocating a very big list of output lists.
using System;
using System.Collections.Generic;
using System.Linq;
namespace cartesian_product
{
class Program
{
public static IEnumerable<List<T>> cartesian_product<T>(IEnumerable<List<T>> lists)
{
ulong MAX_I = lists.Select(l => (ulong)l.Count)
.Aggregate(1ul, (a, b) => a * b);
for (ulong i = 0; i < MAX_I; ++i)
{
var output = new List<T>();
ulong div = MAX_I;
ulong index, s_i;
foreach (var l in lists)
{
s_i = (ulong)l.Count;
div /= s_i;
index = (i/div)%s_i;
output.Add(l[(int)index]);
}
yield return output;
}
}
static void Main(string[] args)
{
var first = new List<Char> {'a', 'b', 'c', 'd'};
var second = new List<Char> {'e'};
var third = new List<Char> {'f', 'g', 'h', 'i', 'j'};
Console.WriteLine("{");
foreach (var output in cartesian_product(new[]{first, second, third}))
{
Console.WriteLine("{{{0}}}", string.Join(",", output));
}
Console.WriteLine("}");
}
}
}
The output is:
{
{a,e,f}
{a,e,g}
{a,e,h}
{a,e,i}
{a,e,j}
{b,e,f}
{b,e,g}
{b,e,h}
{b,e,i}
{b,e,j}
{c,e,f}
{c,e,g}
{c,e,h}
{c,e,i}
{c,e,j}
{d,e,f}
{d,e,g}
{d,e,h}
{d,e,i}
{d,e,j}
}
Limitation
One may ask : what if the product of the lists length overflows the variable used to index the outputs ?
.
This is a real theoretical problem, but I use a ulong
in my code and if the total number of ouput lists overflows this variable there is little chance that you can enumerate the output whatever method you use. (because the theoretical output will contain more than 2^64
lists).
Applications
The OP didn't explain why he needed this algorithm in the first place. So the reader may wonder why is this useful ?
. One reason among others may be to generate test cases for regression testing. Let's say you have a legacy function taking as input three variables. You could generate some possible values for each of the parameters and using the cartesian product collect result of the function for each possible set of parameters. After refactoring the legacy code, you could ensure there is no regression by comparing the new code output and the legacy code output.