62

Do you know some neat Java libaries that allow you to make cartesian product of two (or more) sets?

For example: I have three sets. One with objects of class Person, second with objects of class Gift and third with objects of class GiftExtension.

I want to generate one set containing all possible triples Person-Gift-GiftExtension.

The number of sets might vary so I cannot do this in nested foreach loop. Under some conditions my application needs to make a product of Person-Gift pair, sometimes it is triple Person-Gift-GiftExtension, sometimes there might even be sets Person-Gift-GiftExtension-GiftSecondExtension-GiftThirdExtension, etc.

mgamer
  • 13,580
  • 25
  • 87
  • 145

11 Answers11

47

Edit: Previous solutions for two sets removed. See edit history for details.

Here is a way to do it recursively for an arbitrary number of sets:

public static Set<Set<Object>> cartesianProduct(Set<?>... sets) {
    if (sets.length < 2)
        throw new IllegalArgumentException(
                "Can't have a product of fewer than two sets (got " +
                sets.length + ")");

    return _cartesianProduct(0, sets);
}

private static Set<Set<Object>> _cartesianProduct(int index, Set<?>... sets) {
    Set<Set<Object>> ret = new HashSet<Set<Object>>();
    if (index == sets.length) {
        ret.add(new HashSet<Object>());
    } else {
        for (Object obj : sets[index]) {
            for (Set<Object> set : _cartesianProduct(index+1, sets)) {
                set.add(obj);
                ret.add(set);
            }
        }
    }
    return ret;
}

Note that it is impossible to keep any generic type information with the returned sets. If you knew in advance how many sets you wanted to take the product of, you could define a generic tuple to hold that many elements (for instance Triple<A, B, C>), but there is no way to have an arbitrary number of generic parameters in Java.

Andrea Ligios
  • 49,480
  • 26
  • 114
  • 243
Michael Myers
  • 188,989
  • 46
  • 291
  • 292
  • I think this is a really great way to deal with Pairs. If he does not know if he might need pairs, triples, quadrupels... then it's not directly fitting, but he could use Pair,GiftExtension> i think. – Lena Schimmel Apr 03 '09 at 15:03
  • 1
    The return type should be `Set>`, otherwise you might get sets of different size in the result (because duplicates). – Marco Jul 30 '19 at 11:31
  • If I change argument of ```cartesianProduct``` to ArrayList, the cartesian products is returned by reversed order. I mean, first element of cartesian product will be element of last set given. Why is that? – foobar Apr 13 '20 at 08:25
  • How to change the arguments to ```ArrayList>``` and return the method in ```ArrayList>``` data type? When I do modifications, the ordering of cartesian product changes. – foobar Apr 13 '20 at 09:01
  • 1
    @MuhammadAshfaq Probably the simplest way is to reorder the output afterwards. Sets don't have ordering, so my method doesn't concern itself at all with order. – Michael Myers Apr 13 '20 at 14:48
  • I get this error when number of pairs are equal to 3147776. Python successfully creates cartesian product but the code you provided, gives this error: ```Exception in thread "main" java.lang.OutOfMemoryError: Java heap space at java.util.HashMap$KeySet.iterator(HashMap.java:917) at java.util.HashSet.iterator(HashSet.java:173) at twelveBead.TwelveBeadScalednD._cartesianProduct(TwelveBeadScalednD.java:62)``` Even if I assign heap space to 10 gb, same error persists. It means there is some memory leak in this code. – foobar Apr 14 '20 at 04:33
  • `_cartesianProduct(0, new HashSet() {{ add(new HashSet() {{ add(1); add(2); }}); add(new HashSet() {{ add('a'); add('b'); }});}})` gives output of `[[[1, 2]], [[a, b]]]` which looks like not what the OP asked.... – dbl Nov 15 '20 at 20:20
29

This is a pretty old question, but why not use Guava's cartesianProduct?

Luan Nico
  • 5,376
  • 2
  • 30
  • 60
Martin Andersson
  • 1,801
  • 4
  • 21
  • 25
24

The method below creates the cartesian product of a list of list of strings:

protected <T> List<List<T>> cartesianProduct(List<List<T>> lists) {
    List<List<T>> resultLists = new ArrayList<List<T>>();
    if (lists.size() == 0) {
        resultLists.add(new ArrayList<T>());
        return resultLists;
    } else {
        List<T> firstList = lists.get(0);
        List<List<T>> remainingLists = cartesianProduct(lists.subList(1, lists.size()));
        for (T condition : firstList) {
            for (List<T> remainingList : remainingLists) {
                ArrayList<T> resultList = new ArrayList<T>();
                resultList.add(condition);
                resultList.addAll(remainingList);
                resultLists.add(resultList);
            }
        }
    }
    return resultLists;
}

Example:

System.out.println(cartesianProduct(Arrays.asList(Arrays.asList("Apple", "Banana"), Arrays.asList("Red", "Green", "Blue"))));

would yield this:

[[Apple, Red], [Apple, Green], [Apple, Blue], [Banana, Red], [Banana, Green], [Banana, Blue]]
imaurer
  • 83
  • 1
  • 4
Philipp Meister
  • 241
  • 2
  • 2
12

The number of sets might vary so I cannot do this in nested foreach loop.

Two hints:

  • A x B x C = A x (B x C)
  • Recursion
Michael Borgwardt
  • 342,105
  • 78
  • 482
  • 720
  • 1
    Hints may be good as a comment, but are not good as an answer. They take up a lot of space that can be used for answers. – Kröw Feb 21 '23 at 01:06
11

Index-based solution

Working with the indices is an alternative that is fast and memory-efficient and can handle any number of sets. Implementing Iterable allows easy use in a for-each loop. See the #main method for a usage example.

public class CartesianProduct implements Iterable<int[]>, Iterator<int[]> {

    private final int[] _lengths;
    private final int[] _indices;
    private boolean _hasNext = true;

    public CartesianProduct(int[] lengths) {
        _lengths = lengths;
        _indices = new int[lengths.length];
    }

    public boolean hasNext() {
        return _hasNext;
    }

    public int[] next() {
        int[] result = Arrays.copyOf(_indices, _indices.length);
        for (int i = _indices.length - 1; i >= 0; i--) {
            if (_indices[i] == _lengths[i] - 1) {
                _indices[i] = 0;
                if (i == 0) {
                    _hasNext = false;
                }
            } else {
                _indices[i]++;
                break;
            }
        }
        return result;
    }

    public Iterator<int[]> iterator() {
        return this;
    }

    public void remove() {
        throw new UnsupportedOperationException();
    }

    /**
     * Usage example. Prints out
     * 
     * <pre>
     * [0, 0, 0] a, NANOSECONDS, 1
     * [0, 0, 1] a, NANOSECONDS, 2
     * [0, 0, 2] a, NANOSECONDS, 3
     * [0, 0, 3] a, NANOSECONDS, 4
     * [0, 1, 0] a, MICROSECONDS, 1
     * [0, 1, 1] a, MICROSECONDS, 2
     * [0, 1, 2] a, MICROSECONDS, 3
     * [0, 1, 3] a, MICROSECONDS, 4
     * [0, 2, 0] a, MILLISECONDS, 1
     * [0, 2, 1] a, MILLISECONDS, 2
     * [0, 2, 2] a, MILLISECONDS, 3
     * [0, 2, 3] a, MILLISECONDS, 4
     * [0, 3, 0] a, SECONDS, 1
     * [0, 3, 1] a, SECONDS, 2
     * [0, 3, 2] a, SECONDS, 3
     * [0, 3, 3] a, SECONDS, 4
     * [0, 4, 0] a, MINUTES, 1
     * [0, 4, 1] a, MINUTES, 2
     * ...
     * </pre>
     */
    public static void main(String[] args) {
        String[] list1 = { "a", "b", "c", };
        TimeUnit[] list2 = TimeUnit.values();
        int[] list3 = new int[] { 1, 2, 3, 4 };

        int[] lengths = new int[] { list1.length, list2.length, list3.length };
        for (int[] indices : new CartesianProduct(lengths)) {
            System.out.println(Arrays.toString(indices) //
                    + " " + list1[indices[0]] //
                    + ", " + list2[indices[1]] //
                    + ", " + list3[indices[2]]);
        }
    }
}
Manos Nikolaidis
  • 21,608
  • 12
  • 74
  • 82
Remko Popma
  • 121
  • 1
  • 3
3

Here is an Iterable, which allows you to use a simplified for-loop:

import java.util.*;

// let's begin with the demo. Instead of Person and Gift, 
// I use the well known char and int. 
class CartesianIteratorTest {

    public static void main (String[] args) {
        List <Object> lc = Arrays.asList (new Object [] {'A', 'B', 'C', 'D'});
        List <Object> lC = Arrays.asList (new Object [] {'a', 'b', 'c'});   
        List <Object> li = Arrays.asList (new Object [] {1, 2, 3, 4});
            // sometimes, a generic solution like List <List <String>>
            // might be possible to use - typically, a mixture of types is 
            // the common nominator 
        List <List <Object>> llo = new ArrayList <List <Object>> ();
        llo.add (lc);
        llo.add (lC);
        llo.add (li);

        // Preparing the List of Lists is some work, but then ...    
        CartesianIterable <Object> ci = new CartesianIterable <Object> (llo);

        for (List <Object> lo: ci)
            show (lo);
    }

    public static void show (List <Object> lo) {
        System.out.print ("(");
        for (Object o: lo)
            System.out.print (o + ", ");
        System.out.println (")");
    }
}

How is it done? We need an Iterable, to use the simplified for-loop, and an Iterator has to be returned from the Iterable. We return a List of Objects - this could be a Set instead of List, but Set has no indexed access, so it would be a bit more complicated, to implement it with Set instead of List. Instead of a generic solution, Object would have been fine for many purposes, but generics allow for more restrictions.

class CartesianIterator <T> implements Iterator <List <T>> {

    private final List <List <T>> lilio;    
    private int current = 0;
    private final long last;

    public CartesianIterator (final List <List <T>> llo) {
        lilio = llo;
        long product = 1L;
        for (List <T> lio: lilio)
            product *= lio.size ();
        last = product;
    } 

    public boolean hasNext () {
        return current != last;
    }

    public List <T> next () {
        ++current;
        return get (current - 1, lilio);
    }

    public void remove () {
        ++current;
    }

    private List<T> get (final int n, final List <List <T>> lili) {
        switch (lili.size ())
        {
            case 0: return new ArrayList <T> (); // no break past return;
            default: {
                List <T> inner = lili.get (0);
                List <T> lo = new ArrayList <T> ();
                lo.add (inner.get (n % inner.size ()));
                lo.addAll (get (n / inner.size (), lili.subList (1, lili.size ())));
                return lo;
            }
        }
    }
}

The mathematical work is done in the 'get'-method. Think about 2 sets of 10 elements. You have a total of 100 combinations, enumerated from 00, 01, 02, ... 10, ... to 99. For 5 X 10 elements 50, for 2 X 3 elements 6 combinations. The modulo of the sublist size helps to pick one element for each iteration.

Iterable i the least interesting thing here:

class CartesianIterable <T> implements Iterable <List <T>> {

    private List <List <T>> lilio;  

    public CartesianIterable (List <List <T>> llo) {
        lilio = llo;
    }

    public Iterator <List <T>> iterator () {
        return new CartesianIterator <T> (lilio);
    }
}

To implement Iterable, which allows the for-each kind of loop, we have to implement iterator (), and for Iterator we have to implement hasNext (), next () and remove ().

Result:

(A, a, 1, )
(B, a, 1, )
(C, a, 1, )
(D, a, 1, )
(A, b, 1, )
(B, b, 1, )
(C, b, 1, )
(D, b, 1, )
...
(A, a, 2, )
...
(C, c, 4, )
(D, c, 4, )
user unknown
  • 35,537
  • 11
  • 75
  • 121
3

Here is an Iterator that gives the cartesian product of a two-dimensional array, where the arrays components represent the sets from the question (one can always convert actual Sets to arrays):

public class CartesianIterator<T> implements Iterator<T[]> {
    private final T[][] sets;
    private final IntFunction<T[]> arrayConstructor;

    private int count = 0;
    private T[] next = null;

    public CartesianIterator(T[][] sets, IntFunction<T[]> arrayConstructor) {
        Objects.requireNonNull(sets);
        Objects.requireNonNull(arrayConstructor);

        this.sets = copySets(sets);
        this.arrayConstructor = arrayConstructor;
    }

    private static <T> T[][] copySets(T[][] sets) {
        // If any of the arrays are empty, then the entire iterator is empty.
        // This prevents division by zero in `hasNext`.
        for (T[] set : sets) {
            if (set.length == 0) {
                return Arrays.copyOf(sets, 0);
            }
        }
        return sets.clone();
    }

    @Override
    public boolean hasNext() {
        if (next != null) {
            return true;
        }

        int tmp = count;
        T[] value = arrayConstructor.apply(sets.length);
        for (int i = 0; i < value.length; i++) {
            T[] set = sets[i];

            int radix = set.length;
            int index = tmp % radix;

            value[i] = set[index];

            tmp /= radix;
        }

        if (tmp != 0) {
            // Overflow.
            return false;
        }

        next = value;
        count++;

        return true;
    }

    @Override
    public T[] next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }

        T[] tmp = next;
        next = null;
        return tmp;
    }
}

The basic idea is to treat count as a multi-radix number (digit i has its own radix which equals the length of the i'th "set"). Whenever we have to resolve next (that is, when hasNext() is called and next is null), we decompose the number into its digits in this multi-radix. These digits are now used as the indices from which we draw elements from the different sets.

Example use:

String[] a = { "a", "b", "c"};
String[] b = { "X" };
String[] c = { "r", "s" };

String[][] abc = { a, b, c };

Iterable<String[]> it = () -> new CartesianIterator<>(abc, String[]::new);
for (String[] s : it) {
    System.out.println(Arrays.toString(s));
}

Output:

[a, X, r]
[b, X, r]
[c, X, r]
[a, X, s]
[b, X, s]
[c, X, s]

If one doesn't like arrays, the code is trivially convertible into using collections.

I guess this is more or less similar to the answer given by "user unknown", only without recursion and collections.

bisgardo
  • 4,130
  • 4
  • 29
  • 38
0

You can get the Cartesian product of an arbitrary number of sets of different types and store it into a set of sets of objects Set<Set<Object>> using Java 9 Streams as follows:

Try it online!

public static Set<Set<Object>> cartesianProduct(Set<?>... sets) {
    // incorrect incoming data
    if (sets == null) return Collections.emptySet();
    return Arrays.stream(sets)
            // non-null and non-empty sets
            .filter(set -> set != null && set.size() > 0)
            // represent each set element as Set<Object>
            .map(set -> set.stream().map(Set::<Object>of)
                    // Stream<Set<Set<Object>>>
                    .collect(Collectors.toSet()))
            // summation of pairs of inner sets
            .reduce((set1, set2) -> set1.stream()
                    // combinations of inner sets
                    .flatMap(inner1 -> set2.stream()
                            // merge two inner sets into one
                            .map(inner2 -> Stream.of(inner1, inner2)
                                    .flatMap(Set::stream)
                                    .collect(Collectors.toCollection(
                                            LinkedHashSet::new))))
                    // set of combinations
                    .collect(Collectors.toCollection(LinkedHashSet::new)))
            // returns Set<Set<Object>>, otherwise an empty set
            .orElse(Collections.emptySet());
}
public static void main(String[] args) {
    Set<Integer> set1 = Set.of(1, 2, 3);
    Set<String> set2 = Set.of("A", "B", "C");
    Set<Object> set3 = Set.of(new Time(0));

    Set<Set<Object>> sets = cartesianProduct(set1, set2, set3);
    // output
    sets.forEach(System.out::println);
}

Output:

[1, A, 03:00:00]
[1, B, 03:00:00]
[1, C, 03:00:00]
[2, A, 03:00:00]
[2, B, 03:00:00]
[2, C, 03:00:00]
[3, A, 03:00:00]
[3, B, 03:00:00]
[3, C, 03:00:00]

See also: How to create a data structure similar to the cartesian product of three lists of different types?

0

a simple solution, for example, for Integer set should be as follows:

void printCombination(List<Set<Integer>> listSet, Set<Integer> combination) {
    if (listSet.isEmpty()) {
        System.out.println("a combination :" + combination);

        return;
    }

    Set<Integer> intSet = listSet.get(0);
    for (Integer it : intSet) {
        Set<Integer> combination1 = new HashSet<Integer>();
        combination1.addAll(combination);
        combination1.add(it);

        List<Set<Integer>> listSet1 = new ArrayList<Set<Integer>>();
        listSet1.addAll(listSet);
        listSet1.remove(0);
        this.printCombination(listSet1, combination1);
    }

} 
Anh Pham
  • 13
  • 3
0

Yes, there is Functional Java.

For a set s:

s.bind(P.p2(), s);
Ole Pannier
  • 3,208
  • 9
  • 22
  • 33
  • note that fj.data.Set does not have a bind method, but it does have toStream() and iterableSet(Iterable) to convert to/from fj.data.Stream which does have a bind method. – Apocalisp Apr 03 '09 at 22:50
-1

You can apply a simple generator interferace named Seq via this library. Unlike Guava's cartesianProduct, the sets/lists/iterables do not need to be within the same generic type bound B, all types are welcome. And all you need to do is writing the product as normal nested for-loops.

public static void main(String[] args) {
    List<String> ls1 = Arrays.asList("a", "b");
    List<Integer> ls2 = Arrays.asList(1, 2);
    List<Character> ls3 = Arrays.asList('x', 'y');

    Seq<String> seq = c -> {
        for (String s : ls1) {
            for (Integer i : ls2) {
                for (Character d : ls3) {
                    c.accept(s + i + d);
                }
            }
        }
    };

    System.out.println(seq.toSet());
}

The result would be

[a2x, a1y, b1x, b2y, a1x, a2y, b2x, b1y]
wolray
  • 24
  • 3