Idiomatic way to create n-ary cartesian product (combinations of several sets of parameters)

Question

To create all possible combinations of two sets of parameters and perform an action on them, you can do:

setOf(foo, bar, baz).forEach { a ->
    setOf(0, 1).forEach { b ->
        /* use a and b */
    }
}

However, if you have (potentially many) more parameters, this quickly turns into a pyramid of doom:

setOf(foo, bar, baz).forEach { a ->
    setOf(0, 1).forEach { b ->
        setOf(true, false, null).forEach { c ->
            setOf("Hello,", "World!").forEach { d ->
                /* use a, b, c and d */
            }
        }
    }
}

You could write this similarly with for loops, or differently like so:

val dAction = { d: String -> /* use a, b, c and d */ }
val cAction = { c: Boolean? -> setOf("Hello,", "World!").forEach(dAction) }
val bAction = { b: Int -> setOf(true, false, null).forEach(cAction) }
val aAction = { a: Any? -> setOf(0, 1).forEach(bAction) }
setOf(foo, bar, baz).forEach(aAction)

But I don't think that's any better, because there are some readability issues here: d, c, b and a's actions are written in reverse. Their type specifications cannot be inferred, so they must be specified. It's reversed sequentially compared to the pyramid of doom. The order of the sets providing the possible values should not matter, but it does: you just want to create any combinations from a bunch of sets, however, in this code every line depends on the previous.

It would be very nice to have an idiomatic way of doing something like Python's or Haskell's comprehensions, in which you (almost like the mathematical notation) can do something like:

{ /* use a, b, c and d */
    for a in setOf(foo, bar, baz),
    for b in setOf(0, 1),
    for c in setOf(true, false, null),
    for d in setOf("Hello,", "World!")
}

Which is very easy to read: there is no excessive indentation, the action you're interested in goes first, the data sources are very clearly defined, etc.

Side note: similar problems occur with flatMap-flatMap-...-flatMap-map.

Any ideas about how to neatly create n-ary cartesian products in Kotlin?

https://gist.github.com/kiwiandroiddev/fef957a69f91fa64a46790977d98862b — Jahan Zinedine, Nov 15 '21 at 09:52
@Jahan that is a nice compact solution for 2 inputs and the output uses the stdlib `Pair` tuple type, with type information. The same could be made for `Triple` tuples. See my answer below for a more general solution for any size. See other answers for other approaches, e.g. using Arrow-KT. That lib also provides typed tuple types for many numbers of parameters, e.g. see here: https://arrow-kt.io/docs/meta/apidocs/prelude/arrow.tuples/index.html. — Erik, Nov 15 '21 at 13:36

Erik · Accepted Answer · 2020-06-02T21:09:07.887

27

I've created a solution myself, so I don't have to add a dependency as suggested by Omar's answer.

I created a function that takes two or more sets of any size:

fun cartesianProduct(a: Set<*>, b: Set<*>, vararg sets: Set<*>): Set<List<*>> =
    (setOf(a, b).plus(sets))
        .fold(listOf(listOf<Any?>())) { acc, set ->
            acc.flatMap { list -> set.map { element -> list + element } }
        }
        .toSet()

Example:

val a = setOf(1, 2)
val b = setOf(3, 4)
val c = setOf(5)
val d = setOf(6, 7, 8)

val abcd: Set<List<*>> = cartesianProduct(a, b, c, d)

println(abcd)

Output:

[[1, 3, 5, 6], [1, 3, 5, 7], [1, 3, 5, 8], [1, 4, 5, 6], [1, 4, 5, 7], [1, 4, 5, 8], [2, 3, 5, 6], [2, 3, 5, 7], [2, 3, 5, 8], [2, 4, 5, 6], [2, 4, 5, 7], [2, 4, 5, 8]]

The function cartesianProduct returns a set of lists. There's a number of problems with these lists:

Any type information is lost, because the returned set contains lists that contain the union of types of the input sets. The returned type of these lists' elements is Any?. The function returns a Set<List<*>>, i.e. Set<List<Any?>>.
By definition, the size of the lists isn't known; they are not like a Kotlin Pair or Triple, where the size is a constant by definition. However, the size of these lists/tuples should be equal to the number of input sets, i.e. 4 in the example above.

However, using reflection, we can solve these problems. The action we want to take with every list can be written as a function (e.g. a constructor of some class, which is also just a function):

data class Parameters(val number: Int, val maybe: Boolean?) {
    override fun toString() = "number = $number, maybe = $maybe"
}

val e: Set<Int> = setOf(1, 2)
val f: Set<Boolean?> = setOf(true, false, null)

val parametersList: List<Parameters> = cartesianProduct(e, f).map { ::Parameters.call(*it.toTypedArray()) }

println(parametersList.joinToString("\n"))

Output:

number = 1, maybe = true
number = 1, maybe = false
number = 1, maybe = null
number = 2, maybe = true
number = 2, maybe = false
number = 2, maybe = null

The signature of the transform (::Parameters in the example) specifies the contract for the lists' contents.

Because map { ::Parameters.call(*it.toTypedArray()) } is not very nice, I've created a second extension function that does it for me:

fun <T> Set<List<*>>.map(transform: KFunction<T>) = map { transform.call(*it.toTypedArray()) }

With that, the code becomes quite idiomatic:

val parametersList: List<Parameters> = cartesianProduct(e, f).map(::Parameters)

The code is available from this GitHub Gist, where I will update it if I ever improve it. There are also tests: the cartesian product that includes any empty set returns the empty set, as is mathematically expected. I'm neither saying that this is an optimal solution, nor that it is mathematically sound (not every mathematical property is explicitly implemented and tested), but it works for the question's purpose.

edited Jun 02 '20 at 21:09

answered Dec 13 '18 at 14:19

Erik

4,305
3
36
54

With this answer there is still the issue of type variance, because the star projections don't know or care about the types in the sets that are passed to this functions. This can later result in issues when using the types, because they now have type `Any?` and have to be downcasted. – Erik Dec 13 '18 at 15:22
^ I was just about to mention this, you lose not only type inference, but also can make mistakes like `cartesianProduct(setOf(1, 2), setOf("a", "b")).forEach { (a, b, c) -> println("a=$a, b=$b, c=$c") }` an it will compile fine but crash at runtime – Omar Mainegra Dec 13 '18 at 15:49
Also the returned value is a `List` not a product type like `TupleN` or an *heterogeneous list* which can bring other issues (i.e. mistakely sort the list) – Omar Mainegra Dec 13 '18 at 15:55
Ah, I hadn't thought about sorting yet. In my specific use case the order doesn't matter, so it wasn't an issue. Although, on my Gist there is one test that asserts equality, which fails for a different order. This doesn't guarantee the order isn't touched, though. But my implementation should have a stable order. I'm still looking into a way to improve this implementation with knowledge about types and the number of elements. If you have ideas, feel free to contribute to my answer or elsewhere. – Erik Dec 13 '18 at 16:32
The only way I see it it to have a function of each arity, like: `fun cartesianProduct(a: List , b: List): List> = TODO()` `fun cartesianProduct(a: List , b: List, c: List): List> = TODO()` – Omar Mainegra Dec 13 '18 at 17:06
2

You're right. Looking at Arrow's source, they have data class definitions for tuples from two through twenty-two elements. Any larger amount of arguments is unlikely and probably is too computationally heavy. I'll leave my answer for reference as a naive implementation in which type information is erased, which still can be useful for whomever doesn't want Arrow as an extra dependency. – Erik Dec 14 '18 at 13:01
I've found another way, editing and updating the answer right now. – Erik Dec 17 '18 at 08:56
Your solution still lacks type safety, and will crash at runtime if the function passed as parameter is not the same arity or signature (i.e `fun abc(a: String, b: String, c: String)` – Omar Mainegra Dec 17 '18 at 16:54
A slight improvement to this implementation would be to not define it for zero or one input arguments: you cannot create the cartesian product of zero or one set; it's always two or more. Furthermore, it should be possible to also define the product operator for two sets, so that in code the cartesian product can be created by simply calling `val cartesianProduct = setA * setB`. – Erik Apr 29 '20 at 08:46
If one of the sets is empty, this will not work. A quick workaround is: if (set.isEmpty()) acc else acc.flatMap { list -> set.map{ element -> list + element } } – Al A Jun 02 '20 at 15:16
@AlA you are, unfortunately, incorrect, because _by definition_ the cartesian product of any set with the empty set is the empty set. That is what my solution does return, so there is no need for a workaround. – Erik Jun 02 '20 at 20:21
I am afraid this is wrong : `cartesianProduct(setOf(1,2),setOf(1,2)) => [[1],[2]]` – laurent Apr 25 '22 at 15:02

score 10 · Answer 2 · edited Jun 20 '20 at 09:12

10

I would recommend using Arrow-kt's Applicative on List. See the example:

val ints = listOf(1, 2, 3, 4)
val strings = listOf("a", "b", "c")
val booleans = listOf(true, false)

val combined = ListK.applicative()
    .tupled(ints.k(), strings.k(), booleans.k())
    .fix()

// or use the shortcut `arrow.instances.list.applicative.tupled`
// val combined = tupled(ints, strings, booleans)

combined.forEach { (a, b, c) -> println("a=$a, b=$b, c=$c") }

Which produces the Cartesian product

a=1, b=a, c=true

a=1, b=b, c=true

a=1, b=c, c=true

a=2, b=a, c=true

a=2, b=b, c=true

a=2, b=c, c=true

a=3, b=a, c=true

a=3, b=b, c=true

a=3, b=c, c=true

a=4, b=a, c=true

a=4, b=b, c=true

a=4, b=c, c=true

a=1, b=a, c=false

a=1, b=b, c=false

a=1, b=c, c=false

a=2, b=a, c=false

a=2, b=b, c=false

a=2, b=c, c=false

a=3, b=a, c=false

a=3, b=b, c=false

a=3, b=c, c=false

a=4, b=a, c=false

a=4, b=b, c=false

a=4, b=c, c=false

edited Jun 20 '20 at 09:12

Community

1
1

answered Dec 12 '18 at 19:44

Omar Mainegra

4,006
22
30

Nice to see that Arrow has this feature. You'd expect that from a functional programming lib. I prefer not to add an additional dependency just to be able to do this. This answer is great for users of Arrow, or looking for a reason to use Arrow. I'll await a different answer that might shed light on the problem with either some nifty Kotlin construct, or just a plain: no, it can't be done without ugly code. ;) – Erik Dec 13 '18 at 07:32
As of Arrow 0.12.0 this method is deprecated, and it does not work anymore with Arrow 0.13.0. Very unfortunate, I found it elegant and succinct. – Danilo Pianini Apr 09 '21 at 14:58

Anton Kushch · Answer 3 · 2020-06-10T22:21:58.250

Solution which lazily generates a sequence of results. It takes lists though not sets.

fun <T> product(vararg iterables: List<T>): Sequence<List<T>> = sequence {

    require(iterables.map { it.size.toLong() }.reduce(Long::times) <= Int.MAX_VALUE) {
        "Cartesian product function can produce result whose size does not exceed Int.MAX_VALUE"
    }

    val numberOfIterables = iterables.size
    val lstLengths = ArrayList<Int>()
    val lstRemaining = ArrayList(listOf(1))

    iterables.reversed().forEach {
        lstLengths.add(0, it.size)
        lstRemaining.add(0, it.size * lstRemaining[0])
    }

    val nProducts = lstRemaining.removeAt(0)

    (0 until nProducts).forEach { product ->
        val result = ArrayList<T>()
        (0 until numberOfIterables).forEach { iterableIndex ->
            val elementIndex = product / lstRemaining[iterableIndex] % lstLengths[iterableIndex]
            result.add(iterables[iterableIndex][elementIndex])
        }
        yield(result.toList())
    }
}

All the credits for the algorithm go to Per L and his answer here. Tested with generating 5x5 2-d arrays with char, on my 2 core machine takes ~4.4 seconds to generate 33554432 products.

score 2 · Answer 4 · answered Nov 21 '21 at 06:34

I've created a simple util at https://github.com/wellithy/pub/blob/master/src/main/kotlin/com/arxict/common/Cartesian.kt which supports N-D Cartesian product as well as a simple 2-D. I've chosen Sequence instead of Collection or Set.

typealias Family<T> = Sequence<T> // https://en.wikipedia.org/wiki/Indexed_family
typealias Tuple = Sequence<Any?> // https://en.wikipedia.org/wiki/Tuple
typealias CartesianProduct = Sequence<Tuple> // https://en.wikipedia.org/wiki/Cartesian_product

private val emptyTuple: Tuple = emptySequence()
private val zeroDCartesianProduct: CartesianProduct = sequenceOf(emptyTuple)

fun <T> Family<T>.toCartesianProduct(tuple: Tuple): CartesianProduct =
    map(tuple::plus)

fun <T> CartesianProduct.addFamily(family: Family<T>): CartesianProduct =
    flatMap(family::toCartesianProduct)

fun Sequence<Family<*>>.toCartesianProduct(): CartesianProduct =
    fold(zeroDCartesianProduct, CartesianProduct::addFamily)

fun <T, U> Family<T>.cartesianProduct(other: Family<U>): Sequence<Pair<T, U>> =
    flatMap { other.map(it::to) }

score 2 · Answer 5 · answered Jan 06 '22 at 06:56

2

/**
 * E.g.
 * cartesianProduct(listOf(1, 2, 3), listOf(true, false)) returns
 *  [(1, true), (1, false), (2, true), (2, false), (3, true), (3, false)]
 */

fun <T, U> cartesianProduct(c1: Collection<T>, c2: Collection<U>): List<Pair<T, U>> {
  return c1.flatMap { lhsElem -> c2.map { rhsElem -> lhsElem to rhsElem } }
}

Courtesy of https://gist.github.com/kiwiandroiddev/fef957a69f91fa64a46790977d98862b

answered Jan 06 '22 at 06:56

Jahan Zinedine

14,616
5
46
70

This is not n-ary (only works for two collections). – jayjw Dec 16 '22 at 21:32

Cam Hashemi · Answer 6 · 2020-04-29T09:59:11.150

1

Here's another way to do it with an arbitrary combiner function:

fun <T, U, V> product(xs: Collection<T>, ys: Collection<U>, combiner: (T, U) -> V): Collection<V> =
    xs.flatMap { x ->
        ys.map { y ->
            combiner(x, y)
        }
    }

operator fun <T, U> Set<T>.times(ys: Set<U>): Set<Set<*>> =
    product(this, ys) { x, y ->
        if (x is Set<*>) x + y // keeps the result flat
        else setOf(x, y)
    }.toSet()

operator fun <T, U> List<T>.times(ys: List<U>): List<List<*>> =
    product(this, ys) { x, y ->
        if (x is List<*>) x + y // keeps the result flat
        else listOf(x, y)
    }.toList()

// then you can do

listOf(1, 2, 3) * listOf(true, false)

// or

setOf(1, 2, 3) * setOf(true, false)

// you can also use product's combiner to create arbitrary result objects, e.g. data classes

edited Apr 29 '20 at 09:59

answered Apr 29 '20 at 09:53

Cam Hashemi

157
1
7

`operator fun Set.times(ys: Set): Set>` must return `Set>`, because e.g. `setOf(1, 2) * setOf(1, 2) must return (among other values) two (not one) lists (not sets) containing `1` and `2`. – Erik Apr 29 '20 at 11:15
Wait, that's an incorrect example in my previous comment. The return type must be `Set>`, but the example sets were incorrect. A correct example: `setOf(1) * setOf(1)` must return two ones `setOf(listOf(1, 1))`, i.e. `[[1, 1]]`, but in your answer you return just one one (`[[1]]`) in the resulting `setOf(setOf(1))`. – Erik Apr 29 '20 at 11:20

Tenfour04 · Answer 7 · 2020-06-15T13:15:27.077

1

Here's an adaptation of @Erik's answer that maintains type safety and can be used in a functional chain:

fun <T> Collection<Iterable<T>>.getCartesianProduct(): Set<List<T>> =
    if (isEmpty()) emptySet()
    else drop(1).fold(first().map(::listOf)) { acc, iterable ->
        acc.flatMap { list -> iterable.map(list::plus) }
    }.toSet()

Here's how I'd rewrite that solution for type safety if we wanted to maintain the requirement for an input size of at least two:

fun <T> cartesianProduct(a: Set<T>, b: Set<T>, vararg sets: Set<T>): Set<List<T>> =
    (setOf(a, b).plus(sets))
        .fold(listOf(listOf<T>())) { acc, set ->
            acc.flatMap { list -> set.map { element -> list + element } }
        }
        .toSet()

edited Jun 15 '20 at 13:15

answered Jun 14 '20 at 22:00

Tenfour04

83,111
11
94
154

Thanks for you contribution. I think this only maintains type safety because this is... typed by `T`. However, not all `Iterable` in the receiving `Collection` need to be of the same type! E.g. see my example inputs with `String`, `Int` and `Boolean`: their common super type is `Any`. Your solution probably works (I haven't tested it), but mine would just as well if typed by just `T`. The downside is that it only takes one type, i.e. `T`. The upside is that this is a preferable implementation if you only have one type `T` in your input iterables. – Erik Jun 15 '20 at 07:10
AFAIK the only fully generic and type-safe solution for a type system like Kotlin's at this time is to implement generic n-tuples (like `Pair` and `Triple`) (like Arrow-kt provides) for various values of n. And then define a cartesian product on them. – Erik Jun 15 '20 at 07:14
What's nice about this answer is that it can be chained in functional style. However, note that the input collection might be empty or contain just one iterable and the cartesian product is, mathematically, only defined for two or more input sets. That's difficult to grasp in a receiver. Also, again mathematically, the input collection should be of sets, not iterables, and the output a list of sets. And then there's also the option to put the inputs in an array, so that would be a suggestion to add this function for, too. – Erik Jun 15 '20 at 07:17
1

@Erik This doesn't enforce the inputs to be of the same type. Kotlin will infer the nearest common type, which could be `Any` or `Any?`. If they are of the same type, you get the benefit of the output already having that type, but even if they're not, a type of `` is more useful than `<*>`, which won't let you inspect the contents. If you wanted to be strict about the mathematical definition, you could throw an `IllegalArgumentException` when the size is less than 2, which I know isn't ideal, but is the only way to prevent it for this extension function pattern. See my edit. – Tenfour04 Jun 15 '20 at 13:04
This works: `val result: Set> = cartesianProduct(setOf(1), setOf("x"), setOf(2.toShort(), null))` – Tenfour04 Jun 15 '20 at 13:08

Idiomatic way to create n-ary cartesian product (combinations of several sets of parameters)

7 Answers7

Linked