First, a few pointers.
The separation of a set into disjoint subsets is called a set partition (Wikipedia, MathWorld).
A common way to encode a set partition is a restricted growth string.
The number of set partitions is a Bell number, and they grow fast: for a set of 20 elements, there are 51,724,158,235,372 set partitions.
Here is how encoding works.
Look at the elements in increasing order: 1, 2, 3, 4, ... .
Let c
be the current number of subsets, initially 0
.
Whenever the current element is the lowest element of its subset, we assign this set the number c
, and then increase c
by 1.
Regardless, we write down the number of the subset which contains the current element.
It follows from the procedure that the first element of the string will be 0
, and each next element is no greater than the maximum so far plus one. Hence the name, "restricted growth strings".
For example, consider the partition {1,3},{2,5},{4}
.
Element 1
is the lowest in its subset, so subset {1,3}
is labeled by 0
.
Element 2
is the lowest in its subset, so subset {2,5}
is labeled by 1
.
Element 3
is in the subset already labeled by 0
.
Element 4
is the lowest in its subset, so subset {4}
is labeled by 2
.
Element 5
is in the subset already labeled by 1
.
Thus we get the string 01021
.
The string tells us:
Element 1
is in subset 0
.
Element 2
is in subset 1
.
Element 3
is in subset 0
.
Element 4
is in subset 2
.
Element 5
is in subset 1
.
To get a feel of it from a different angle, here are all partitions of a four-element set, along with the respective reduced growth strings:
0000 {1,2,3,4}
0001 {1,2,3},{4}
0010 {1,2,4},{3}
0011 {1,2},{3,4}
0012 {1,2},{3},{4}
0100 {1,3,4},{2}
0101 {1,3},{2,4}
0102 {1,3},{2},{4}
0110 {1,4},{2,3}
0111 {1},{2,3,4}
0112 {1},{2,3},{4}
0120 {1,4},{2},{3}
0121 {1},{2,4},{3}
0122 {1},{2},{3,4}
0123 {1},{2},{3},{4}
As for pseudocode, it's relatively straightforward to generate all such strings.
We do it recursively.
Maintain the value c
, assign every number from 0
to c
inclusive to the current position, and for each such choice, recursively construct the rest of the string.
Also it is possible to generate them lazily, starting with a string with all zeroes and repeatedly finding the lexicographically next such string, akin to how next_permutation
is used to list all permutations.
Lastly, if you'd like to see more than that (along with the mentioned next
function), here's a bit of self-promotion.
Recently, we did a learning project at my university, which required the students to implement various functions for combinatorial objects with reasonable efficiency.
Here is the part we got for restricted growth strings; I linked the header part which describes the functions in English.