Partition and Composition (combinatorics) implementation in C++

Question

Given an M x N matrix, we want to fill each row with non-negative integers (>= 0) so that the row sums to a given value.

Note that the dimensions M and N are pre-computed with a formula, so they are guaranteed to fit the fill for the desired condition (i.e. sum_val below).

This is implemented in R in the partitions library.

library(partitions)

# In this example, we impose the condition
# that each row must sum to 2 in total
# and each row has 5 columns
sum_val <- 2
n <- 5
#The above two parameters are predefined.

t(as.matrix(compositions(sum_val, n)))
      [,1] [,2] [,3] [,4] [,5]
 [1,]    2    0    0    0    0
 [2,]    1    1    0    0    0
 [3,]    0    2    0    0    0
 [4,]    1    0    1    0    0
 [5,]    0    1    1    0    0
 [6,]    0    0    2    0    0
 [7,]    1    0    0    1    0
 [8,]    0    1    0    1    0
 [9,]    0    0    1    1    0
[10,]    0    0    0    2    0
[11,]    1    0    0    0    1
[12,]    0    1    0    0    1
[13,]    0    0    1    0    1
[14,]    0    0    0    1    1
[15,]    0    0    0    0    2

Is there any existing implementation in C++?

You mean you need random values meeting your constraint? Try [this](http://stackoverflow.com/a/16884017/2192494) — Lee Daniel Crocker, Jul 04 '13 at 04:26
@LeeDanielCrocker: not just any random number combinations. It has to be exhaustive. — neversaint, Jul 04 '13 at 06:00
Ah, so you want every partition, in random order, or in some predefined order? — Lee Daniel Crocker, Jul 04 '13 at 08:43
Vincent has it right, below. The n-partitions can be calculated from the n-1 combinations with replacement by taking successive differences. — Lee Daniel Crocker, Jul 04 '13 at 21:26

MvG · Accepted Answer · 2013-07-05T12:02:40.297

Recursive version

Here is a recursive solution. You have a sequence a where you keep track of the numbers you already have set. Each recursive call will assign valid numbers to one of these elements in a loop, before recursively calling that function for the remainder of the list.

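#include <cstdio>
#include <vector>

// Minimal helper assumed here: prints one row.
void print(const std::vector<int>& a) {
  for (int v : a) std::printf("%3d ", v);
  std::printf("\n");
}

// Try every value for a[pos], largest first, then recurse on the rest.
// Fields from pos onward are still 0 when remaining hits 0, so a is complete.
void recurse(std::vector<int>& a, std::size_t pos, int remaining) {
  if (remaining == 0) { print(a); return; }
  if (pos == a.size()) { return; }
  for (int i = remaining; i >= 0; --i) {
    a[pos] = i;
    recurse(a, pos + 1, remaining - i);
  }
}

void print_partitions(int sum_val, int n) {
  std::vector<int> a(n);
  recurse(a, 0, sum_val);
}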

Proof of concept run visible at http://ideone.com/oJNvmu.
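To sanity-check the recursion without printing anything, here is a minimal sketch that counts the rows instead. It assumes the row count should equal the standard number of weak compositions, C(sum_val + n - 1, n - 1), which gives 15 for sum_val = 2 and n = 5 and matches the R output in the question. The names count_rec, count_compositions and binomial are hypothetical helpers, not part of any library. The sketch also assigns the remainder directly to the last field rather than looping over it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: same recursion idea, but counts rows instead of printing.
// The last field simply receives whatever remains, so no loop is needed there.
void count_rec(std::vector<int>& a, std::size_t pos, int remaining,
               std::uint64_t& count) {
  if (pos + 1 == a.size()) { a[pos] = remaining; ++count; return; }
  for (int i = remaining; i >= 0; --i) {
    a[pos] = i;
    count_rec(a, pos + 1, remaining - i, count);
  }
}

// Number of ways to fill n non-negative fields so that they sum to sum_val.
std::uint64_t count_compositions(int sum_val, int n) {
  assert(sum_val >= 0 && n >= 1);
  std::vector<int> a(n);
  std::uint64_t count = 0;
  count_rec(a, 0, sum_val, count);
  return count;
}

// Binomial coefficient C(m, k); after step j, r holds C(m - k + j, j),
// so every division is exact (overflow is not handled in this sketch).
std::uint64_t binomial(std::uint64_t m, std::uint64_t k) {
  std::uint64_t r = 1;
  for (std::uint64_t j = 1; j <= k; ++j) r = r * (m - k + j) / j;
  return r;
}
```

For the example in the question, count_compositions(2, 5) and binomial(2 + 5 - 1, 5 - 1) should agree.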

Iterative version

Your comment below indicates a performance problem. While it seems very likely that I/O is eating most of your performance, here is an iterative solution which avoids the function call overhead of the recursive approach.

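#include <cstdio>
#include <vector>

void print_partitions(int sum_val, int n) {
  if (n < 1) return;                 // nothing to fill
  int pos = 0, last = n - 1;
  std::vector<int> a(n, 0);          // standard C++; int a[n] would need a gcc extension
  a[0] = sum_val;                    // start with everything in the first field
  while (true) {
    for (int i = 0; i != last; ++i)
        printf("%3d ", a[i]);
    printf("%3d\n", a[last]);
    if (a[last] == sum_val)          // everything is in the last field: done
      return;
    if (pos != last) {
      // move one token from the last non-zero field to its right neighbour
      --a[pos];
      ++pos;
      a[pos] = 1;
    }
    else {
      // find the previous non-zero field, take one token from it, and put
      // that token plus everything from the last field right after it
      for (--pos; a[pos] == 0; --pos);
      --a[pos];
      int tmp = 1 + a[last];
      ++pos;
      a[last] = 0;
      a[pos] = tmp;
    }
  }
}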

The general idea and the order in which things are printed is the same as for the recursive approach. Instead of maintaining a counter remaining, all the tokens (or whatever it is you are partitioning) are immediately dropped in the place where they belong for the next partition to be printed. pos is always the last non-zero field. If that is not the last, then you obtain the next partition by taking one token from pos and moving it to the place after that. If it is the last, then you take all tokens from that last place, find the last non-zero place before that and take one token from there as well, then dump all these tokens onto the place after the one where you took the single token.

Demo run at http://ideone.com/N3lSbQ.
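If the rows are meant for further processing rather than printing, the same loop can hand each row to a caller-supplied function. The following is only a sketch under that assumption: for_each_composition and count_rows are hypothetical names, and the termination check is placed right after the visit so that sum_val = 0 and n = 1 are also handled.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical variant of the iterative loop: calls visit(a) for every row
// instead of printing it. visit receives a const std::vector<int>&.
template <typename Visit>
void for_each_composition(int sum_val, int n, Visit visit) {
  assert(sum_val >= 0 && n >= 1);
  std::vector<int> a(n, 0);
  int pos = 0;                       // index of the last non-zero field
  const int last = n - 1;
  a[0] = sum_val;
  while (true) {
    visit(a);
    if (a[last] == sum_val) return;  // everything is in the last field: done
    if (pos != last) {
      --a[pos];                      // move one token one field to the right
      ++pos;
      a[pos] = 1;
    } else {
      for (--pos; a[pos] == 0; --pos) {}  // previous non-zero field
      --a[pos];
      const int tmp = 1 + a[last];
      ++pos;
      a[last] = 0;
      a[pos] = tmp;
    }
  }
}

// Example use: count the rows without formatting any output.
std::uint64_t count_rows(int sum_val, int n) {
  std::uint64_t rows = 0;
  for_each_composition(sum_val, n,
                       [&rows](const std::vector<int>&) { ++rows; });
  return rows;
}
```

Passing the visitor as a template parameter lets the compiler inline it, so the per-row cost is just whatever the caller does with the row.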

@neversaint: At first I thought (like some others) that you were asking for *random* partitions. Now that I know better, I have revised my answer. So this is a completely different post from my original one. Hope it helps. — MvG, Jul 04 '13 at 22:03
You save my life! Thanks so much! I learnt a lot from your post. — neversaint, Jul 05 '13 at 00:26
Is there any way I can speed up the algorithm? I tried with `sum_val=150` and `n=5`. R code takes `real 0m3.931s` and C++ `0m23.437s` — neversaint, Jul 05 '13 at 09:00
@neversaint: Some ideas: You could compile with optimization. You could delay the output till you have all the data. You could try different ways to print the result. You could catch the case `pos == a.size()-1` and assign the remaining count to the last field, thus avoiding the loop there. You could replace the recursive function call with a loop, although this would be a lot harder to read. If `n` will always be 5 you could hardcode that like Vincent suggested. You could look at what the R implementation does and copy that. — MvG, Jul 05 '13 at 09:18
@neversaint: I added an iterative solution as well. But it doesn't appear to be much faster. Did your time measurements include printing the result? Formatting and printing 22,533,126 lines for the case you mentioned is going to take some time, and I'd be surprised if R could do so much better on that account. But I guess you don't need to print these things, but do some other computation with them. Raw computation is `0.082s` here, as opposed to `12.681s` with output to `wc`. — MvG, Jul 05 '13 at 12:10
I can never thank you enough! You are very right, I will do further processing. No printing needed. — neversaint, Jul 05 '13 at 22:43
I'd like to add that MvG's code prints `compositions` of an integer - not `partitions`! The order of items matters in `compositions`, but it does not matter in `partitions`. e.g. [1,3,3,5] is the same `partition` as [1,3,5,3]...but it is a different `composition`. — George Robinson, Jul 06 '19 at 21:20

score 1 · Answer 2 · answered Jul 04 '13 at 13:27

You can implement it yourself: each such row is defined by 4 integers x[0..3] with 0 <= x[0] <= x[1] <= x[2] <= x[3] <= 2 (6 values if you count the fixed endpoints 0 and 2); the values in the corresponding row are just the successive differences x[0]-0, x[1]-x[0], x[2]-x[1], x[3]-x[2], 2-x[3]. If the number of columns (5) is fixed, you have 4 nested loops; if it is not, you can formulate the problem recursively.
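For the fixed case of 5 columns, the nested-loop idea could look roughly like the sketch below. The function name compositions5 is hypothetical, and the rows come out in a different order than in the R output.

```cpp
#include <array>
#include <cassert>
#include <vector>

// Hypothetical helper: all rows of 5 non-negative integers summing to sum_val,
// built from non-decreasing cut points 0 <= x0 <= x1 <= x2 <= x3 <= sum_val.
std::vector<std::array<int, 5>> compositions5(int sum_val) {
  assert(sum_val >= 0);
  std::vector<std::array<int, 5>> rows;
  for (int x0 = 0; x0 <= sum_val; ++x0)
    for (int x1 = x0; x1 <= sum_val; ++x1)
      for (int x2 = x1; x2 <= sum_val; ++x2)
        for (int x3 = x2; x3 <= sum_val; ++x3)
          // successive differences of 0, x0, x1, x2, x3, sum_val
          rows.push_back(std::array<int, 5>{
              {x0, x1 - x0, x2 - x1, x3 - x2, sum_val - x3}});
  return rows;
}
```

Because the cut points are non-decreasing, every difference is non-negative, and the five differences telescope to sum_val.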
