1

A friend of mine asked me recently if I could make in R a list of N numbers separated by step that sum up to 1. For example, all the possible combinations of 3 numbers in the vector seq(0.1,1,0.1) that sum 1. The order is not important.

I've been working with expand.grid but as the number increases, the memory requirement explodes. Since the order is unimportant we thought to reduce the matrix by only considering some of the combinations. If we make it so 111 represents the values c(0.1,0.1,0.1), and 123 represents c(0.1,0.2,0.3), we only considered 111, 112, 113 ... 119, 122, 123, ... 129, 133 ... etc.

Note that we excluded 121, 131 and 132 because they are equivalent to 112, 113 and 123. This considerably reduces the number of combinations to test. To do this, we used nested for loops as follows:

step=.1
lst=list()
for(i in seq(step,1,step)){
  for(j in seq(i,1,step)){
    for(k in seq(j,1,step)){
          print(c(i,j,k));if(sum(c(i,j,k))==1){lst=append(lst,list(c(i,j,k)))}
    }
  }
}
do.call("rbind",lst)

This works but I want to make it more flexible. As it is now, to compare 4 numbers instead of 3 I'd need to write a new for loop. I was thinking of a function like all_comb(vector, N) that is equivalent to the nested loops above but I can't find it nor I know how to implement it elegantly.

Thank you!

Carles Borredá
  • 358
  • 2
  • 8

1 Answers1

1

Such a function might get messy if you have long vectors and high Ns, but it's absolutely feasible. I wrote a quick one and it might be possible to refactor it, but it works.

It creates a list of unique vectors of vector indices that may be added. Then, it uses this list to compute the sums and returns them as a vector with the computation added as names.

all_comb = function(vector,N){
  com = as.list(1:length(vector))
  names(com) = as.character(1:length(vector))
  for(i in 2:N){
    nl = list()
    for(e in com){
      for(vi in 1:length(vector)){
        combo = c(e,vi)
        combo = combo[order(combo)]
        nl[[paste(combo,collapse="+")]]=combo
      }
    }
    com = nl
  }
  
  for(i in 1:length(com)){
    com[[i]] = sum(vector[com[[i]]])
  }
  
  return(unlist(com))
  
}

To test it with your example, I get:

> result = all_comb(seq(0.1,1,.1),3)
> print(head(result,n=15))
 1+1+1  1+1+2  1+1+3  1+1+4  1+1+5  1+1+6  1+1+7  1+1+8  1+1+9 1+1+10  1+2+2  1+2+3  1+2+4  1+2+5  1+2+6 
   0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0    1.1    1.2    0.5    0.6    0.7    0.8    0.9 

> print(names(result)[result==1])
[1] "1+1+8" "1+2+7" "1+3+6" "1+4+5" "2+2+6" "2+3+5" "2+4+4" "3+3+4"

Which means that the first, first and eigth element equal 1, or the 1st, 2nd and 7th equal 1, and so on.

You can use strsplit() on the names to get the original indices back.

Martin Wettstein
  • 2,771
  • 2
  • 9
  • 15