
This small problem is the bottleneck of a larger program, and it has to be repeated at least thousands of times, so the main concern here is speed.

I have a vector of numbers, for example:

v <- c(1,3,5)

I want all the combinations (non-empty subsets) I can make from that vector, encoded as the columns of a matrix of 0's and 1's, for example:

 col1 col2 col3 col4 col5 col6 col7
1  1   0    0    1    1    0    1
3  0   1    0    1    0    1    1
5  0   0    1    0    1    1    1

Currently I'm using the function combn (I think it is the fastest clean way to do this, right?)

list_submatrix <- lapply(seq_along(v), function(i) {
    # all combinations of i row indices, one combination per column
    combn(x = seq_along(v), m = i)
})

# the full code follows after a brief explanation

This gives three matrices like:

1  2  3

1  1  2
2  3  3

1
2
3

To get the matrix of 1's and 0's, I fill it in with a double for loop. (This is probably where I could gain some speed.)

list_matrix <- lapply(seq_along(v), function(i) {
    submatrix <- combn(x = seq_along(v), m = i)
    # one row per element of v, one column per combination of size i
    imatrix <- matrix(data = 0, nrow = length(v), ncol = ncol(submatrix))

    for (k in seq_len(ncol(submatrix)))
        for (j in seq_len(nrow(submatrix)))
            imatrix[submatrix[j, k], k] <- 1

    return(imatrix)
})
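One possible way to avoid the double for loop is to set all the cells in a single assignment by indexing the matrix with a two-column (row, column) index matrix. The following is only a sketch under the assumptions of this example; `fill_indicator`, `out` and `idx` are hypothetical names introduced for illustration, and whether it is actually faster would need to be benchmarked on the real data.

```r
v <- c(1, 3, 5)
n <- length(v)

# fill_indicator is a hypothetical helper name, not part of the original code
fill_indicator <- function(i, n) {
    submatrix <- combn(n, i)   # i x choose(n, i) matrix of row indices
    out <- matrix(0, nrow = n, ncol = ncol(submatrix))
    # pair every row index with the column it sits in, then set all cells at once
    idx <- cbind(as.vector(submatrix), as.vector(col(submatrix)))
    out[idx] <- 1
    out
}

list_matrix <- lapply(seq_len(n), fill_indicator, n = n)
final_matrix <- do.call(cbind, list_matrix)
rownames(final_matrix) <- v
print(final_matrix)
```

Indexing a matrix with a two-column matrix treats each row of the index as one (row, column) pair, so the assignment touches exactly the cells the nested loops would have visited.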

What I've shown is the slowest part of the code. For this example it takes approximately 0.012 s. The next step is simple.

What I have at this point is three matrices:

  col1 col2 col3
1   1   0    0
3   0   1    0
5   0   0    1

  col1 col2 col3
1   1   1    0
3   1   0    1
5   0   1    1

  col1 
1   1   
3   1  
5   1   

Now the process is quite simple and fast.

final_matrix <- list_matrix[[1]]

for (i in seq(2, length(list_matrix)))
    final_matrix <- cbind(final_matrix, list_matrix[[i]])

This binds the columns of the three matrices together, which takes 0.0033 s and gives:

 col1 col2 col3 col4 col5 col6 col7
1  1   0    0    1    1    0    1
3  0   1    0    1    0    1    1
5  0   0    1    0    1    1    1

I need to speed this process up. I think that the double for loop or the lapply is what slows it down. I'd appreciate any help.

Thank you.

1 Answer


You could make use of tabulate to simplify your code:

L <- sapply(1:length(v), function(i) combn(length(v),i,FUN=tabulate,nbins=length(v)))
do.call(cbind,L)
#     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
#[1,]    1    0    0    1    1    0    1
#[2,]    0    1    0    1    0    1    1
#[3,]    0    0    1    0    1    1    1
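To see why this works, it may help to look at what tabulate does to a single combination. The snippet below is a small illustration only; `one_col` and `pairs` are names made up for this example.

```r
# tabulate() counts how often each integer 1..nbins occurs in its input,
# so a combination of distinct indices turns into a 0/1 indicator vector
one_col <- tabulate(c(1, 3), nbins = 3)
print(one_col)

# combn() applies FUN to every combination of size 2 and binds the
# resulting indicator vectors together as the columns of a matrix
pairs <- combn(3, 2, FUN = tabulate, nbins = 3)
print(pairs)
```

Because the indices within one combination are distinct, every count is either 0 or 1, which is exactly the encoding the question asks for.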

Note that combn itself is slow, so you might want to explore faster alternatives; see e.g. the question "Faster version of combn".
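As one possible combn-free alternative (a different technique from the one above, not something taken from the linked question), each non-empty subset of n items can be read off the binary digits of an integer between 1 and 2^n - 1. The sketch below assumes n is small enough for 2^n - 1 columns to fit in memory; `codes`, `bits` and `result` are hypothetical names, and whether this beats combn should be checked by benchmarking.

```r
v <- c(1, 3, 5)
n <- length(v)

# every non-empty subset corresponds to one integer code in 1..(2^n - 1);
# binary digit j of the code says whether item j belongs to the subset
codes <- seq_len(2^n - 1)
bits <- outer(seq_len(n), codes, function(j, code) (code %/% 2^(j - 1)) %% 2)

# reorder the columns by subset size so that the layout groups
# singletons first, then pairs, and so on (order() keeps ties in place)
result <- bits[, order(colSums(bits)), drop = FALSE]
rownames(result) <- v
print(result)
```

For this three-element example the columns come out in the same order as in the question. For larger n the columns are still grouped by subset size, but the order within a group may differ from the one combn produces.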

Marat Talipov