R use of lapply & sapply to interrogate list of matrices containing permuted data

Question

I have 3 datasets, each of which I permuted 10 times. Each permutation of a dataset forms one column of a matrix, giving 3 matrices (m1, m2, m3), one per dataset, stored in a list L. I would like to interrogate all possible combinations of permutations (10 x 10 x 10 = 1000) for each entry (4 entries in this example). I have used expand.grid to generate all combinations of column indices across the 3 matrices; each row of the resulting data frame M holds one combination:

M <- expand.grid(1:10, 1:10, 1:10)

Here is my data in a list:

m1<-matrix(c(1,2,1,0,3,2,1,2,3,4),nrow=4, ncol=10)
m2<-matrix(c(m1[1,]),nrow=4,ncol=10)
m3<-matrix(c(m1[2,]),nrow=4,ncol=10)
L<-list(m1, m2, m3)

Can you help me use do.call, cbind, and lapply/sapply to efficiently take the column indices from each row of M, extract the corresponding columns from the 3 matrices in L, and bind them into a new matrix, along these lines (pseudocode):

m.res<-for (i in 1:nrow(M)) { "get" L[[1:3]][M[i,]] }

For i=1, m.res would yield:

     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    2
[3,]    1    3    4
[4,]    0    1    0

I clearly need a tutorial for lapply/sapply as this should not be this difficult.

I changed `test` to `M` in your code. If that was not what you intended, please add a definition for `test`. — nico, Nov 02 '13 at 07:57
+1 for example input data and desired output. Next time also show what you tried and *how* it didn't work and it will be a perfect question. — Simon O'Hanlon, Nov 02 '13 at 12:10
@SimonO101 good point! I don't have my .Rhistory on this laptop, but I remember I tried various schemes to pass the elements from M directly to L to subset the desired columns. Then I realized I needed to use apply — reviewer3, Nov 03 '13 at 00:37

score 2 · Accepted Answer · edited May 23 '17 at 10:32

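First of all, we should work out the correct way to retrieve the result for a single row of M. Take row 1, which is (1, 1, 1).

We want to loop over the three elements of L and, from each matrix, retrieve the column whose index is given in row 1 of M. Together these columns form the result matrix.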

col.ids <- unlist(M[1,])
# sapply will already return the columns in a matrix
# We use seq_along rather than looping directly on L, because we also need the
# id for col.ids
sapply(seq_along(L), function(id){
                        L[[id]][ ,col.ids[id] ]
                        })

     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    2
[3,]    1    3    4
[4,]    0    1    0

Now just put that in another apply statement and you're set!

This time we use apply and loop directly over the rows of M (thus eliminating the need for the col.ids variable).

# The second parameter is 1 for rows and 2 for columns
m.comb <- apply(M, 1, function(cols)
                      {
                      sapply(seq_along(L), function(id){
                                               L[[id]][ ,cols[id] ]
                                               })
                      })

Now, apply gives us a big 12 x 1000 matrix, which is very annoying in this case, so we should change it into a list... which I will leave as an exercise to the reader...
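
One possible way to do that exercise, as a minimal sketch in base R: each column of m.comb holds one 4 x 3 result flattened column by column, so reshaping every column with matrix() restores it. The name m.list is introduced here for illustration only and is not part of the original answer.

```r
# Data from the question
M  <- expand.grid(1:10, 1:10, 1:10)
m1 <- matrix(c(1, 2, 1, 0, 3, 2, 1, 2, 3, 4), nrow = 4, ncol = 10)
m2 <- matrix(m1[1, ], nrow = 4, ncol = 10)
m3 <- matrix(m1[2, ], nrow = 4, ncol = 10)
L  <- list(m1, m2, m3)

# 12 x 1000 matrix, as produced by the nested apply/sapply above
m.comb <- apply(M, 1, function(cols) {
  sapply(seq_along(L), function(id) L[[id]][, cols[id]])
})

# m.list (hypothetical name): one 4 x 3 matrix per row of M.
# matrix() fills column by column, which matches how apply flattened each result.
m.list <- lapply(seq_len(ncol(m.comb)), function(j) {
  matrix(m.comb[, j], nrow = nrow(L[[1]]))
})

m.list[[10]]
```

This keeps everything in base R. The reshaping relies on every matrix in L having the same number of rows.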

... or rather use the alply function of the plyr package, which works like apply but always returns a list (see the question "Force apply to return a list").

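In this case, however, we need to unlist cols, because alply passes each row of the data frame M to the function as a one-row data frame rather than as a plain vector.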

library(plyr)
m.comb.2 <- alply(M, 1, function(cols)
                       {
                       cols <- unlist(cols)
                       sapply(seq_along(L), function(id)
                              {
                              L[[id]][ ,cols[id] ]
                              })
                       })

And finally...

m.comb.2[[1]]

     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    2
[3,]    1    3    4
[4,]    0    1    0

m.comb.2[[10]]

     [,1] [,2] [,3]
[1,]    1    1    2
[2,]    2    3    2
[3,]    3    3    4
[4,]    4    1    0
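
Element 10 differs from element 1 only in its first column because expand.grid varies its first argument fastest, so row 10 of M selects column 10 of the first matrix and column 1 of the other two. A quick check, as a sketch using only base R:

```r
M <- expand.grid(1:10, 1:10, 1:10)

nrow(M)           # number of combinations
unlist(M[1, ])    # column indices behind m.comb.2[[1]]
unlist(M[10, ])   # column indices behind m.comb.2[[10]]
unlist(M[11, ])   # the second index only advances after ten rows
```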

That works beautifully, and thank you for explaining it so well!! I may use the plyr package in the end, but I will first play around with coercing m.comb into a list (first thing tomorrow :-) This was a very useful answer, I wish I could upvote it more than once! — reviewer3, Nov 03 '13 at 00:53
Thank you @nico. In the meantime, I've consulted this question [6819804](http://stackoverflow.com/questions/6819804/) to use `split()` for converting the matrix to a list. This way I end up with a list of 1000 vectors (each of size 4*3), which I could handle by proper subsetting, however... — reviewer3, Nov 04 '13 at 06:31
...I now realize I have a new problem: 24 datasets x 34000 datapoints x 1000 shuffles x 8 bytes = 6.6 Gb just for the shuffled data... lack of free memory. I will probably end up having to post this as a separate question. Ugh - now I have to worry about memory management, too. Anyway, not really part of this question. — reviewer3, Nov 04 '13 at 06:32

score 2 · Answer 2 · answered Nov 02 '13 at 11:56

[I would have added this as a comment to @nico's answer, but I wanted it to be cleaner and more extended than a comment can be. If @nico finds it useful to add to his detailed answer, my answer should be deleted.]

You can also use mapply, i.e. apply a retrieving function over multiple arguments in parallel (there are only 3 here), which you already have as the columns of M.

#`M` is your dataframe of arguments and `L` is your list of matrices
#save all results to a list (`myls`)
myls <- mapply(function(colmat1, colmat2, colmat3) {
                   cbind(L[[1]][, colmat1], L[[2]][, colmat2], L[[3]][, colmat3])
               },
               M[, 1], M[, 2], M[, 3], SIMPLIFY = FALSE)

myls[[1]]
#     [,1] [,2] [,3]
#[1,]    1    1    2
#[2,]    2    3    2
#[3,]    1    3    4
#[4,]    0    1    0
myls[[10]]
#     [,1] [,2] [,3]
#[1,]    1    1    2
#[2,]    2    3    2
#[3,]    3    3    4
#[4,]    4    1    0
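
If the number of matrices is not fixed at three, the same idea can be sketched without naming each argument. The helper pick_columns and the result name myls2 below are hypothetical names introduced for illustration. The sketch assumes M has one column of indices per matrix in L.

```r
# Data from the question
m1 <- matrix(c(1, 2, 1, 0, 3, 2, 1, 2, 3, 4), nrow = 4, ncol = 10)
m2 <- matrix(m1[1, ], nrow = 4, ncol = 10)
m3 <- matrix(m1[2, ], nrow = 4, ncol = 10)
L  <- list(m1, m2, m3)

# Build the index grid from L itself, so it follows the number of
# matrices and the number of permutations (columns) in each one
M <- do.call(expand.grid, lapply(L, function(m) seq_len(ncol(m))))

# pick_columns (hypothetical helper): take column cols[k] from mats[[k]]
# and bind the selected columns side by side
pick_columns <- function(mats, cols) {
  do.call(cbind, Map(function(m, j) m[, j], mats, cols))
}

# One result matrix per row of M
myls2 <- lapply(seq_len(nrow(M)), function(i) {
  pick_columns(L, unlist(M[i, ], use.names = FALSE))
})

myls2[[10]]
```

Note that the number of combinations grows as the product of the column counts, so with many matrices or many permutations it may be preferable to compute each combination on demand with pick_columns rather than storing them all.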

Good answer! I always tend to forget `mapply` for some reason :) — nico, Nov 02 '13 at 15:03
@alexis_laz Thank you, for this solution, but I need something that I can scale effortlessly to the number of permutations. I didn't include this info in the question, but I want to shuffle the data 100 or 1000 times over. Still this is very helpful, as I haven't used mapply before! Thank you! — reviewer3, Nov 03 '13 at 00:40
