Subset only rows containing complete cases from list of matrices

Question

I have a list of matrices, all of equal dimensions. Each matrix within the list represents a different specimen; each matrix contains three columns for X, Y and Z coordinates, and each row represents a different point in 3D space (i.e., an identifiable landmark).

Most specimens are missing coordinate data for particular landmarks (so that all three columns contain NAs). I would like to subset all matrices in the list so that they only include landmarks/rows containing complete data (i.e., no NAs exist in that row for any of the specimens/matrices in the entire list).

I fear this may be quite a complicated task for data stored in list format. As all the matrices have the same dimensions, would it be easier to convert the data to an array? I wanted to avoid doing this as it would (I believe) strip the row, column and list-element names I use to identify the data.

Welcome to SO. Please read [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to know how to ask a good question. — agstudy, Dec 16 '13 at 12:18

score 2 · Accepted Answer · answered Dec 16 '13 at 12:15


For example, using complete.cases:

res <- lapply(your_list, function(mat)
                   mat[complete.cases(mat), , drop = FALSE])  # keep only rows with no NA; drop = FALSE keeps a matrix even if one row remains

And if your matrices have the same number of columns, you can put the result in one big matrix using something like:

do.call(rbind,res)
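
If the same landmark has to be dropped from every matrix whenever it is missing in any one of them (as the question asks), one possible approach is to combine the per-matrix `complete.cases` results into a single logical mask. The following is only a minimal sketch: the example data, the `lm1`..`lm4` row names and the names `keep` and `res_common` are hypothetical and not part of the original post.

```r
# Hypothetical example data: three specimens, four landmarks, X/Y/Z columns.
# matrix() fills column by column, so each NA below ends up as a whole row of NAs.
lm_names <- list(paste0("lm", 1:4), c("X", "Y", "Z"))
your_list <- list(
  spec1 = matrix(c(1, 2, NA, 4,  1, 2, NA, 4,  1, 2, NA, 4),
                 ncol = 3, dimnames = lm_names),   # landmark 3 missing
  spec2 = matrix(c(NA, 2, 3, 4,  NA, 2, 3, 4,  NA, 2, 3, 4),
                 ncol = 3, dimnames = lm_names),   # landmark 1 missing
  spec3 = matrix(as.numeric(1:12),
                 ncol = 3, dimnames = lm_names)    # nothing missing
)

# One logical vector per matrix (TRUE = row has no NA), combined with a
# row-wise AND so a row is kept only if it is complete in every matrix.
keep <- Reduce(`&`, lapply(your_list, complete.cases))

# Apply the same mask to every matrix; drop = FALSE keeps the matrix shape
# (and the row/column names) even if only one row survives.
res_common <- lapply(your_list, function(mat) mat[keep, , drop = FALSE])
```

Because the same mask is applied everywhere, all matrices in `res_common` keep identical dimensions, and the list-element, row and column names are retained.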


agstudy


Thanks; this is nearly what I wanted. While this removes every row with NAs, matrices may now have different lengths. If one matrix contains a row of missing data, the equivalent row needs to be removed from all the other matrices, too. Is this possible? – Roger Dec 16 '13 at 13:10
I asked this as a separate question because I figured it was a different problem, and would be of more general interest. http://stackoverflow.com/questions/20612625/retain-only-rows-present-in-all-matrices – Roger Dec 16 '13 at 14:06

score 0 · Answer 2 · answered Dec 16 '13 at 14:09

The best thing to do is first use

do.call(rbind,res)

then, with a single matrix containing all the list's sub-matrices, add one more column (the indicator used below) and one more column to label the rows of each sub-matrix. So if your sub-matrices have 3 rows each, this column will look like 1,2,3,1,2,3,...,1,2,3, e.g.

    singleMatrix <- do.call(rbind, res)
    rowindex <- rep(1:numberOfRowsOfSubMatrix, numberOfSubMatrices)

Then form a combined data frame with the indicator, singleMatrix and rowindex:

    Matrix <- data.frame(singleMatrix, indicator, rowindex)

Now, if indicator==0, delete that row and delete all rows with the same rowindex number.
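
The steps above might be sketched as follows. This is only an illustration under two assumptions that the answer does not state: that `indicator` is 1 for a complete row and 0 otherwise, and that the original, unfiltered list (here a hypothetical `your_list`) is bound together, so every sub-matrix still has the same number of rows. `bad` and `Matrix_clean` are made-up names.

```r
# Hypothetical data: two specimens, three landmarks each (columns filled first)
your_list <- list(
  spec1 = matrix(c(1, 2, NA,  1, 2, NA,  1, 2, NA), ncol = 3),  # row 3 missing
  spec2 = matrix(c(NA, 2, 3,  NA, 2, 3,  NA, 2, 3), ncol = 3)   # row 1 missing
)

# Stack the unfiltered matrices so the row positions still line up
singleMatrix <- do.call(rbind, your_list)
numberOfSubMatrices <- length(your_list)
numberOfRowsOfSubMatrix <- nrow(your_list[[1]])

# 1,2,3,1,2,3: position of each row within its own sub-matrix
rowindex <- rep(seq_len(numberOfRowsOfSubMatrix), times = numberOfSubMatrices)

# Assumption: indicator is 1 when the row has no NA, 0 otherwise
indicator <- as.integer(complete.cases(singleMatrix))

Matrix <- data.frame(singleMatrix, indicator, rowindex)

# Row positions that are incomplete in at least one sub-matrix
bad <- unique(Matrix$rowindex[Matrix$indicator == 0])

# Drop those positions from every sub-matrix
Matrix_clean <- Matrix[!Matrix$rowindex %in% bad, ]
```

Note that this only works if the matrices are stacked before any rows are removed; once rows have been dropped per matrix, the positions in `rowindex` no longer correspond across specimens.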

Thanks, but the problem with this approach is that each matrix can have a different number of rows depending on the incidence of NAs (which varies from matrix to matrix). Furthermore, where does 'indicator' come from? — Roger, Dec 16 '13 at 14:48
