Is there a way to reorder the rows and columns of matrix to create a dense corner, in R?

Question

I have a large matrix which comprises 1,2 and missing (coded as NA) values. The matrix has 500000 rows by 10000 columns. There are approximately 0.05% 1- or 2-values, and the remaining values are NA.

I would like to reorder the rows and columns of the matrix so that the top left corner of the matrix comprises a relatively high number of 1s and 2s compared to the rest of the matrix. In other words, I would like to create a relatively datarich subset of the matrix, by reordering the matrix rows and columns.

Is there an efficient method of achieving this in R, perhaps using a library? I would also be interested in solutions in Python or Java, but I would prefer to perform this in R as it is the language that's most familiar to me.

I thought that there maybe a set of optimisation procedures that I could use as my working matrix is too large to do the reorganisation by eye.

I have reverted a set of edits so that the question remains consistent with the current answers.

With your clarification (and a good night's sleep), I'm thinking a test for "density", or ratio of `is.numeric/is.na` for each row and column could be used to generate the reordering. — Carl Witthoft, May 15 '14 at 11:57
So what would you consider a desirable reordering? Roland's answer does about the best possible unless you want to "weight" values in the very top left more than those a bit farther out. — Carl Witthoft, May 15 '14 at 12:15

score 3 · Answer 1 · edited May 15 '14 at 11:55

3

Like this?

#some sparse data
set.seed(42)
p <- 0.0005
mat <- matrix(sample(c(1, 2, NA), 1e4, TRUE, c(p/2, p/2, 1-p)), ncol=50)

#order columns and rows by the number of NA values in them   
mat <- mat[order(rowSums(is.na(mat))), order(colSums(is.na(mat)))]

#only show columns and rows containing non-NA values
mat[rowSums(!is.na(mat)) > 0, colSums(!is.na(mat)) > 0]
#       [,1] [,2] [,3] [,4] [,5] [,6]
# [1,]   NA   NA   NA   NA    2   NA
# [2,]   NA   NA   NA   NA   NA    2
# [3,]   NA   NA    2   NA   NA   NA
# [4,]   NA    1   NA   NA   NA   NA
# [5,]    1   NA   NA   NA   NA   NA
# [6,]   NA   NA   NA    2   NA   NA

edited May 15 '14 at 11:55

Carl Witthoft

20,573
9
43
73

answered May 14 '14 at 14:25

Roland

127,288
10
191
288

1

Probably the two of us should combine our answers into one matrix-crusher :-) – Carl Witthoft May 14 '14 at 15:39
Thanks for this answer. How about for situations where ordering my rows and columns doesn't provide a dense corner? I've edited the question in response to this comment. – joel38237 May 15 '14 at 10:12
Why are you asking me this? The task was to create a dense corner by reordering. If that is not possible, you have to say what you want to do. – Roland May 15 '14 at 10:47
@CarlWitthoft I don't understand your solution. (Or maybe I should say, I don't think it works for the general case.) – Roland May 15 '14 at 11:43
I agree -- I think, sans other weighting functions, your solution does the best reordering possible if the rows and columns are not allowed to be "jumbled" as in my edited approach. – Carl Witthoft May 15 '14 at 12:12
My earlier comment was missing a key point. I'm interested in situations where ordering rows and columns by the number of non-NA values does not provide a dense corner. I've edited the question in response to these comments. – joel38237 May 15 '14 at 12:48
Also, in response to the comment by Carl Witthoft, I should add that rows and columns cannot be "jumbled". – joel38237 May 15 '14 at 13:03
I don't understand your criterion for a "dense corner". Your new example matrix is not very sparse and to expect non-NA values in the 3x3 top corner seems to be arbitrary. If you can show an algorithm that gives your desired result I can show you how to implement it in R. Otherwise I can't help you further. – Roland May 15 '14 at 13:07

Carl Witthoft · Answer 2 · 2014-05-15T11:56:19.747

Something like this?

Rgames> bar
     [,1] [,2] [,3] [,4] [,5]
[1,]   NA   NA   NA   NA   NA
[2,]    1   NA   NA   NA    3
[3,]   NA   NA   NA   NA   NA
[4,]    2   NA   NA   NA    4
[5,]   NA   NA   NA   NA   NA

Rgames> rab<-bar[order(bar[,1]),]
Rgames> rab
     [,1] [,2] [,3] [,4] [,5]
[1,]    1   NA   NA   NA    3
[2,]    2   NA   NA   NA    4
[3,]   NA   NA   NA   NA   NA
[4,]   NA   NA   NA   NA   NA
[5,]   NA   NA   NA   NA   NA
Rgames> rab[,order(rab[1,])]
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3   NA   NA   NA
[2,]    2    4   NA   NA   NA
[3,]   NA   NA   NA   NA   NA
[4,]   NA   NA   NA   NA   NA
[5,]   NA   NA   NA   NA   NA

EDIT, as Roland pointed out, in a more general situation that won't come close. Now, if one is allowed to 'jumble' the rows and columns, this would do it:

for(j in 1:nrow(mfoo)) mat[j,]<-mat[j,order(mat[j,])]

for(j in 1:ncol(mat)) mat[,j]<-mat[order(mat[,j]),j]

I suspect that's not what is desired, so I'll go off and think some more about ordering "criteria"

Thanks for this answer. How about for situations where ordering my rows and columns doesn't provide a dense corner? I've edited the question in response to this comment. — joel38237, May 15 '14 at 10:20

Is there a way to reorder the rows and columns of matrix to create a dense corner, in R?

2 Answers2

Linked