I would like to quickly find the top k largest values in each row of a matrix and set every value that is not among those top k to zero. I currently have the following solution. Can somebody improve it? When the matrix has many rows, it is not very fast.

Thanks.

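mat <- matrix(c(5, 1, 6, 4, 9, 1, 8, 9, 10), nrow = 3, byrow = TRUE)
topK <- 2
# Sort each row in decreasing order and keep its topK largest values
sortedMat <- t(apply(mat, 1, function(x) sort(x, decreasing = TRUE, method = "quick")))
sortedMat <- sortedMat[, seq_len(topK), drop = FALSE]
# Mark the entries of each row that are among that row's topK values
lmat <- mat
for (i in seq_len(nrow(mat))) {
  lmat[i, ] <- mat[i, ] %in% sortedMat[i, ]
}
# Multiplying by the 0/1 mask zeroes everything outside the top k
kMat <- mat * lmat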

> mat
     [,1] [,2] [,3]
[1,]    5    1    6
[2,]    4    9    1
[3,]    8    9   10

> kMat
     [,1] [,2] [,3]
[1,]    5    0    6
[2,]    4    9    0
[3,]    0    9   10
eddi

2 Answers

The Rfast package has several functions that may help: sort_mat sorts the columns of a matrix, colOrder gives the ordering of each column, colRanks gives the ranks within each column, and colnth gives the nth value of each column. I believe at least one of them will suit you.
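
The "nth value" idea can also be sketched in base R, without Rfast: find the k-th largest value of each row with a partial sort and zero everything below it. This is only an illustration under assumptions. The function name top_k_by_threshold is made up for this example, the matrix is assumed to contain no NA values, and values tied with the k-th largest are all kept.

```r
# Hypothetical helper (not part of Rfast): keep the k largest values per row.
top_k_by_threshold <- function(mat, k) {
  n <- ncol(mat)
  # The k-th largest value is the (n - k + 1)-th smallest; a partial sort
  # places that element correctly without fully sorting the row.
  idx <- n - k + 1
  thresh <- apply(mat, 1, function(x) sort(x, partial = idx)[idx])
  # thresh has one entry per row and is recycled down the columns,
  # so every element of row i is compared with thresh[i].
  mat[mat < thresh] <- 0
  mat
}

mat <- matrix(c(5, 1, 6, 4, 9, 1, 8, 9, 10), nrow = 3, byrow = TRUE)
top_k_by_threshold(mat, 2)
```

On the example matrix from the question this should give the same result as kMat. Whether it is actually faster on a matrix with many rows would need to be benchmarked.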

Mike

You could use rank to speed this up. In case there are ties, you would have to decide on a method to break these (e.g. ties.method = "random").

kmat <- function(mat, k){
  mat[t(apply(mat, 1, rank)) <= (ncol(mat)-k)] <- 0
  mat
}
kmat(mat, 2)
##      [,1] [,2] [,3]
## [1,]    5    0    6
## [2,]    4    9    0
## [3,]    0    9   10
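
To illustrate the point about ties, here is a small sketch of the same approach with the tie-breaking rule exposed as an argument. The name kmat_ties and the choice of "first" as the default are assumptions made for this example only. With the default ties.method = "average", tied values share a rank, so more than k values per row can survive.

```r
# Hypothetical variant of kmat() that passes ties.method on to rank().
kmat_ties <- function(mat, k, ties.method = "first") {
  # rank() is applied to each row; apply() returns the ranks column-wise,
  # so the result is transposed back to the shape of mat.
  ranks <- t(apply(mat, 1, rank, ties.method = ties.method))
  mat[ranks <= ncol(mat) - k] <- 0
  mat
}

tied <- matrix(c(7, 7, 7, 2), nrow = 1)
kmat_ties(tied, 2)                           # exactly 2 non-zero values remain
kmat_ties(tied, 2, ties.method = "average")  # all three 7s remain
```

With "first", the earliest of the tied values gets the lowest rank and is the one dropped; "random" would instead pick among the tied values at random.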
shadow