I have a large matrix. For each cell, I want to calculate the average of the values that fall in that cell's row and column.

The matrix contains NA values, which I'm not interested in, so I skip them.

How could I speed this up and do it better?

Thanks

mtx <- matrix(1:25, ncol = 5)
mtx[2, 3] <- NA

mean.pos <- mtx
for (i in seq_len(nrow(mtx))) {
  for (j in seq_len(ncol(mtx))) {
    # NA cells are skipped and stay NA in the result
    if (!is.na(mtx[i, j])) {
      # Row i without column j, so the cell itself is not counted twice
      row.values <- mtx[i, -j]
      col.values <- mtx[, j]
      # na.rm = TRUE drops any remaining NA values in the row or column
      mean.pos[i, j] <- mean(c(row.values, col.values), na.rm = TRUE)
    }
  }
}
nourza
HeyHoLetsGo
  • Your code says that you want to remove values on the row with the same value as your particular cell, but (1) you don't explain that, is that right? (2) ***Caution***, equality of floating point is relative (and not always what you think it should be), see https://stackoverflow.com/q/9508518/3358272. After reading that, please clarify your question. (I suspect that either `rowMeans`/`colMeans` or a couple of less-trivial calls to `apply` will do what you need.) – r2evans May 30 '20 at 17:11
  • Yes, sorry. I want to take the average over the cell's column and row together. The way I'm doing it, I remove the cell's value from the row vector; otherwise I would be counting that cell twice when I calculate the mean. – HeyHoLetsGo May 30 '20 at 17:22
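
As a small illustration of the floating-point caution raised in the comments, here is a minimal sketch. The vector `x` and the index `j` are hypothetical and not part of the question. It shows that `==` on doubles can fail where a tolerance-based comparison succeeds, and that dropping an element by position avoids comparing values at all.

```r
# Hypothetical row of doubles, used only for illustration
x <- c(0.1 + 0.2, 0.3, 0.5)

# Exact equality fails because 0.1 + 0.2 is not stored as exactly 0.3
exact_match <- x[1] == 0.3

# all.equal() compares within a small numeric tolerance
close_match <- isTRUE(all.equal(x[1], 0.3))

# Dropping by position needs no value comparison
j <- 1
without_j <- x[-j]
```

Removing the cell by its column index, as in `mtx[i, -j]`, follows the same idea.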

1 Answer

This does it without explicitly looping through the elements.

num <- outer(rowSums(mtx, na.rm = TRUE), colSums(mtx, na.rm = TRUE), "+") - mtx
not_na <- !is.na(mtx)
den <- outer(rowSums(not_na), colSums(not_na), "+") - 1
result <- num/den

# check
identical(result, mean.pos)
## [1] TRUE
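
To see why this works, it may help to check a single cell by hand. The sketch below uses cell (2, 1), chosen arbitrarily because its row contains the NA. The numerator is the row sum plus the column sum minus the cell (which would otherwise be counted twice), and the denominator is the number of non-NA values in the row plus those in the column, minus 1 for the same reason.

```r
mtx <- matrix(1:25, ncol = 5)
mtx[2, 3] <- NA

i <- 2; j <- 1   # cell picked only for illustration

# Direct approach: pool row i (without column j) and column j
pooled <- c(mtx[i, -j], mtx[, j])
direct <- mean(pooled, na.rm = TRUE)

# Sum/count approach for the same cell
not_na <- !is.na(mtx)
num_ij <- sum(mtx[i, ], na.rm = TRUE) + sum(mtx[, j], na.rm = TRUE) - mtx[i, j]
den_ij <- sum(not_na[i, ]) + sum(not_na[, j]) - 1
via_sums <- num_ij / den_ij

all.equal(direct, via_sums)
## [1] TRUE
```

Here both approaches give 61 / 8 = 7.625. `outer()` simply builds these row-plus-column sums for every (i, j) pair at once.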

If there were no NAs then it could be simplified to:

(outer(rowSums(mtx), colSums(mtx), "+") - mtx) / (sum(dim(mtx)) - 1)
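
As a quick sanity check, the following sketch compares the simplified expression with the general NA-aware version on a matrix without NAs. The name `full` is hypothetical and used only here. With no NAs, every cell has `nrow + ncol - 1` contributing values, so the denominator collapses to a constant.

```r
full <- matrix(1:25, ncol = 5)   # hypothetical matrix with no NA values

# Simplified version: constant denominator
simplified <- (outer(rowSums(full), colSums(full), "+") - full) /
  (sum(dim(full)) - 1)

# General NA-aware version applied to the same matrix
num <- outer(rowSums(full, na.rm = TRUE), colSums(full, na.rm = TRUE), "+") - full
not_na <- !is.na(full)
den <- outer(rowSums(not_na), colSums(not_na), "+") - 1
general <- num / den

all.equal(simplified, general)
## [1] TRUE
```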
G. Grothendieck