
In R, I would like to set to zero all entries of a matrix that lie more than a few entries above (or below) the matrix diagonal. The example below is for an N x N matrix with N = 5, where we delete k = 3 lines of entries parallel to the matrix diagonal on each side:

a1 b1 c1 d1 e1 -->      a1  b1 00 00 00  
b2 a2 b2 c2 d2 -->      b2  a2 b2 00 00  
c3 b3 a3 b3 c3 -->      00  b3 a3 b3 00  
d4 c4 b4 a4 b4 -->      00  00 b4 a4 b4  
e5 d5 c5 b5 a5 -->      00  00 00 b5 a5  

(00 means the same as 0)

For k = 2 we have:

a1 b1 c1 d1 e1 -->      a1 b1 c1 00 00  
b2 a2 b2 c2 d2 -->      b2 a2 b2 c2 00  
c3 b3 a3 b3 c3 -->      c3 b3 a3 b3 c3  
d4 c4 b4 a4 b4 -->      00 c4 b4 a4 b4  
e5 d5 c5 b5 a5 -->      00 00 c5 b5 a5  

I wrote a simple function based on two consecutive for() loops, but it is too slow because I deal with a large number of small matrices. All matrices are N x N with N in the range 400:450, and k is always in the range 350:370. All matrix entries lie between -1 and 1 (they are correlation matrices), and the amount of data is a few GB, so I need a vectorized version of the function. Is it faster to set entries to zero or to copy the chosen entries to a new matrix?

Qbik
  • Your answer is here: http://stackoverflow.com/a/13049778/602276 – Andrie Aug 05 '14 at 07:20
  • possible duplicate of [R - min, max and mean of off-diagonal elements in a matrix](http://stackoverflow.com/questions/13049575/r-min-max-and-mean-of-off-diagonal-elements-in-a-matrix) – Thomas Aug 05 '14 at 07:28
  • @Thomas It's not an exact duplicate, so I didn't flag it. The answer is very helpful though, and I used it as the inspiration to post an answer to this question. – Andrie Aug 05 '14 at 07:29
  • @Thomas I edited my question by adding the case where N=5 and k=3. Please look at the role of parameter k and, if you could, please remove your vote to close this question. – Qbik Aug 05 '14 at 07:43
  • @Andrie Yes, I thought maybe you weren't going to answer; when you did, I gave you a +1. Close vote retracted. – Thomas Aug 05 '14 at 07:52

2 Answers


Building on the answer at https://stackoverflow.com/a/13049778/602276

Try this:

mat <- matrix(rnorm(25), nrow=5)

offDiagonal <- function(x, offset=1, diagonal=TRUE){
  off <- (row(x) == (col(x) - offset) | row(x) == (col(x) + offset))  
  if(diagonal) diag(off) <- TRUE
  off
}


mat[!offDiagonal(mat)] <- 0
mat

          [,1]       [,2]       [,3]       [,4]        [,5]
[1,] -2.788105 -0.8399604  0.0000000  0.0000000  0.00000000
[2,]  1.194179 -0.6940815  0.3340976  0.0000000  0.00000000
[3,]  0.000000  0.2256085  0.8885540 -0.3661173  0.00000000
[4,]  0.000000  0.0000000 -1.8024987 -0.1903742 -0.65395419
[5,]  0.000000  0.0000000  0.0000000 -0.4074090  0.08818081

Use the offset argument to specify a wider off-diagonal band:

mat <- matrix(rnorm(25), nrow=5)
mat[!offDiagonal(mat, offset=2)] <- 0
mat

          [,1]       [,2]       [,3]      [,4]       [,5]
[1,] 0.4433126  0.0000000  1.0448629 0.0000000  0.0000000
[2,] 0.0000000 -0.2439675  0.0000000 0.1703401  0.0000000
[3,] 0.2859253  0.0000000 -0.7731495 0.0000000 -0.1722833
[4,] 0.0000000  1.1120145  0.0000000 1.0412452  0.0000000
[5,] 0.0000000  0.0000000  1.0860853 0.0000000 -0.3957575
Andrie
  • But what about the parameter k from my question? In your implementation k is always N-2. – Qbik Aug 05 '14 at 07:37
  • I used the argument `offset` to do this. Try setting `offset=2`, `offset=3`, etc. – Andrie Aug 05 '14 at 07:39
  • I've edited my question by adding the case with N=5 and k=2. – Qbik Aug 05 '14 at 07:41
  • @Qbik As I said, use the `offset` argument. I added another example, but the function remains the same. – Andrie Aug 05 '14 at 07:54
  • @Andrie Please look at Frank's solution: its output is correct, but it still contains a for() loop. – Qbik Aug 05 '14 at 08:38
  • @Qbik Please describe what you perceive to be incorrect about my solution. To me, it looks the same, so please tell me what you need extra, or what I'm missing. – Andrie Aug 05 '14 at 08:44
  • The line `mat[!offDiagonal(mat, offset=2)] <- 0` gives an incorrect result. – Qbik Aug 05 '14 at 14:00
  • If you tell me what is wrong with the solution, I can try and fix it. Right now, I don't understand what you want me to fix. – Andrie Aug 05 '14 at 16:05
  • @Andrie In your second example there are two diagonal "lines" of 0s that immediately flank the diagonal. One of those "lines" is made of these points: `mat[2,1]`, `mat[3,2]`, `mat[4,3]` and `mat[5,4]` which are all 0, but should not be. – Jota Aug 05 '14 at 16:44
  • Perhaps, "off" needs to be `Reduce("|", lapply(seq_len(offset), function(n) (row(x) == (col(x) - n) | row(x) == (col(x) + n))))` or a similar approach? – alexis_laz Aug 05 '14 at 17:53
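
Following the last two comments, one possible loop-free way to keep the whole band (the diagonal plus every line up to `offset` away from it) is to compare |row - col| with the offset directly. This is only a minimal sketch: `bandMask` is a hypothetical name, not part of the answer's code, and it assumes the intended result is the filled band shown in the question.

```r
# Hypothetical helper (not from the answer above): TRUE for entries inside
# the band |row - col| <= offset, FALSE everywhere else.
bandMask <- function(x, offset = 1) {
  abs(row(x) - col(x)) <= offset
}

set.seed(1)
mat <- matrix(rnorm(25), nrow = 5)

# Zero everything outside the band; the diagonal and the two lines on
# each side of it are kept.
banded <- mat
banded[!bandMask(banded, offset = 2)] <- 0
banded
```

With `offset = 2` this keeps `mat[2,1]`, `mat[3,2]`, `mat[4,3]` and `mat[5,4]`, the entries the comment above points out as wrongly zeroed.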
set.seed(42)
d <- matrix(rnorm(25), nrow=5)


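zero.them <- function(x, n) {
  x2 <- x                      # untouched copy of the input
  diag(x) <- 0                 # mark the main diagonal in x
  for (i in 1:n) {
    diag(x[, (-1:-i)]) <- 0    # mark the i-th diagonal above the main one
    diag(x[(-1:-i), ]) <- 0    # mark the i-th diagonal below the main one
  }
  x2[which(x == x2)] <- 0      # zero every entry that was not marked
  return(x2)
}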

zero.them(d, 1) # for 1 above and below the diagonal

           [,1]        [,2]       [,3]      [,4]     [,5]
[1,]  1.3709584 -0.10612452  0.0000000  0.000000 0.000000
[2,] -0.5646982  1.51152200  2.2866454  0.000000 0.000000
[3,]  0.0000000 -0.09465904 -1.3888607 -2.656455 0.000000
[4,]  0.0000000  0.00000000 -0.2787888 -2.440467 1.214675
[5,]  0.0000000  0.00000000  0.0000000  1.320113 1.895193

zero.them(d, 2) # for 2 above and below the diagonal

           [,1]        [,2]       [,3]       [,4]       [,5]
[1,]  1.3709584 -0.10612452  1.3048697  0.0000000  0.0000000
[2,] -0.5646982  1.51152200  2.2866454 -0.2842529  0.0000000
[3,]  0.3631284 -0.09465904 -1.3888607 -2.6564554 -0.1719174
[4,]  0.0000000  2.01842371 -0.2787888 -2.4404669  1.2146747
[5,]  0.0000000  0.00000000 -0.1333213  1.3201133  1.8951935
Jota
  • The behaviour of your function is correct, but is it possible to replace the for() loop with some other solution? – Qbik Aug 05 '14 at 08:37
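
One way the loop might be avoided, sketched under the assumption that k counts the outermost lines removed on each side, as in the question's examples (so the kept entries satisfy |row - col| <= N - 1 - k). `zeroOuterDiagonals` is a hypothetical name, not code from either answer.

```r
# Hypothetical helper: zero the k outermost diagonals on each side of a
# square matrix, i.e. keep only entries with |row - col| <= N - 1 - k.
zeroOuterDiagonals <- function(x, k) {
  stopifnot(is.matrix(x), nrow(x) == ncol(x), k >= 0, k <= nrow(x) - 1)
  keep <- nrow(x) - 1 - k
  x[abs(row(x) - col(x)) > keep] <- 0
  x
}

# The 5 x 5 layout from the question, encoded as |row - col| + 1
# (1 stands for a, 2 for b, ..., 5 for e).
m <- abs(outer(1:5, 1:5, "-")) + 1

zeroOuterDiagonals(m, k = 3)  # keeps the a and b lines
zeroOuterDiagonals(m, k = 2)  # keeps the a, b and c lines
```

Because the mask depends only on N and k, it could be computed once per (N, k) pair and reused across matrices of the same size. Whether zeroing in place or copying the kept entries is faster is best settled by benchmarking on the real data.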