3

My problem is as follows: Suppose we have a quadratic n*n matrix, e.g.

m <- matrix(runif(n^2), n,n)

Now I want to define a function f=function(k) that returns the sum of all matrix entries for which the sum of their row and column number weakly exceeds k. For example, consider the 3*3 matrix

m.ex <- matrix(1:9, 3,3, byrow = T)

which looks like

1 2 3
4 5 6
7 8 9

Then f(2) should give 45 = 1+2+3+4+5+6+7+8+9 (as for every entry in the matrix, the sum of the row and column position weakly exceeds 2), f(4) = 38 = 3+5+6+7+8+9 (as the sum of the row and column position weakly exceeds 4 for positions (1,3), (2,2), (2,3), (3,1), (3,2), and (3,3)), and f(5) = 23 = 6 + 8 + 9 (as the sum of the row and columin position weakly exceeds 5 for positions (2,3), (3,2), and (3,3)). Etc.

MichaelChirico
  • 33,841
  • 14
  • 113
  • 198
Martin
  • 195
  • 7
  • What have you already tried? And - excuse my ignorance - when does a number h weakly exceed a number k? – Heroka Aug 20 '15 at 12:04
  • I'm currently working on the problem (I need this as a subfunction in a simulation), so I haven't tried much yet. Obviously, I could do this with a loop, but I'm looking for something more elegant (and faster). A number h weakly exceeds a number k if h >= k. – Martin Aug 20 '15 at 12:09
  • ah... sorry... I posted my quick & dirty solution with loops before your comment about not wanting to use loops appeared. I'll take another look. – Yuri Robbers Aug 20 '15 at 12:12
  • Never mind, I still greatly appreciate your input. Thanks for giving it a go! – Martin Aug 20 '15 at 12:16
  • This is very closely related to [this](http://stackoverflow.com/questions/11759503/extracting-off-diagonal-slice-of-large-matrix) – MichaelChirico Aug 20 '15 at 21:21

3 Answers3

4

Without loops, hope it's useful.

library(reshape2)

#easy way to get all row and column indexes is to transform matrix to long
#has advantage of allowing vectorized computation and avoiding for-loops
myfun <- function(k, mm){
  #reshape matrix to easily get column and row numbers
  melt_m <- melt(mm, varnames=c("row","col"))
  #add row and col indixes
  melt_m$sum_row_col <- melt_m$row + melt_m$col
  #calculate result and return (sum of value when sum of rowcol>=k)
  return(sum(melt_m$value[melt_m$sum_row_col>=k]))
}

#example 1
test_m <- matrix(1:9,3,3,byrow=T)


> myfun(k=2,mm=test_m)
[1] 45
> myfun(k=4, mm=test_m)
[1] 38

Example of what melt does with a matrix:

> test_m
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9
> melt(test_m,varnames=c("row","col"))
  row col value
1   1   1     1
2   2   1     4
3   3   1     7
4   1   2     2
5   2   2     5
6   3   2     8
7   1   3     3
8   2   3     6
9   3   3     9
Heroka
  • 12,889
  • 1
  • 28
  • 38
  • This is of course easily edited if you want the values, but I read your description int hat you wanted the sum --> that is returned. – Heroka Aug 20 '15 at 12:19
  • Thanks a lot, that looks great. I'll take a look at the melt function in order to understand what you are doing. – Martin Aug 20 '15 at 12:21
  • Added example output of melting a matrix. My idea was to convert the data to three vectors (to avoid loops): row indices, column indices and values. From there, it's straightforward. – Heroka Aug 20 '15 at 12:27
  • Perfect, I understand it now. Thanks for providing such a great solution :) – Martin Aug 20 '15 at 12:38
  • By the way, I think there is an even easier method. We can define a function indexsums(n) which returns matrix(1,n, n,n, byrow=T) + 1:n – Martin Aug 20 '15 at 13:05
  • Then we simply return m[indexsums >= k]. – Martin Aug 20 '15 at 13:08
  • That's very elegant; didn't think of it. I tend to do most of my work in long-formatted data, so that's my go-to solution. – Heroka Aug 20 '15 at 13:11
4

The row and column functions make this way simpler than the other solutions, if I understand correctly:

f <- function(k, m) sum(m[row(m) + col(m) >= k])

For your m.ex:

sapply(c(2, 4, 5), f, m = m.ex)
# [1] 45 38 23

For larger examples:

set.seed(1230)
n <- 8
> print(round(m <- matrix(runif(n^2), nrow = n), 2))
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,] 0.57 0.87 0.94 0.98 0.87 0.66 0.16 0.98
[2,] 0.65 0.79 0.68 0.74 0.12 0.65 0.56 0.73
[3,] 0.76 0.85 0.71 0.45 0.64 0.45 0.12 0.55
[4,] 0.26 0.09 0.67 0.66 0.58 0.48 0.54 0.20
[5,] 0.38 0.63 0.27 0.16 0.20 0.96 0.05 0.90
[6,] 0.49 0.48 0.71 0.32 0.46 0.98 0.17 0.96
[7,] 0.91 0.99 0.97 0.98 0.84 0.21 0.21 0.44
[8,] 0.62 0.08 0.80 0.88 0.85 0.30 0.61 0.42
> f(12, m)
[1] 8.028652

This can be confirmed by noting the entries you denote are those in the lower-right triangle:

   *    *    *    *    *    *    *    *
   *    *    *    *    *    *    *    *
   *    *    *    *    *    *    *    *
   *    *    *    *    *    *    * 0.20
   *    *    *    *    *    * 0.05 0.90
   *    *    *    *    * 0.98 0.17 0.96
   *    *    *    * 0.84 0.21 0.21 0.44
   *    *    * 0.88 0.85 0.30 0.61 0.42

So the sum is 0.88+0.84+0.85+0.98+0.21+0.3+0.05+0.17+0.21+0.61+0.2+0.9+0.96+0.44+0.42 which is about 8.03.

MichaelChirico
  • 33,841
  • 14
  • 113
  • 198
  • Now that is a fantastically elegant solution! Easy to read and understand, and swift to execute. I vaguely recalled that something like `row` and `column` existed, but couldn't remember them for the life of me, nor find them. Wrong search terms I guess. I did a little benchmarking. On a 1000x1000 matrix filled with random numbers, this solution is about twice as fast as Heroka's, which is more than a thousand times faster than mine. – Yuri Robbers Aug 20 '15 at 22:32
  • 1
    I was quite blown away myself the first time I saw these functions. super cool but well-hidden base functions – MichaelChirico Aug 20 '15 at 22:41
  • This solution is much more elegant than mine. @Yuri Robbers: Maybe it's better to accept this one as the answer, so it shows up on top when others search for the same problem. – Heroka Aug 21 '15 at 08:08
  • @Heroka: I would if I could, but it's not my question, so Martin has to do that... – Yuri Robbers Aug 21 '15 at 09:55
  • Oops, confused about who was OP. Sorry. @Martin: Can you change the best answer? – Heroka Aug 21 '15 at 10:08
  • Changed the best answer. Thank you MichaelChirico for providing this elegant solution! – Martin Aug 22 '15 at 08:57
2

Well, it's slow and it's ugly, and I'm sure many people will come up with better, faster and more beautiful solutions, but this will do the trick for you:

weakly_exceeds_sum <- function(m, k){
    tmp <- NULL
    for(i in 1:nrow(m)){
        for(j in 1:nrow(m)){
            if(i+j>=k){
            tmp<-c(tmp, m[i,j])
            }
        }
    }
    sum(tmp)
}

where you'd call the function with, for example: weakly_exceeds_sum(m.ex, 2)

Yuri Robbers
  • 291
  • 1
  • 10
  • 1
    Thanks a lot for your effort, works perfectly. This would have been my approach as well, but ideally I'd be looking for something that avoids loops. Essentially, we just want to extract values lying at and below a line parallel to the matrix's off diagonal. – Martin Aug 20 '15 at 12:15