How to sum over diagonals of data frame

Question

Say that I have this data frame:

     1   2   3   4      
100  8   12  5   14 
99   1   6   4   3   
98   2   5   4   11  
97   5   3   7   2

In this above data frame, the values indicate counts of how many observations take on (100, 1), (99, 1), etc.

In my context, the diagonals have the same meanings:

     1   2   3   4
100  A   B   C   D 
99   B   C   D   E  
98   C   D   E   F 
97   D   E   F   G

How would I sum across the diagonals (i.e., sum the counts of the like letters) in the first data frame?

This would produce:

For example, D is 5+5+4+14

Is this a matrix or a data.frame? (A matrix is easier to perform this on) — David Robinson, Apr 29 '15 at 23:52
data.frame, but converting it to a matrix and back to a data.frame as in @Ben Bolker's answer does the trick. — bill999, Apr 30 '15 at 00:00

Ben Bolker · Accepted Answer · 2015-04-30T02:18:11.413

18

You can use row() and col() to identify row/column relationships.

m <- read.table(text="
    1   2   3   4      
100  8   12  5   14 
99   1   6   4   3   
98   2   5   4   11  
97   5   3   7   2")

vals <- sapply(2:8,
       function(j) sum(m[row(m)+col(m)==j]))

or (as suggested in comments by ?@thelatemail)

vals <- sapply(split(as.matrix(m), row(m) + col(m)), sum)
data.frame(group=LETTERS[seq_along(vals)],sum=vals)

or (@Frank)

data.frame(vals = tapply(as.matrix(m), 
       (LETTERS[row(m) + col(m)-1]), sum))

as.matrix() is required to make split() work correctly ...

edited Apr 30 '15 at 02:18

answered Apr 29 '15 at 23:55

Ben Bolker

211,554
25
370
453

What is the logic for why one needs to convert it to a matrix (instead of leaving it in data.frame) in order to do this? – bill999 Apr 30 '15 at 00:08
2

@BenBolker - row and col work on all "matrix-like" objects with 2 dimensions incl. matrices, data.frames, tables etc. – thelatemail Apr 30 '15 at 00:28
1

Another very similar one: `data.frame(vals = tapply(as.matrix(m), (LETTERS[row(m) + col(m)-1]), sum)) ` – Jota Apr 30 '15 at 00:53

thelatemail · Answer 2 · 2015-04-30T00:40:19.347

7

Another aggregate variation, avoiding the formula interface, which actually complicates matters in this instance:

aggregate(list(Sum=unlist(dat)), list(Group=LETTERS[c(row(dat) + col(dat))-1]), FUN=sum)

#  Group Sum
#1     A   8
#2     B  13
#3     C  13
#4     D  28
#5     E  10
#6     F  18
#7     G   2

edited Apr 30 '15 at 00:40

answered Apr 30 '15 at 00:18

thelatemail

91,185
12
128
188

cryo111 · Answer 3 · 2021-09-06T18:09:36.157

6

Another solution using bgoldst's definition of df1 and df2

sapply(unique(c(as.matrix(df2))),
       function(x) sum(df1[df2 == x]))

Gives

#A  B  C  D  E  F  G 
#8 13 13 28 10 18  2

(Not quite the format that you wanted, but maybe it's ok...)

edited Sep 06 '21 at 18:09

answered Apr 30 '15 at 00:03

cryo111

4,444
1
15
37

1

Forgot to mention that my solution assumes that you have set `options(stringsAsFactors=FALSE)`. – cryo111 Apr 30 '15 at 00:06

score 5 · Answer 4 · answered Apr 29 '15 at 23:57

Here's a solution using stack(), and aggregate(), although it requires the second data.frame contain character vectors, as opposed to factors (could be forced with lapply(df2,as.character)):

df1 <- data.frame(a=c(8,1,2,5), b=c(12,6,5,3), c=c(5,4,4,7), d=c(14,3,11,2) );
df2 <- data.frame(a=c('A','B','C','D'), b=c('B','C','D','E'), c=c('C','D','E','F'), d=c('D','E','F','G'), stringsAsFactors=F );
aggregate(sum~group,data.frame(sum=stack(df1)[,1],group=stack(df2)[,1]),sum);
##   group sum
## 1     A   8
## 2     B  13
## 3     C  13
## 4     D  28
## 5     E  10
## 6     F  18
## 7     G   2

How to sum over diagonals of data frame

4 Answers4

Linked