My input is

 df1 <- data.frame(Row=c("row1", "row2", "row3", "row4", "row5"),
                   A=c(1,2,3,5.5,5), 
                   B=c(2,2,2,2,0.5),
                   C= c(1.5,0,0,2.1,3))

It looks like this:

#  Row1 1   2   1.5
#  Row2 2   2   0
#  Row3 3   2   0
#  Row4 5.5 2   2.1
#  Row5 5   0.5 3

I want to compute a value for every pair of rows using the following rule. Take the pair Row1 and Row2: multiply the two rows' entries column by column, then sum the products into a single number. For example:

  • Row1-Row2 answer is (1*2) + (2*2)+ (1.5 *0) = 6
  • Row1-Row3 answer is (1*3) + (2*2) + (1.5*0) = 7

I want to do this for every pair of rows and get a result data frame like this:

row1    row2    6
row1    row3    7
row1    row4    12.65
row1    row5    10.5
row2    row3    10
row2    row4    15
row2    row5    11
row3    row4    20.5
row3    row5    16
row4    row5    34.8

How can I do this with R? Thanks a lot for comments.

ArunK
psiu

2 Answers

  1. Create all the combinations you need with `combn`. `t` transposes the result so that each row holds one pair of row indices.
  2. Use `apply` to iterate over the index pairs created in step 1. The negative index (`-1`) drops the Row column so we don't try to multiply and sum it.
  3. Bind the two results together with `cbind`.

ind <- t(combn(nrow(df1),2))
out <- apply(ind, 1, function(x) sum(df1[x[1], -1] * df1[x[2], -1]))
cbind(ind, out)

           out
[1,] 1 2  6.00
[2,] 1 3  7.00
[3,] 1 4 12.65
 .....
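
The output above shows row indices rather than the row labels the question asked for. A minimal sketch of one way to get labels, assuming the `df1` from the question; the column names `first`, `second` and `value` are my own choice, not something the question specifies:

```r
df1 <- data.frame(Row = c("row1", "row2", "row3", "row4", "row5"),
                  A = c(1, 2, 3, 5.5, 5),
                  B = c(2, 2, 2, 2, 0.5),
                  C = c(1.5, 0, 0, 2.1, 3))

# All unordered pairs of row indices, one pair per row of `ind`
ind <- t(combn(nrow(df1), 2))

# Sum of column-wise products for each pair (the Row column is dropped)
out <- apply(ind, 1, function(x) sum(df1[x[1], -1] * df1[x[2], -1]))

# Look up the labels in df1$Row instead of printing the indices
# (first/second/value are hypothetical column names)
res <- data.frame(first  = df1$Row[ind[, 1]],
                  second = df1$Row[ind[, 2]],
                  value  = out)
res
```

Indexing `df1$Row` with each column of `ind` keeps the pairs in the order `combn` produced them, so the rows should line up with `out`.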
Chase

Yes! This is a matrix multiplication! :-))

First, just to prepare the matrix:

m = as.matrix(df1[,2:4])
row.names(m) = df1$Row

and this is the operation, how easy!

m %*% t(m)

That's it!
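
Why this works: entry `[i, j]` of `m %*% t(m)` is the sum over columns of `m[i, k] * m[j, k]`, which is exactly the pairwise value the question describes. A small check, as a sketch: the matrix is rebuilt here from the question's numbers so the snippet runs on its own, and `by_hand`, `same_entry` and `same_matrix` are just illustrative names.

```r
# Same numbers as df1 in the question, built directly as a matrix
m <- matrix(c(1,   2, 3, 5.5, 5,
              2,   2, 2, 2,   0.5,
              1.5, 0, 0, 2.1, 3),
            ncol = 3,
            dimnames = list(paste0("row", 1:5), c("A", "B", "C")))

p <- m %*% t(m)

# The row1/row2 entry computed by hand: (1*2) + (2*2) + (1.5*0)
by_hand <- sum(m["row1", ] * m["row2", ])
same_entry <- isTRUE(all.equal(p["row1", "row2"], by_hand))

# Base R's tcrossprod(m) computes the same product as m %*% t(m)
same_matrix <- isTRUE(all.equal(p, tcrossprod(m)))
```

Note that the result is a full symmetric 5 x 5 matrix, so each pair appears twice and the diagonal holds each row paired with itself.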

One tip - you could define the data.frame this way and it will save you the row.names command:

df1 <- data.frame(row.names=c("row1", "row2", "row3", "row4", "row5"),A=c(1,2,3,5.5,5), B=c(2,2,2,2,0.5), C= c(1.5,0,0,2.1,3))
Tomas
  • add a not lower.tri selection and rearrange in there and you have something... he just wants combinations, not permutations. – John Jul 24 '11 at 02:24
  • thanks for the lower.tri tip, but I guess the rearrange step will be difficult - is there any elegant way? – Tomas Jul 24 '11 at 02:54
  • @Tomas - I'll leave it to you to judge elegance, but `melt` from package reshape can be of use: `require(reshape); subset(melt(out2), as.numeric(X1) > as.numeric(X2))`. +1 for the matrix multiplication bit as well, I wasn't thinking that cleverly. – Chase Jul 24 '11 at 03:09
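
For the rearranging step discussed in these comments, here is a rough base R sketch that needs no extra package. It uses `upper.tri` rather than `lower.tri` so that the first label of each pair comes before the second. The matrix is rebuilt from the question's numbers, and the names `keep`, `pairs`, `first`, `second` and `value` are my own, not from the thread.

```r
# Same numbers as df1 in the question, built directly as a matrix
m <- matrix(c(1,   2, 3, 5.5, 5,
              2,   2, 2, 2,   0.5,
              1.5, 0, 0, 2.1, 3),
            ncol = 3,
            dimnames = list(paste0("row", 1:5), c("A", "B", "C")))

p <- tcrossprod(m)    # same product as m %*% t(m)
keep <- upper.tri(p)  # TRUE above the diagonal: each pair once, no self-pairs

# row(p) and col(p) give the row and column index of every cell
pairs <- data.frame(first  = rownames(p)[row(p)[keep]],
                    second = colnames(p)[col(p)[keep]],
                    value  = p[keep],
                    stringsAsFactors = FALSE)

# Sort so the pairs read row1-row2, row1-row3, ... as in the question
pairs <- pairs[order(pairs$first, pairs$second), ]
rownames(pairs) <- NULL
pairs
```

The sort relies on the labels ordering alphabetically (row1 to row5 here); with labels like row10 a different ordering key would be needed.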