Say I have a matrix like this:

    x y z f
  a 2 2 3 10
  b 2 3 1 90
  c 2 2 3 10

When x, y and z are equal in two rows, I want to merge those rows by summing their f values. Here a and c are identical in x, y and z, so I want to add c's f to a's f.

So the result should be this:

    x y z f
  a 2 2 3 20
  b 2 3 1 90

How can I do this?

Thanks.

  • `aggregate(f ~. , m, sum)` if `m` is your matrix – David Arenburg Aug 11 '15 at 12:45
  • @David Arenburg, Is this identical to what you said? `aggregate(dat, by=list(dat[,1],dat[,2],dat[,3]), FUN=sum)` –  Aug 11 '15 at 12:48
  • @herbivor It creates additional columns in the output – akrun Aug 11 '15 at 12:49
  • It is identical to `aggregate(f ~ x + y + z, m, sum)`. It is also better to use the formula notation for readability. – David Arenburg Aug 11 '15 at 12:50
  • @DavidArenburg how can I make this more readable like in the link you marked as the answer? –  Aug 11 '15 at 12:58
  • I don't understand the question. – David Arenburg Aug 11 '15 at 12:59
  • @DavidArenburg As I see you marked this question as a duplicate and referenced this link: http://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group So how can I make it more readable, like in the link. I don't know the formula notation yet. –  Aug 11 '15 at 13:00
  • I don't understand, I showed you `aggregate(f ~ x + y + z, m, sum)`, isn't this readable? – David Arenburg Aug 11 '15 at 13:05
  • It's readable, but the link you marked as the answer (http://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group) doesn't use formula notation, I guess, and that seems more readable to me as I am new to R. Anyway, I'll use the formula notation, thanks. –  Aug 11 '15 at 13:07
  • Then, something like `aggregate(m[, 4], by=list(m[,1],m[,2],m[,3]), sum)` probably – David Arenburg Aug 11 '15 at 13:11
  • Thank you so much for your help! –  Aug 11 '15 at 13:13
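
For readers who prefer the non-formula `by =` interface discussed in the comments above, here is a minimal sketch. It rebuilds the question's matrix under the name `m` used in the comments. Passing a named list to `by` and keeping `f` as a one-column matrix via `drop = FALSE` are choices made here (not taken from the thread) so that the output columns come out as `x`, `y`, `z`, `f` rather than `Group.1`, `Group.2`, `Group.3`, `x`.

```r
# The matrix from the question; matrix() fills column by column.
m <- matrix(c(2, 2, 2,   2, 3, 2,   3, 1, 3,   10, 90, 10), nrow = 3,
            dimnames = list(c("a", "b", "c"), c("x", "y", "z", "f")))

# Sum f within each distinct (x, y, z) combination.
# drop = FALSE keeps the column name "f" in the result;
# the names of the `by` list become the grouping column names.
res <- aggregate(m[, "f", drop = FALSE],
                 by = list(x = m[, "x"], y = m[, "y"], z = m[, "z"]),
                 FUN = sum)
print(res)
```

The result is a data frame with one row per group, so the original row names `a`, `b`, `c` are not carried over.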

1 Answer

We can use the formula method of `aggregate`. Here `.` denotes all the other variables in the dataset.

    aggregate(f ~ ., m1, FUN = sum)

It can be explicitly written as

    aggregate(f ~ x + y + z, m1, FUN = sum)

The explicit form is useful if you are using only a subset of the variables as grouping variables.
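
As a rough illustration of grouping by a subset, the sketch below sums `f` by `x` and `y` only. The names `d1`, `res_all` and `res_xy` are introduced here for illustration, and the explicit `as.data.frame()` call is a cautious assumption on my part; the comments under this answer report that the formula method also accepts the matrix directly.

```r
# Same values as the matrix in the `data` section of this answer.
m1 <- matrix(c(2, 2, 2,   2, 3, 2,   3, 1, 3,   10, 90, 10), nrow = 3,
             dimnames = list(c("a", "b", "c"), c("x", "y", "z", "f")))

# Explicit conversion, done here only to be safe (an assumption, see above).
d1 <- as.data.frame(m1)

# Group by all remaining columns: x, y and z.
res_all <- aggregate(f ~ ., d1, FUN = sum)

# Group by x and y only; z does not appear in the result at all.
res_xy <- aggregate(f ~ x + y, d1, FUN = sum)

print(res_all)
print(res_xy)
```

With this data both calls produce two groups, because rows `a` and `c` agree on `x`, `y` and `z`.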

data

    m1 <- structure(c(2L, 2L, 2L, 2L, 3L, 2L, 3L, 1L, 3L, 10L, 90L, 10L),
                    .Dim = 3:4,
                    .Dimnames = list(c("a", "b", "c"), c("x", "y", "z", "f")))
akrun
  • Don't think you need to convert to a `data.frame`, and this is just a dupe, I'd guess – David Arenburg Aug 11 '15 at 12:46
  • @DavidArenburg You are right. I didn't check that. – akrun Aug 11 '15 at 12:48
  • @DavidArenburg I didn't know that the formula method would work with a matrix without explicitly converting it to a `data.frame`, since the documentation says `‘aggregate.formula’ is a standard formula interface to ‘aggregate.data.frame’` – akrun Aug 11 '15 at 12:55