I have a dataset with the following information:

Column1  Column2  Sum
a        b         50
b        a          7
c        a          1
d        e          8
c        a          2

I want to aggregate to get this result:

Column1  Column2  Sum
a        b         57
c        a          3
d        e          8

This is because the pair a-b is the same as b-a, so those rows should be combined.

Is there any way to do this? Thanks.

David C.
RMteam
  • Hi @Jaap, I didn't find this information before. Could you send me that link? Thanks – RMteam Feb 06 '17 at 14:49
  • [see here](http://stackoverflow.com/search?q=%5Br%5D+apply+sort+aggregate+sum) – Jaap Feb 06 '17 at 14:51
  • I found http://stackoverflow.com/questions/28360148/take-sum-of-a-variable-if-combination-of-values-in-two-other-columns-are-unique – RMteam Feb 06 '17 at 15:07

1 Answer

We can use `aggregate` after sorting the first two columns within each row, so that a-b and b-a become the same pair:

df1[1:2] <- t(apply(df1[1:2], 1, sort))
aggregate(Sum~., df1, FUN = sum)
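
As a minimal sketch of how this plays out on the question's data, the block below rebuilds the example as a data frame called `df1` (the name used above). Storing the columns as character via `stringsAsFactors = FALSE` and keeping the result in `res` are assumptions made for illustration.

```r
# Rebuild the example data from the question.
# Assumption: the two key columns are character, not factor.
df1 <- data.frame(
  Column1 = c("a", "b", "c", "d", "c"),
  Column2 = c("b", "a", "a", "e", "a"),
  Sum     = c(50, 7, 1, 8, 2),
  stringsAsFactors = FALSE
)

# Sort each row's pair alphabetically, so ("b", "a") becomes ("a", "b").
# apply() over rows returns the sorted pairs as columns, hence the t().
df1[1:2] <- t(apply(df1[1:2], 1, sort))

# Sum within each (Column1, Column2) combination.
res <- aggregate(Sum ~ ., df1, FUN = sum)
res
```

Because each pair is sorted, the c-a rows are reported as a-c (total 3), while a-b totals 57 and d-e stays at 8.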

Or, using `pmin`/`pmax` with dplyr to build the ordered pair directly:

library(dplyr)
df1 %>%
   group_by(Col1 = pmin(Column1, Column2), Col2 = pmax(Column1, Column2)) %>% 
   summarise(Sum = sum(Sum))
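
A self-contained sketch of the same idea on the question's data follows. It assumes the key columns are character (factor columns would likely need `as.character()` first, since `pmin`/`pmax` are not meaningful for unordered factors), and it adds `.groups = "drop"`, which is not in the snippet above and requires dplyr 1.0.0 or later.

```r
library(dplyr)

# Rebuild the example data from the question (character columns assumed).
df1 <- data.frame(
  Column1 = c("a", "b", "c", "d", "c"),
  Column2 = c("b", "a", "a", "e", "a"),
  Sum     = c(50, 7, 1, 8, 2),
  stringsAsFactors = FALSE
)

# pmin()/pmax() pick the alphabetically smaller/larger value in each row,
# which gives an order-independent key without modifying df1 itself.
res <- df1 %>%
  group_by(Col1 = pmin(Column1, Column2),
           Col2 = pmax(Column1, Column2)) %>%
  summarise(Sum = sum(Sum), .groups = "drop")  # assumption: dplyr >= 1.0.0
res
```

Unlike the `apply`/`sort` version, this leaves the original columns untouched and works vectorised over whole columns rather than row by row.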
akrun
  • 1
    Is there any performance improvement doing `t(apply(df1[1:2], 1, sort))` instead of `apply(df1[1:2], 2, sort)`? – GGamba Feb 06 '17 at 14:45