
I've got a dataframe with values:

x y value 
A B 10  
B A 15  
A C 5  
C A 10  
B C 20

df <- data.frame(x = c("A", "B", "A", "C", "B"),
                 y = c("B", "A", "C", "A", "C"),
                 value = c(10, 15, 5, 10, 20))

I would like to summarise this data by each combination of x and y, regardless of order (so A B and B A count as the same pair), and get the sum of value per combination. The result would be:

x y value
A B 25  
A C 15  
B C 20

I found this question, which is more or less the same as mine, but the solutions don't work in my case. This is because the values in x and y are strings, and min() and max() won't work on them.

Any ideas how to do this?

jeroen81

1 Answer


One option is to sort the first two columns within each row and assign the result back, then use aggregate to get the sum of 'value' grouped by 'x' and 'y'.

# Sort each row's x/y pair so that (B, A) becomes (A, B)
df[1:2] <- t(apply(df[1:2], 1, sort))
# Sum 'value' within each (x, y) group
aggregate(value~., df, sum)
#   x y value
# 1 A B    25
# 2 A C    15
# 3 B C    20
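
The concern about min() and max() on strings may come from x and y being stored as factors. As a minimal sketch of an alternative, assuming the columns can be converted with as.character() and using the data as displayed in the question's table, the vectorised pmin() and pmax() can order each pair without a row-wise apply(). The names first, second, pairs and result are illustrative only.

```r
# Example data as displayed in the question's table
df <- data.frame(x = c("A", "B", "A", "C", "B"),
                 y = c("B", "A", "C", "A", "C"),
                 value = c(10, 15, 5, 10, 20),
                 stringsAsFactors = FALSE)

# Order each pair alphabetically; as.character() guards against factor columns
first  <- pmin(as.character(df$x), as.character(df$y))
second <- pmax(as.character(df$x), as.character(df$y))

# 'pairs' is a hypothetical helper data frame holding the ordered pairs
pairs <- data.frame(x = first, y = second, value = df$value)

# Sum 'value' within each ordered (x, y) pair
result <- aggregate(value ~ x + y, data = pairs, FUN = sum)
print(result)
```

This leaves the original df untouched, whereas the apply() version overwrites its first two columns. String comparison in pmin()/pmax() follows the current locale's collation, which is unlikely to matter for labels like A, B and C.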
akrun
  • Was writing exactly the same thing but couldn't get the desired output. Apparently the OP showed and gave different data sets. This seems like a dupe though. – David Arenburg Jul 28 '15 at 11:56
  • @DavidArenburg I had the same problem, but then I copied the data that was shown and got the correct answer. Maybe he had a typo. Sure, it looks like a dupe. – akrun Jul 28 '15 at 11:57
  • OK, got a dupe by searching your answers again. – David Arenburg Jul 28 '15 at 11:58