The following works as expected:

m <- matrix(c(1, 2, 3,
              1, 2, 4,
              2, 1, 4,
              2, 1, 4,
              2, 3, 4,
              2, 3, 6,
              3, 2, 3,
              3, 2, 2), byrow = TRUE, ncol = 3)

df <- data.frame(m)

aggdf <- aggregate(df$X3, list(df$X1, df$X2), FUN=sum)
colnames(aggdf) <- c("A", "B", "value")

and results in:

  A B value
1 2 1     8
2 1 2     7
3 3 2     5
4 2 3    10

However, I would like rows 1 and 2, and likewise rows 3 and 4, to be treated as the same group: it should not matter whether A is 1 and B is 2 or vice versa.

I also do not care how the aggregation orders A and B in the final data.frame, so either of the following results would be fine:

  A  B  value
1 2  1    15
2 3  2    15


  A  B  value
1 1  2    15
2 2  3    15

How can that be achieved?

fredson

1 Answer

You need to put the two key values in a consistent order within each row. For just 2 columns, pmin and pmax work nicely:

df$A = with(df, pmin(X1, X2))
df$B = with(df, pmax(X1, X2))
aggregate(df$X3, df[c("A", "B")], FUN = sum)
#   A B  x
# 1 1 2 15
# 2 2 3 15

For more columns, use sort, as akrun recommends:

df[1:2] <- t(apply(df[1:2], 1, sort))

Replacing 1:2 with the indices of all the key columns generalizes this to any number of columns.
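
As a rough sketch of how the row-wise sort extends beyond two keys, here is the same idea with three interchangeable key columns. The data frame df3 and the column names K1, K2, K3 and value are made up for illustration and are not part of the question's data:

```r
# Hypothetical data: three interchangeable key columns (K1, K2, K3)
# and one value column. All names here are illustrative only.
df3 <- data.frame(
  K1    = c(1, 3, 2, 4, 5),
  K2    = c(2, 1, 3, 5, 4),
  K3    = c(3, 2, 1, 6, 6),
  value = c(10, 20, 30, 1, 2)
)

keys <- c("K1", "K2", "K3")

# Sort the key values within each row. apply() returns each sorted row
# as a column, so t() transposes back to one row per observation.
df3[keys] <- t(apply(df3[keys], 1, sort))

# Rows that held the same keys in any order now fall into one group.
agg3 <- aggregate(value ~ K1 + K2 + K3, data = df3, FUN = sum)
print(agg3)
# Two groups remain: keys (1, 2, 3) with value 60
# and keys (4, 5, 6) with value 3.
```

Note that apply() converts the key columns to a matrix first, so this assumes all key columns have the same type (all numeric or all character); mixed types would be coerced to a common one.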

Gregor Thomas