1

I have the following table:

df <- read.table(textConnection("V1 V2 V3
1  T  Y
                                 4  Y  T
                                 1  O  P
                                 2  Q  E
                                 2  E  Q
                                 5  C  V
                                 2  V  C"), header=TRUE)

I want to aggregate based on bidirectional hits on V2 and V3 as a sum of of V1.

so the result will be

V1 V2 V3
5  Y  T       ### the order here does not matter (Y T or T Y are the same)
1  O  P
4  Q  E
7  V  C

I mean can always go for traditional for loop to do this, but i think there should be a better way of doing this in R?

ifreak
  • 1,726
  • 4
  • 27
  • 45

1 Answers1

2

We can do this by sorting through each row for the 'V2' and 'V3' and then do the aggregate of 'V1' grouped by 'V2' and 'V3'

df[2:3] <- t(apply(df[2:3], 1, sort))
aggregate(V1 ~ ., df, sum)
#  V2 V3 V1
#1  O  P  1
#2  E  Q  4
#3  C  V  7
#4  T  Y  5
akrun
  • 874,273
  • 37
  • 540
  • 662
  • the sort works fine and i can see the columns having the same values, but weirdly the aggregate is not doing anything. Why would that be? – ifreak May 08 '18 at 13:31
  • @ifreak I am using your example and it is getting the `sum` with `aggregate`. So, I am not sure what exactly is going on in your end. can you try on a fresh R session? – akrun May 08 '18 at 13:32
  • 1
    ok it worked. I had another unrelated problem. Thank you – ifreak May 08 '18 at 13:58