I have this character matrix:

      [,1]            [,2]           
 [1,] "CHC.AU.Equity" "SGP.AU.Equity"
 [2,] "CMA.AU.Equity" "SGP.AU.Equity"
 [3,] "AJA.AU.Equity" "AOG.AU.Equity"
 [4,] "AJA.AU.Equity" "GOZ.AU.Equity"
 [5,] "AJA.AU.Equity" "SCG.AU.Equity"
 [6,] "ABP.AU.Equity" "AOG.AU.Equity"
 [7,] "AOG.AU.Equity" "FET.AU.Equity"
 [8,] "SGP.AU.Equity" "CHC.AU.Equity"

How would one filter for just the unique pairs? E.g., in the data above, row 8 'matches' row 1 (the same pair in reverse order) and should be excluded. I am trying to use setequal(), but I can't seem to get it to work. Is there a 'setunique'-type function?
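
For a reproducible example (an addition to the original post; the object name m1 matches the answer below), the data can be rebuilt as a two-column character matrix:

m1 <- matrix(c(
  "CHC.AU.Equity", "SGP.AU.Equity",
  "CMA.AU.Equity", "SGP.AU.Equity",
  "AJA.AU.Equity", "AOG.AU.Equity",
  "AJA.AU.Equity", "GOZ.AU.Equity",
  "AJA.AU.Equity", "SCG.AU.Equity",
  "ABP.AU.Equity", "AOG.AU.Equity",
  "AOG.AU.Equity", "FET.AU.Equity",
  "SGP.AU.Equity", "CHC.AU.Equity"
), ncol = 2, byrow = TRUE)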

lukehawk
  • Did you try `unique(dataFrame)`? – OganM Dec 29 '16 at 15:43
  • From what I can tell, unique() does not see equality between the unordered pairs, so it does not exclude the 'match'. Actually, it doesn't seem to work on ordered pairs either. duplicated() seems to do the trick. – lukehawk Dec 29 '16 at 15:44
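
A quick illustration of the behavior described in the comment above, using a made-up two-row matrix (not from the original thread): unique() compares rows as ordered tuples, so a reversed pair is not treated as a duplicate.

unique(rbind(c("A", "B"), c("B", "A")))
#     [,1] [,2]
#[1,] "A"  "B"
#[2,] "B"  "A"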

1 Answer

We can use apply to loop through the rows and sort the elements of each pair, transpose the output back to a two-column matrix, run duplicated on it and negate the result to get a logical index that is TRUE for the first occurrence of each pair, and use that index to subset the original rows.

m1[!duplicated(t(apply(m1, 1, sort))),]
#     [,1]            [,2]
#[1,] "CHC.AU.Equity" "SGP.AU.Equity"
#[2,] "CMA.AU.Equity" "SGP.AU.Equity"
#[3,] "AJA.AU.Equity" "AOG.AU.Equity"
#[4,] "AJA.AU.Equity" "GOZ.AU.Equity"
#[5,] "AJA.AU.Equity" "SCG.AU.Equity"
#[6,] "ABP.AU.Equity" "AOG.AU.Equity"
#[7,] "AOG.AU.Equity" "FET.AU.Equity"
akrun