I have a large dataset that looks something like this:

    df <- read.table(text="Var1 Var2
    K1 K2
    K3 K2
    K7 K2
    K7 K3
    K5 K9
    K4 K9", header=TRUE, stringsAsFactors=FALSE)

These are all pairs with a correlation of 1, and I want to group them into clusters so that I can collapse a larger dataset later. Is there a simple way to remove rows like K7 K3, since K7 and K3 are already part of the K2 group? I want to group rows by column 2 later, so I don't want any 'duplicate' groups, such as a separate K3 group.

Edit: expected output

```
K1       K2
K3       K2
K7       K2
K5       K9
K4       K9
```
rholeepoly
  • Could you please provide your expected output – Florian Feb 03 '20 at 15:25
  • @IceCreamToucan This question is different from the one you associated it with. I did use that answer to remove duplicates like K1 K2 vs. K2 K1, but this is a different question, so could you remove that? – rholeepoly Feb 03 '20 at 15:29
  • Would the solution possibly just be to remove the duplicates in column 1? – rholeepoly Feb 03 '20 at 15:40

1 Answer

OK, I think I answered my own question with: newdf <- df[!duplicated(df$Var1), ]. This keeps only the first row for each value of Var1, so the later K7 K3 row is dropped.
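For anyone wanting to try this, here is a minimal sketch that runs the one-liner on the sample data from the question. The `overlap` check at the end is my own addition, not part of the original answer: it assumes that a clean result should have no key appearing both as a member (Var1) and as a group label (Var2).

```r
# Sample data from the question: each row is a pair with correlation 1.
df <- read.table(text = "Var1 Var2
K1 K2
K3 K2
K7 K2
K7 K3
K5 K9
K4 K9", header = TRUE, stringsAsFactors = FALSE)

# duplicated() flags every repeat of a Var1 value after its first occurrence,
# so negating it keeps only the first row seen for each Var1.
newdf <- df[!duplicated(df$Var1), ]
print(newdf)

# Sanity check (an added assumption, not from the original answer):
# TRUE would mean some key is both a group member and a group label.
overlap <- any(newdf$Var2 %in% newdf$Var1)
print(overlap)
```

Note that `duplicated()` keeps whichever row comes first, so the result depends on row order. If K7 K3 had appeared before K7 K2, the K7 K3 row would have been kept instead, and the overlap check would flag it because K3 would then be both a member and a group label.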

rholeepoly