I have a large dataset that looks something like this:

df <- read.table(text="Var1 Var2
K1 K2
K3 K2
K7 K2
K7 K3
K5 K9
K4 K9", header=TRUE, stringsAsFactors=FALSE)
These are all pairs with a correlation of 1, and I want to group them into clusters so that I can collapse a larger dataset later. Is there a simple way to remove rows like K7 K3, which are redundant because both keys are already part of the K2 group? I want to be able to group rows later based on column 2, so I don't want any 'duplicate' groups, such as a separate K3 group.
Edit: expected output
K1 K2
K3 K2
K7 K2
K5 K9
K4 K9
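
To make the goal concrete, here is a minimal base R sketch of one way the grouping could be done. It is only an illustration under some assumptions: the pairs are treated as links in an undirected graph, every connected set of keys counts as one cluster, and the cluster is named after the first Var2 value that occurs in it. The names nodes, label, rep_of and result are made up for this example.

```r
df <- read.table(text="Var1 Var2
K1 K2
K3 K2
K7 K2
K7 K3
K5 K9
K4 K9", header=TRUE, stringsAsFactors=FALSE)

# Give every key its own numeric label, in order of first appearance.
nodes <- unique(c(df$Var1, df$Var2))
label <- setNames(seq_along(nodes), nodes)

# Repeatedly give both ends of each pair the smaller of their two labels
# until a full pass changes nothing; linked keys then share one label.
repeat {
  changed <- FALSE
  for (i in seq_len(nrow(df))) {
    a <- df$Var1[i]
    b <- df$Var2[i]
    m <- min(label[[a]], label[[b]])
    if (label[[a]] != m || label[[b]] != m) {
      label[[a]] <- m
      label[[b]] <- m
      changed <- TRUE
    }
  }
  if (!changed) break
}

# Assumption: name each cluster after the first Var2 value found in it.
rep_of <- tapply(df$Var2, label[df$Var2], function(x) x[1])

# Pair every key with its cluster name and drop the name's own row.
result <- data.frame(
  Var1 = nodes,
  Var2 = as.vector(rep_of[as.character(label[nodes])]),
  stringsAsFactors = FALSE
)
result <- result[result$Var1 != result$Var2, ]
rownames(result) <- NULL
print(result)
```

The row-by-row loop is easy to follow but will probably be slow on a large dataset. For the sample above, the simpler filter df[!df$Var2 %in% df$Var1, ] gives the same rows, but it can drop members when links chain through several keys.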