I have a data.frame like so:
df <- structure(list(X1 = c("PF00041", "PF00041", "PF00041", "PF00041",
"PF00041", "PF00041", "PF00041", "PF00041", "PF00041", "PF00047",
"PF00041", "PF00041", "PF00041", "PF00054", "PF00054", "PF02210",
"PF07679", "PF07714", "PF07714", "PF07714", "PF07714", "PF07714",
"PF07714", "PF00041", "PF00041", "PF00041"), X2 = c("PF00041",
"PF00041", "PF00041", "PF00041", "PF00041", "PF00041", "PF07679",
"PF07679", "PF07679", "PF13895", "PF00047", "PF00047", "PF00047",
"PF02210", "PF13895", "PF07645", "PF13895", "PF07714", "PF07714",
"PF07714", "PF07714", "PF07714", "PF07714", "PF13895", "PF13895",
"PF13895"), pfam_name.x = c("fn3", "fn3", "fn3", "fn3", "fn3",
"fn3", "fn3", "fn3", "fn3", "ig", "fn3", "fn3", "fn3", "Laminin_G_1",
"Laminin_G_1", "Laminin_G_2", "I-set", "Pkinase_Tyr", "Pkinase_Tyr",
"Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", "fn3",
"fn3", "fn3"), pfam_name.y = c("fn3", "fn3", "fn3", "fn3", "fn3",
"fn3", "I-set", "I-set", "I-set", "Ig_2", "ig", "ig", "ig", "Laminin_G_2",
"Ig_2", "EGF_CA", "Ig_2", "Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr",
"Pkinase_Tyr", "Pkinase_Tyr", "Pkinase_Tyr", "Ig_2", "Ig_2",
"Ig_2"), value.x = c("5", "5", "13", "13", "17", "17", "5", "13",
"17", "18", "5", "13", "17", "11", "11", "12", "14", "6", "6",
"15", "15", "20", "20", "5", "13", "17"), value.y = c("13", "17",
"5", "17", "5", "13", "14", "14", "14", "19", "18", "18", "18",
"12", "19", "8", "19", "15", "20", "6", "20", "6", "15", "19",
"19", "19")), row.names = c(2L, 3L, 4L, 6L, 7L, 8L, 10L, 11L,
12L, 13L, 15L, 16L, 17L, 19L, 20L, 25L, 27L, 29L, 30L, 31L, 33L,
34L, 35L, 38L, 39L, 40L), class = "data.frame")
I'd like to filter this data.frame on the columns value.x and value.y, dropping rows whose pair of values is just a swapped copy of an earlier row. For example, row 1 has values 5 and 13 respectively and row 3 has values 13 and 5 respectively, so I want to get rid of row 3.
I tried sorting each row first, but because the data.frame has other columns, the sort mixes the values of all the columns together. For example:
data.frame(unique(t(apply(df, 1, sort))), stringsAsFactors = FALSE)
In the resulting table I could see that Pkinase_Tyr was now in column X1.
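To make the goal concrete, here is a minimal sketch of one possible approach: build an order-independent key from only value.x and value.y, then drop rows whose key has already been seen. The `toy` data.frame and the names `lo`, `hi`, `key`, and `result` are made up for illustration. The sketch assumes the values can be converted to numbers and that the first occurrence of each pair is the one to keep.

```r
# Hypothetical toy data with the same value.x / value.y layout as df.
toy <- data.frame(
  X1 = c("PF00041", "PF00041", "PF00041", "PF07714"),
  X2 = c("PF00041", "PF00041", "PF00041", "PF07714"),
  value.x = c("5", "5", "13", "6"),
  value.y = c("13", "17", "5", "15"),
  stringsAsFactors = FALSE
)

# Order-independent key per row, built from the two value columns only
# (assumption: the character values are convertible with as.numeric).
lo <- pmin(as.numeric(toy$value.x), as.numeric(toy$value.y))
hi <- pmax(as.numeric(toy$value.x), as.numeric(toy$value.y))
key <- paste(lo, hi, sep = "_")

# Keep the first row for each unordered pair; the other columns are untouched.
result <- toy[!duplicated(key), ]
print(result)
```

Because only the key is sorted, the other columns (X1, X2, pfam_name.x, pfam_name.y) stay in their original positions. The same three steps should carry over to df by replacing `toy` with `df`.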