Hey guys, I definitely solved this problem before, but I lost my code. Here is a simplified version of what I have.
a1 <- c(1,2,4,3,5)
a2 <- c("a","b","b","c","f")
a3 <- c(3,4,"b",1,9)
a4 <- c("c","b",2,"a","d")
a <- cbind(a1,a2,a3,a4)
a1 and a2 are a set, as are a3 and a4.
I would like to remove the duplicates, so remove rows 3 and 4. This data comes from BLAST output showing links between genomes and it is 34,000 rows long, so an efficient solution would be great.
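To make the goal concrete, here is a minimal sketch of the kind of thing I am after. It assumes that two rows count as duplicates when they hold the same four values in any order, which is my reading of the sample data and may be looser than a strict "the two sets are swapped" rule. The names row_key and deduped are just placeholders for illustration.

```r
# Sample data from above; cbind() coerces every column to character.
a1 <- c(1, 2, 4, 3, 5)
a2 <- c("a", "b", "b", "c", "f")
a3 <- c(3, 4, "b", 1, 9)
a4 <- c("c", "b", 2, "a", "d")
a <- cbind(a1, a2, a3, a4)

# Assumption: a row is a duplicate if it contains the same four values
# as an earlier row, regardless of order. Build an order-independent key
# per row by sorting its values and pasting them into one string.
row_key <- apply(a, 1, function(r) paste(sort(r), collapse = "|"))

# duplicated() marks every repeat of a key after its first occurrence,
# so negating it keeps only the first row of each group.
deduped <- a[!duplicated(row_key), , drop = FALSE]
deduped
```

On the sample above this keeps rows 1, 2 and 5 and drops rows 3 and 4. The key is built once per row, so I would expect it to be manageable for 34,000 rows, but I have not timed it on the real data.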
Thank you so much! I would also be open to doing this in another language.