Suppose I have a data frame (let's call it df) that looks like the one below. I am trying to remove ALL rows whose value in a given column (df$car) occurs more than once.
options(stringsAsFactors = FALSE)
car <- c('car1', 'car2', 'car2', 'car3', 'car4', 'car4', 'car4', 'car5', 'car6', 'car6')
location <- c(111,345,345,123,678,678,678,432,232,232)
value <- c(1,1,1,1,2,2,2,2,4,4)
a <- c('AT','ATC','TAT','C','TT','TGGGG','GGC','CC','AA','AT')
b <- c('A', 'TAG','TAG','G','AA','AA','AA','GG','TT','TT')
df <- data.frame(car,location,value,a,b)
> df
    car location value     a   b
1  car1      111     1    AT   A
2  car2      345     1   ATC TAG
3  car2      345     1   TAT TAG
4  car3      123     1     C   G
5  car4      678     2    TT  AA
6  car4      678     2 TGGGG  AA
7  car4      678     2   GGC  AA
8  car5      432     2    CC  GG
9  car6      232     4    AA  TT
10 car6      232     4    AT  TT
My desired output is the following. I wish to remove ALL rows whose car value is duplicated, rather than keeping one row per unique value.
   car location value  a  b
1 car1      111     1 AT  A
4 car3      123     1  C  G
8 car5      432     2 CC GG
Please note: I believe this is a different question from others that have been posted in the past. Most questions ask for the unique rows based on a given column, but I'm asking that even those rows be removed. If this is a duplicate post, I'm happy to close this one. I just haven't found what I'm looking for yet! Thanks for your help!
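For reference, here is a minimal sketch of the rule described above in base R. It is one possible approach, not necessarily the best one. It assumes the base function duplicated() and its fromLast argument; the names is_dup and result are made up for illustration.

```r
# Rebuild the example data from the question.
df <- data.frame(
  car = c('car1', 'car2', 'car2', 'car3', 'car4', 'car4', 'car4', 'car5', 'car6', 'car6'),
  location = c(111, 345, 345, 123, 678, 678, 678, 432, 232, 232),
  value = c(1, 1, 1, 1, 2, 2, 2, 2, 4, 4),
  a = c('AT', 'ATC', 'TAT', 'C', 'TT', 'TGGGG', 'GGC', 'CC', 'AA', 'AT'),
  b = c('A', 'TAG', 'TAG', 'G', 'AA', 'AA', 'AA', 'GG', 'TT', 'TT'),
  stringsAsFactors = FALSE
)

# duplicated() flags the second and later occurrences of a value.
# Scanning again from the end (fromLast = TRUE) flags every occurrence
# except the last one. A row is a duplicate if either scan flags it.
is_dup <- duplicated(df$car) | duplicated(df$car, fromLast = TRUE)

# Keep only the rows whose car value occurs exactly once.
result <- df[!is_dup, ]
print(result)
```

On the example data this should keep only the rows for car1, car3 and car5. Combining the forward and backward scans is what drops every member of a duplicated group, whereas a single call to duplicated() or unique() would leave one row per group behind.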