I have a data set (data.frame) composed mainly by duplicated rows. I´d like to subset it in order to get only the rows which didn´t have duplicates originally. Example of original df:
df1 <- data.frame( ID = c("1", "1","2","2","3","3","4","4"), V_1 = c(2011,2011,NA,NA,1,10,7,7), V_2= c(5,5,6,6,15,15,8,2))
Answer I´d like to get:
df2 <- data.frame( ID = c("3","3","4","4"), V_1 = c(1,10,7,7), V_2= c(15,15,8,2))
Using unique(df1)
function gives me the complete list of unique rows (in the case of the example, 6). Since my df has more than 200k rows, it is not adequate for me.
I have searched for similar questions: Subset with unique cases, based on multiple columns, Filter rows based on multiple column conditions R, but none of them helped me.