In a data frame I have a column, A, which sometimes has repeated values for the same id. When the same value of A appears more than once for the same id, I want to keep only the first row. Imagine a big data set. How do I accomplish this? Thanks!
A <- c(18,6,39,39,3,56)
set.seed(1)
B <- sample(100,6)
set.seed(2)
C <- sample(100,6)
df <- data.frame(id = rep(1:3, each=2),A,B,C)
df
  id  A  B  C
1  1 18 68 85
2  1  6 39 79
3  2 39  1 70
4  2 39 34  6
5  3  3 87 32
6  3 56 43  8
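# Attempt: loop over each id and keep only the first row for every value of A
keep <- logical(nrow(df))
for (i in unique(df$id)) {
  rows <- which(df$id == i)
  # duplicated() is FALSE for the first occurrence of each value
  keep[rows] <- !duplicated(df$A[rows])
}
df[keep, ]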
Expected output:
  id  A  B  C
1  1 18 68 85
2  1  6 39 79
3  2 39  1 70
5  3  3 87 32
6  3 56 43  8
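
For a big data set, a per-id loop may be slow. One possible vectorized approach, sketched here under the assumption that "repeated" means an identical (id, A) pair seen earlier in the data frame, is base R's `duplicated()` applied to just those two columns. The name `result` is only illustrative.

```r
# Rebuild the example data from the question.
A <- c(18, 6, 39, 39, 3, 56)
set.seed(1)
B <- sample(100, 6)
set.seed(2)
C <- sample(100, 6)
df <- data.frame(id = rep(1:3, each = 2), A, B, C)

# duplicated() on the (id, A) columns flags every row whose pair has
# already appeared higher up; negating it keeps only first occurrences.
# Assumption: the same A under a different id is NOT a duplicate.
result <- df[!duplicated(df[c("id", "A")]), ]
result
```

On the example data this should drop row 4 (the second 39 for id 2) and keep rows 1, 2, 3, 5 and 6 with their original row names. If dplyr is available, `distinct(df, id, A, .keep_all = TRUE)` is a comparable alternative.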