Assuming my data frame has one column, I wish to add another column indicating whether the ith element is unique within the first i elements. The result I want is:

c1  c2
1    1
2    1
3    1
2    0
1    0

For example, 1 is unique in {1}, 2 is unique in {1,2}, 3 is unique in {1,2,3}, 2 is not unique in {1,2,3,2}, 1 is not unique in {1,2,3,2,1}.

Here is my code, but it runs extremely slowly given that I have nearly 1 million rows.

for (i in seq_len(nrow(df))) {
  # Count how often the ith value occurs among the first i elements.
  k <- sum(df$c1[1:i] == df$c1[i])
  if (k > 1) {
    df[i, "c2"] <- 0
  } else {
    df[i, "c2"] <- 1
  }
}

Is there a quicker way of achieving this?

Konrad Rudolph
    Possible duplicate of [Find duplicate values in R](http://stackoverflow.com/questions/16905425/find-duplicate-values-in-r) – Ronak Shah Sep 12 '16 at 10:53

1 Answer

The following works:

x$c2 = as.numeric(! duplicated(x$c1))

Or, if you prefer more explicit code (I do, but it’s slower in this case):

x$c2 = ifelse(duplicated(x$c1), 0, 1)
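
To check either one-liner against the example from the question, here is a minimal sketch. The data frame name `x` and the column names follow the code above, and the sample values are the five rows from the question; the helper variable `c2_ifelse` is only there for the comparison. It relies on `duplicated()` returning `TRUE` for each element that already appeared earlier in the vector, so negating it flags first occurrences.

```r
# Example data from the question (column name c1 as in the code above).
x <- data.frame(c1 = c(1, 2, 3, 2, 1))

# duplicated() is TRUE for every value that already appeared earlier,
# so its negation is TRUE exactly for first occurrences.
x$c2 <- as.numeric(!duplicated(x$c1))

# The ifelse() form should produce the same column.
c2_ifelse <- ifelse(duplicated(x$c1), 0, 1)

print(x)
stopifnot(identical(x$c2, c2_ifelse))
```

Both forms process the whole column in one vectorised call rather than rescanning the first i elements for every row, which is where the loop in the question spends its time.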
Konrad Rudolph