Assign consecutive ID to consecutive grouped data

Question

Say I have the following data frame:

$Name     $Question
Bob       1
Bob       2  ---> Same Bob as above
Amy       1
Amy       2
Bob       1  ---> A different Bob than above, but shares the same name
Bob       2

So in short, names can occur multiple times, but only consecutive name values (up to the max number of questions) should be associated with the same unique identifier (ID). For instance, I'd like to create this column:

$Name     $Question    $ID
Bob       1            1
Bob       2            1
Amy       1            2
Amy       2            2
Bob       1            3
Bob       2            3

Question always follows the same sequence, i.e. each unique person has Questions 1 and 2.

The jank way I can think of doing this is something like

d$ID = rep(seq(1, number_unique_people), each = max_question_number)

Grouping in dplyr and then using nrow does not work because all the Bob values will be grouped together.

Any ideas?

pomegranate · Accepted Answer · 2016-11-12T18:38:59.680

As it turns out, this is trivially easy.

library(data.table)
# rleid() gives each run of identical consecutive values its own integer ID
d$ID = rleid(d$Name)

Thanks to Rich Scriven for his comment above!
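For anyone who would rather not load data.table, here is a rough base R sketch of the same run-length idea, using rle() and cumsum(). The data frame d is rebuilt from the example in the question; the names runs, id_alt and id_q are just illustrative. The last variant is an assumption on my part: it relies on every person's block starting with Question 1, as the question describes.

```r
# Sample data from the question
d <- data.frame(
  Name = c("Bob", "Bob", "Amy", "Amy", "Bob", "Bob"),
  Question = c(1, 2, 1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# rle() finds runs of identical consecutive values in Name
runs <- rle(d$Name)

# Repeat each run's index as many times as the run is long
d$ID <- rep(seq_along(runs$lengths), times = runs$lengths)

# Same result: start a new ID whenever Name differs from the previous row
id_alt <- cumsum(c(TRUE, d$Name[-1] != d$Name[-nrow(d)]))

# Variant that assumes every person's block starts with Question 1:
# start a new ID each time Question resets to 1
id_q <- cumsum(d$Question == 1)
```

Note that any approach keyed only on Name (including rleid) will give two different people the same ID if they share a name and appear back to back. If that can happen in your data, the Question-based variant is probably the safer choice.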

edited Nov 12 '16 at 18:38

answered Nov 12 '16 at 18:34
