I have a seemingly simple task, but I can't find an efficient way to solve it. I need to number the different periods of each individual in a large data frame.
Here's an example of my data:
library(dplyr)  # also provides tibble()
myData <- tibble(
  individuals = c(rep("r1", 9), rep("r2", 9), rep("r3", 9)),
  group = c(rep(324, 3), rep(326, 3), rep(328, 3), rep(330, 3), rep(332, 3),
            rep(334, 3), rep(336, 3), rep(338, 3), rep(340, 3))
)
   individuals group
   <chr>       <dbl>
 1 r1            324
 2 r1            324
 3 r1            324
 4 r1            326
 5 r1            326
 6 r1            326
 7 r1            328
 8 r1            328
 9 r1            328
10 r2            330
# ... with 17 more rows
Now I want to create another column in which the first group of each individual gets 1, the second gets 2, the third gets 3, and the count then starts again for the next individual. The desired outcome looks like this:
   individuals group period_number
   <chr>       <dbl>         <dbl>
 1 r1            324             1
 2 r1            324             1
 3 r1            324             1
 4 r1            326             2
 5 r1            326             2
 6 r1            326             2
 7 r1            328             3
 8 r1            328             3
 9 r1            328             3
10 r2            330             1
# ... with 17 more rows
I thought I could use the group_by(individuals) and mutate(period_number = ) functions from dplyr, but I can't figure out which function to use inside mutate(). I looked at several other questions here (Conditionally create new columns based on specific numeric values (keys) from existing column, How to add column into a dataframe based on condition?), but since the numbers in the group column are not categories, just running id numbers for the different periods, I don't think I can use those approaches with e.g. if_else().
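A minimal sketch of how the group_by() and mutate() idea could look, assuming the periods should be numbered in the order in which their ids first appear within each individual. match() and unique() are base R functions; the name result is only illustrative.

```r
library(dplyr)

# Example data from the question: 3 individuals, 3 periods each, 3 rows per period.
myData <- tibble(
  individuals = rep(c("r1", "r2", "r3"), each = 9),
  group = rep(seq(324, 340, by = 2), each = 3)
)

result <- myData %>%
  group_by(individuals) %>%
  # Within each individual, unique(group) lists the period ids in order of
  # first appearance; match() gives each row's position in that list.
  mutate(period_number = match(group, unique(group))) %>%
  ungroup()

print(result, n = 10)
```

If the periods should instead be numbered by the sorted value of the id rather than by first appearance, dplyr's dense_rank(group) inside the same mutate() would be the analogous choice; for the example data both orderings coincide.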
I'm sure there must be a fairly simple solution for this, but I can't figure it out. Any help is greatly appreciated!