I can not get my head around this must be simple task. How to get a group label as a consecutive number.
library(dplyr)
set.seed(1)
df <- data.frame(id = sample(c('a','b'), 20, T),
name = sample(c('N1', 'N2', 'N3'), 20, T),
val = runif(20)) %>%
group_by(id) %>%
arrange(id, name)
What I want is a label group_no
that indicates the number of categories of the variable name
within each id
dplyr group. I can not find a solution in the dplyr package itself. Something like this:
# A tibble: 20 x 4
# Groups: id [2]
id name val group_no
<fct> <fct> <dbl> <int>
1 a N1 0.647 1
2 a N1 0.530 1
3 a N1 0.245 1
4 a N2 0.693 2
5 a N2 0.478 2
6 a N2 0.861 2
7 a N3 0.821 3
8 a N3 0.0995 3
9 a N3 0.662 3
10 b N1 0.553 1
11 b N1 0.0233 1
12 b N1 0.519 1
13 b N2 0.783 2
14 b N2 0.789 2
15 b N2 0.477 2
16 b N2 0.438 2
17 b N2 0.407 2
18 b N3 0.732 3
19 b N3 0.0707 3
20 b N3 0.316 3
Note, that the values of name
could be anything and certainly are not normally suffixed by a number as in the example (otherwise I could do sub("^N", "", df$name)
.
I am looking for something a little different than the 1:n()
solution in SO posts such as here.