There are several questions similar to this one, but none of them involve multiple grouping variables, so their solutions do not work for my case. The closest questions I can find are:

  1. Numbering rows within groups in a data frame, and
  2. Create a sequential number (counter) for rows within each group of a dataframe [duplicate]

Dummy data for my case:

library(dplyr)
df <- tibble(
    s1 = c("111", "111", "111", "112", "112", "114", "114", "115"),
    s2 = c(rep("A", 5), rep("B", 3)),
    val = rnorm(8)
)

I want to assign an ID to each distinct value of s1 within each s2 group. That is, the ID should restart at 1 each time s2 changes. Desired output:

# A tibble: 8 x 4
  s1    s2        val    id
  <chr> <chr>   <dbl> <dbl>
1 111   A     -0.465      1
2 111   A      0.871      1
3 111   A      0.823      1
4 112   A      0.561      2
5 112   A      0.197      2
6 114   B     -0.743      1
7 114   B      0.0847     1
8 115   B     -1.05       2

One of the suggested solutions for similar questions is

library(dplyr)
df %>% group_by(s1) %>% mutate(id = row_number())

but that numbers the rows within each s1 group, so the counter resets each time s1 changes. These did not work either:

df %>% group_by(s1, s2) %>% mutate(id = row_number())
df %>% group_by(s2) %>% mutate(id = row_number())
df %>% group_by(s1) %>% mutate(id = row_number(s2))
df %>% group_by(s1) %>% mutate(id = cur_group_id())
df %>% group_by(s1, s2) %>% mutate(id = cur_group_id())
Earlien
  • `df %>% group_by(s2) %>% mutate(id = match(s1, unique(s1)))` – Ritchie Sacramento Jun 30 '22 at 01:44
  • Thanks, that works. Never would have thought of this. – Earlien Jun 30 '22 at 01:49
  • @RitchieSacramento Why not post an answer? I suppose someone might be able to find a duplicate, but I don't think this solution is obvious, and it deserves an answer rather than a comment. – IRTFM Jun 30 '22 at 02:06
  • Two other potential solutions: `df %>% group_by(s2) %>% mutate(id = cumsum(s1 != lag(s1, default = first(s1))) + 1)` and `df %>% group_by(s2) %>% mutate(id = cumsum(!duplicated(s1)))`, although RitchieSacramento's answer is what I would use – jared_mamrot Jun 30 '22 at 03:41
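
For reference, the three suggestions from the comments can be run side by side on the dummy data. This is only a sketch: the `val` column is left out because it plays no part in the ID, and the column names `id_match`, `id_change`, and `id_dup` are made up here for comparison.

```r
library(dplyr)

# Dummy data from the question, without the random `val` column
df <- tibble(
  s1 = c("111", "111", "111", "112", "112", "114", "114", "115"),
  s2 = c(rep("A", 5), rep("B", 3))
)

res <- df %>%
  group_by(s2) %>%
  mutate(
    # position of each s1 value among the distinct s1 values of its s2 group
    id_match = match(s1, unique(s1)),
    # add 1 each time s1 differs from the previous row within the s2 group
    id_change = cumsum(s1 != lag(s1, default = first(s1))) + 1,
    # add 1 at the first occurrence of each s1 value within the s2 group
    id_dup = cumsum(!duplicated(s1))
  ) %>%
  ungroup()

print(res)
```

On this data all three columns agree with the desired `id`. They can differ when equal values of s1 are not contiguous within an s2 group: the `lag()` version counts runs and the `duplicated()` version repeats the latest counter value, whereas `match(s1, unique(s1))` always gives the same ID to the same s1 value.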
