1

I have a df with an ID column and another column (Base/Promo) coding 1 or 0. I'd like to create a third, new column that contains a counter that counts if value is 0 in (Base/Promo) and gets reset to 0 if the value in (Base/Promo) is 1 and starts counting again when the value is 0. Also, the counter should be reset whenever the ID changes. Here is a reproducible df & an example of how the output should look like:

df <- data.frame(ID = c(rep("1", 6), rep("2", 6), 
rep("3", 6) rep("4", 6)), 
response=c(0,0,0,1,0,0,1,0,0,0,0,1,0,0,0,1,1,0,0,0,0,1,1,0))             
df

Desired outcome:

      ID     Base/Promo     counter
1     1         0           1
2     1         0           2
3     1         0           3
4     1         1           0
5     1         0           1
6     1         0           2
7     2         1           0
8     2         0           1
9     2         0           2
10    2         0           3
11    2         0           4
12    2         1           0
13    3         0           1
14    3         0           2
15    3         0           3
16    3         1           0
17    3         1           0
18    3         0           1
19    4         0           1
20    4         0           2
21    4         0           3
22    4         1           0
23    4         1           0
24    4         0           1

I tried with the following code:

df <- df %>%
group_by(ID, idx = cumsum("Base/Promo" == 1L)) %>%
mutate(counter= row_number()) %>%
ungroup %>%
  select(-idx)

But this gives me the following outcome:

      ID     Base/Promo     counter
1     1         0           1
2     1         0           2
3     1         0           3
4     1         1           1
5     1         0           2
6     1         0           3
7     2         1           1
8     2         0           2
9     2         0           3
10    2         0           4
11    2         0           5
12    2         1           1
13    3         0           2
14    3         0           3
15    3         0           4
16    3         1           1
17    3         1           1
18    3         0           2
19    4         0           1
20    4         0           2
21    4         0           3
22    4         1           1
23    4         1           1
24    4         0           2

I hope you can help with this problem, as I'm out of ideas of what else to try. In case I should have overlooked an SO entry that solves my problem, it would be great if you could point me there. Many thanks!!!

Henrik
  • 65,555
  • 14
  • 143
  • 159
  • [This answer](https://stackoverflow.com/questions/5012516/create-counter-within-consecutive-runs-of-certain-values/5019695#5019695) is a beauty. Just wrap it in your favorite by-group code. – Henrik Jun 27 '20 at 10:07

2 Answers2

0

We can adjust the counter values based on first value of the group :

library(dplyr)

df %>%
  group_by(ID, grp = cumsum(response == 1L)) %>%
  mutate(counter = if(first(response) == 1L) row_number() - 1 
                   else row_number()) %>%
  ungroup() %>%
  dplyr::select(-grp)

# A tibble: 24 x 3
#   ID    response counter
#   <chr>    <dbl>   <dbl>
# 1 1            0       1
# 2 1            0       2
# 3 1            0       3
# 4 1            1       0
# 5 1            0       1
# 6 1            0       2
# 7 2            1       0
# 8 2            0       1
# 9 2            0       2
#10 2            0       3
# … with 14 more rows
     
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
0

You can use rle to generate the sequence and multiply the result with the negated response.

df %>%
  group_by(ID) %>%
  mutate(x = with(rle(response), sequence(lengths)) * !response)

# A tibble: 24 x 3
# Groups:   ID [4]
   ID    response     x
   <chr>    <dbl> <int>
 1 1            0     1
 2 1            0     2
 3 1            0     3
 4 1            1     0
 5 1            0     1
 6 1            0     2
 7 2            1     0
 8 2            0     1
 9 2            0     2
10 2            0     3
# ... with 14 more rows
Ritchie Sacramento
  • 29,890
  • 4
  • 48
  • 56