0

Just new to R and programming in general; I'm trying to count the number of rows observed with value > N within each group until values >= are observed, at which point, the count resets.

data <- tibble(id=1:10, 
  val=c(1,5,6,1,2,1,300,4,1,300), 
  group=c(1,1,1,3,2,1,1,1,1,3))

What my table looks like:
 id                                  val                             group
 1                                     1                                 1
 2                                     5                                 1
 3                                     6                                 1
 4                                     1                                 3
 5                                     2                                 2
 6                                     1                                 1
 7                                   300                                 1
 8                                     4                                 1
 9                                     1                                 1
10                                   300                                 3

For example, I want to count the number of rows until I see values > 100 and then reset the count:

 id                                  val                             group          count
 1                                     1                                 1              0
 2                                     5                                 1              1
 3                                     6                                 1              2
 4                                     1                                 3              0
 5                                     2                                 2              0
 6                                     1                                 1              3
 7                                   300                                 1              4
 8                                     4                                 1              0
 9                                     1                                 1              1
10                                   300                                 3              1

I've thought about adding an extra column and summing it up until the condition is observed, but I don't think that works the way I expect it to.

data_count <- mutate(data, c=1)
data_count <- grouped_by(data_count) %>% mutate(count=ifelse(val>100, cumsum(c), 0)

     id                                  val                             group          count
     1                                     1                                 1              1
     2                                     5                                 1              1
     3                                     6                                 1              1
     4                                     1                                 3              1
     5                                     2                                 2              1
     6                                     1                                 1              1
     7                                   300                                 1              1
     8                                     4                                 1              1
     9                                     1                                 1              1
    10                                   300                                 3              1

Thanks!

MrFlick
  • 195,160
  • 17
  • 277
  • 295
Wen
  • 1
  • Possible duplicate: https://stackoverflow.com/questions/32994060/r-cumulative-sum-by-condition-with-reset – MrFlick Nov 21 '19 at 16:07
  • Possible duplicate: https://stackoverflow.com/questions/32247414/create-sequential-counter-that-restarts-on-a-condition-within-panel-data-groups – MrFlick Nov 21 '19 at 16:08
  • Where do you get `grouped_by` from? – camille Nov 21 '19 at 16:13
  • 2
    Does this answer your question? [Create sequential counter that restarts on a condition within panel data groups](https://stackoverflow.com/questions/32247414/create-sequential-counter-that-restarts-on-a-condition-within-panel-data-groups) – camille Nov 21 '19 at 16:13

1 Answers1

0

Building off the previous answers, you can do

data %>% 
  group_by(group) %>% 
  mutate(idx=lag(cumsum(val >100),1, 0)) %>% 
  group_by(group, idx) %>% 
  mutate(counter=row_number()-1) %>% 
  ungroup() %>% 
  select(-idx)

which returns

      id   val group counter
   <int> <dbl> <dbl>   <dbl>
 1     1     1     1       0
 2     2     5     1       1
 3     3     6     1       2
 4     4     1     3       0
 5     5     2     2       0
 6     6     1     1       3
 7     7   300     1       4
 8     8     4     1       0
 9     9     1     1       1
10    10   300     3       1
MrFlick
  • 195,160
  • 17
  • 277
  • 295