0

Reprex

set.seed(4)
df <- data.frame(week = rep(rep(c(1,2), each=5), times=98),
                 day = rep(c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday"), times = 98),
                 hours = round(runif(490, min=0,max=10)))
head(df, n=12L)
   week       day hours
1     1    Monday     6
2     1   Tuesday     0
3     1 Wednesday     3
4     1  Thursday     3
5     1    Friday     8
6     2    Monday     3
7     2   Tuesday     7
8     2 Wednesday     9
9     2  Thursday     9
10    2    Friday     1
11    1    Monday     8
12    1   Tuesday     3

Problem

Been struggling for some time now to create a cumulative sum of variable hours above that resets every new week, using R Studio. The only solutions I've found here involve reset conditions that fall short in my case, as random daily fluctuations in variable hours cause any threshold I set to be reached earlier or later than intended, in at least some weeks. For example:

sum_reset_at <- function(thresh) {
  function(x) {
    accumulate(x, ~if_else(.x==thresh, .y, .x+.y))
  }
}

df %>% mutate(test = sum_reset_at(20)(hours)) %>% head(n=15L)
   week       day hours test
1     1    Monday     6    6
2     1   Tuesday     0    6
3     1 Wednesday     3    9
4     1  Thursday     3   12
5     1    Friday     8   20
6     2    Monday     3    3
7     2   Tuesday     7   10
8     2 Wednesday     9   19
9     2  Thursday     9   28
10    2    Friday     1    1
11    1    Monday     8    9
12    1   Tuesday     3   12
13    1 Wednesday     1   13
14    1  Thursday    10   23
15    1    Friday     4    4

Instead, is it possible to reset the cumulative sum at every instance of "Monday" in variable day, or every 'n' times? Or would it make sense to create a new iterative week_itera variable (i.e. increases to '3', then '4', rather than cycling back to '1' and '2', etc.), then adapt the solution posted here to add together hours for each week?

thelatemail
  • 91,185
  • 12
  • 128
  • 188
SourceCoda
  • 105
  • 9
  • If it should be on every 'Monday', do a grouping i.e. `df %>% group_by(grp = cumsum(day == "Monday")) %>% mutate(test = sum_reset_at(20)(hours)) %>% head(15)` or if it `n` rows, `n <- 10; df %>% group_by(grp = as.integer(gl(n(), n, n()))) %>% mutate(test = sum_reset_at(20)(hours))` – akrun Apr 06 '22 at 20:58

1 Answers1

0

The first version resets every 5 rows, the second resets every Monday:

df$test1 <- ave(df$hours, (seq_along(df$hours)-1)%/%5 , FUN=cumsum )

or

df$test2 <- ave(df$hours, cumsum(df$day=="Monday") , FUN=cumsum )


> df
    week       day hours test1 test2
1      1    Monday     6     6     6
2      1   Tuesday     0     6     6
3      1 Wednesday     3     9     9
4      1  Thursday     3    12    12
5      1    Friday     8    20    20
6      2    Monday     3     3     3
7      2   Tuesday     7    10    10
8      2 Wednesday     9    19    19
9      2  Thursday     9    28    28
10     2    Friday     1    29    29
11     1    Monday     8     8     8
12     1   Tuesday     3    11    11
13     1 Wednesday     1    12    12
14     1  Thursday    10    22    22
15     1    Friday     4    26    26
George Savva
  • 4,152
  • 1
  • 7
  • 21