This question is similar to dplyr/ R cumulative sum with reset, which asked for a way to reset a cumulative summation based on a threshold value. The accepted answer to that question was a function that applies a fixed threshold for resetting the accumulation.
library(tidyverse)
sum_reset_at <- function(thresh) {
function(x) {
accumulate(x, ~if_else(.x >= thresh, .y, .x + .y))
}
}
df <- tibble(a = c(2, 3, 1, 2, 2, 3))
df %>% mutate(c = sum_reset_at(5)(a))
## # A tibble: 6 x 2
## a c
## <dbl> <dbl>
## 1 2 2
## 2 3 5
## 3 1 1
## 4 2 3
## 5 2 5
## 6 3 3
When the accumulation reaches (or exceeds) the threshold value it starts again from the value of a
in the next record.
Rather than using a fixed threshold, I would like to provide a vector of thresholds that would be accessed sequentially, incrementing with each reset:
thresholds <- c(5, 3, 2)
df %>% mutate(c = sum_reset_at(thresholds)(a))
## # A tibble: 6 x 2
## a c
## <dbl> <dbl>
## 1 2 2
## 2 3 5
## 3 1 1
## 4 2 3
## 5 2 2
## 6 3 3
The vector would be recycled as required.
I have something working using sample
in the function:
set.seed(0)
sum_reset_at <- function(thresh) {
function(x) {
accumulate(x, ~if_else(.x >= sample(thresh, size = 1), .y, .x + .y))
}
}
thresholds <- c(5, 3, 2)
df %>% mutate(c = sum_reset_at(thresholds)(a))
## # A tibble: 6 x 2
## a c
## <dbl> <dbl>
## 1 2 2
## 2 3 3
## 3 1 4
## 4 2 2
## 5 2 4
## 6 3 3
But I don't want to randomly sample the thresholds, I want to sequentially sample them.