0

I have a df like this

a   b
1   3
2   3
3   3
4   3
1   3
2   3
3   3
4   3

I want the solution like this. I want the moving sum of column 'a' for the 3 intersections which is the value from column b. Values of column b can change.

Result

a   b   CumSum
1   3   1
2   3   3
3   3   6
4   3   9
1   3   8
2   3   7
3   3   6
4   3   9

below is the code I am using

df$cum_sum <- ave(a, 
              (seq_along(a)-1)%/%
                b,
              FUN= cumsum)

but this restarts the cumulative sum on the 4th observation which is not the expected result.

  • Is this not a duplicate of https://stackoverflow.com/q/67268621/680068 ? – zx8754 Apr 13 '23 at 11:50
  • This is also relevant. https://stackoverflow.com/questions/37773356/find-sum-of-previous-n-rows-in-dataframe – benson23 Apr 13 '23 at 11:53
  • Hi @benson23, This is really helpful but I do not need the NA's instead I would want to start the sum from the first row itself. Could yoy please help in here – Alisha Dcosta Apr 13 '23 at 12:10
  • Sorry for missing the `NA`s. I believe [this answer](https://stackoverflow.com/a/37255956/16647496) solves your problem. i.e. `df %>% mutate(CumSum = rollapplyr(a, b, sum, partial = TRUE))` – benson23 Apr 13 '23 at 12:23
  • Hi @jpsmith, 8, 7 6 are nothing but the sum of past 3 rows of column a starting from the current row. 8 = 1 + 4 + 3 – Alisha Dcosta Apr 13 '23 at 12:25

1 Answers1

0

You can utilise data.table::frollsum()'s adaptive argument to construct a partial rolling sum together with some code I've written to construct the partial window.

# Constructs moving window sizes
window_seq <- function (k, n, partial = TRUE){
  stopifnot(length(k) == 1L, length(n) == 1L)
  if (n > .Machine[["integer.max"]]) {
    stop("n must not be greater than .Machine$integer.max")
  }
  n <- as.integer(n)
  k <- as.integer(k)
  k <- min(k, n)
  k <- max(k, 0L)
  pk <- max(k - 1L, 0L)
  p_seq <- seq_len(pk)
  out <- rep_len(k, n)
  if (partial) {
    out[p_seq] <- p_seq
  }
  else {
    out[p_seq] <- NA_integer_
  }
  out
}

partial_sum <- function(x, n, ...){
  data.table::frollsum(x, n = window_seq(n, length(x)), adaptive = TRUE, ...)
}

df1 <- structure(list(a = c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), b = c(3L, 
                                                                   3L, 3L, 3L, 3L, 3L, 3L, 3L)), class = "data.frame", row.names = c(NA, 
                                                                                                                                     -8L))

partial_sum(df1$a, df1$b[1])
#> [1] 1 3 6 9 8 7 6 9

Created on 2023-04-13 with reprex v2.0.2

NicChr
  • 858
  • 1
  • 9