I'm using dbplyr to access data in Redshift tables and Tidyverse to do the data wrangling. I'm trying to calculate a rolling sum amount over 7 days.
Unfortunately, none of the packages that I've found suggested on stack overflow, such as 'slider' and 'runner', seem to be compatible with dbplyr (or with the Redshift tables and sql that I'm using).
I'm trying to achieve similar results to this table. In this example, the value in the cum_sum_7Days
column is the sum of values in the amount
column where the start_date
fits between the start_date
and previous_7Day
columns.
Acct Start_Date Previous_7Day Amount Cum_sum_7Days
YYYY 8/07/2022 7:04 1/07/2022 7:04 500 500
YYYY 8/07/2022 12:49 1/07/2022 12:49 200 700
YYYY 9/07/2022 11:47 2/07/2022 11:47 300 1000
YYYY 9/07/2022 11:52 2/07/2022 11:52 45.6 1045.6
YYYY 12/07/2022 13:03 5/07/2022 13:03 200 1245.6
YYYY 15/07/2022 13:53 8/07/2022 13:53 200 745.6
YYYY 16/07/2022 12:58 9/07/2022 12:58 300 700
YYYY 16/07/2022 13:28 9/07/2022 13:28 500 1200
YYYY 19/07/2022 12:22 12/07/2022 12:22 200 1400
YYYY 23/07/2022 5:52 16/07/2022 5:52 200 1200
YYYY 26/07/2022 13:01 19/07/2022 13:01 100 300
YYYY 29/07/2022 13:50 22/07/2022 13:50 200 500
YYYY 30/07/2022 13:57 23/07/2022 13:57 300 600
YYYY 3/08/2022 6:17 27/07/2022 6:17 200 700
YYYY 5/08/2022 13:30 29/07/2022 13:30 200 900
YYYY 9/08/2022 13:44 2/08/2022 13:44 200 600
YYYY 12/08/2022 12:13 5/08/2022 12:13 200 600
Note that:
- My dates are not consecutive
- The date-time fields are required as the 7 days must be accurate to the hour-minute
- The rolling window size may be changed from 7 days (14 days, 1 year, etc.)
So, any solution need to handle this.