For some website data, I have a dataframe with a column of dates and a column of visitors. I wish to add some rolling aggregate columns based on the day of the week.
I'm trying to aggregate (mean, median, sum, count) by day of the week, so day x
would be grouped with dayx - 7
, day x - 14
... day x - 7*n
where n
is the number of weeks required in the window, where the min data in the data will go back further than 7*n
.
For example, if the last 5 Fridays saw visitor levels of 100, 110, 120, 130, 160
then then entry against the most recent friday for median value over 3 weeks would be 130
, and 136.67
for mean over past 3 Fridays.
Sample dataset in:
structure(list(visit_date = structure(1:20, .Label = c("01-01-16",
"01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16",
"01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16",
"01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16",
"01-20-16"), class = "factor"), visitors = c(114L, 158L, 153L,
157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L,
119L, 184L, 186L, 171L, 183L, 125L)), .Names = c("visit_date",
"visitors"), class = "data.frame", row.names = c(NA, -20L))
ideal output for sum()
newdf <- structure(list(visit_date = structure(1:20, .Label = c("01-01-16",
"01-02-16", "01-03-16", "01-04-16", "01-05-16", "01-06-16", "01-07-16",
"01-08-16", "01-09-16", "01-10-16", "01-11-16", "01-12-16", "01-13-16",
"01-14-16", "01-15-16", "01-16-16", "01-17-16", "01-18-16", "01-19-16",
"01-20-16"), class = "factor"), visitors = c(114L, 158L, 153L,
157L, 192L, 128L, 197L, 146L, 123L, 127L, 170L, 126L, 106L, 112L,
119L, 184L, 186L, 171L, 183L, 125L), sum_visitors = c(NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 379L, 465L, 466L,
498L, 501L, 359L)), .Names = c("visit_date", "visitors", "sum_visitors"
), class = "data.frame", row.names = c(NA, -20L))
I've looked at rollapply
but am unsure how to roll this out by row in the data frame.
Hope this makes sense, thanks in advance.