1

I have Date (column B) and Total (column A) variables - how can I create a new variable in R that sums the previous seven days' worth of Totals?

In Excel, I have the following formula:

=SUMIFS($A:$A,$B:$B, ">="&$B20-7,$B:$B,"<"&$B20)

and I just don't know how to convert this to work in R. Suggestions?

pnuts
  • 58,317
  • 11
  • 87
  • 139
user4100980
  • 127
  • 1
  • 1
  • 13
  • I'm not familiar with Excel, but I think you might be looking for the `rollsum` function in the `zoo` package. – Alex Woolford May 31 '15 at 13:04
  • @AlexWoolford: good thought, but that will only work if their data has exactly one observation for every day. `zoo::window` might be more appropriate. – Joshua Ulrich May 31 '15 at 13:19
  • The data does only contain one observation for each day. I'm looking at the documentation for `rollsum` though, and I'm still not clear on how to include the criteria to sum only the previous seven days. – user4100980 May 31 '15 at 14:58

2 Answers2

2

This will do it too, advanced, but short - essentially a one-liner.

# Initialze some data
date <- seq(as.Date("2001-01-01"),as.Date("2001-01-31"),"days")
tot <- trunc(rnorm(31,100,20))
df <- data.frame(date,tot)

# Now compute week sum by summing a subsetted df for each date
df$wktot <- sapply(df$date,function(x)sum(df[difftime(df$date,x,,"days") %in% 0:-6,]$tot))

Changed the variable names to match the posed problem.

It also handles the data in any order and multiple entries per day.

Edited to add comments and make it fit in a window.

Mike Wise
  • 22,131
  • 8
  • 81
  • 104
  • Changed the variable names and fixed a typo. – Mike Wise May 31 '15 at 15:24
  • great answer. voted up – Pierre L May 31 '15 at 15:40
  • All in one line. I wrote one that groups the days, but it's great that it could be done so succinctly. Still trying to unpack your code :) – Pierre L May 31 '15 at 15:50
  • In practice it might be better to subset the vector instead of the whole dataframe. Would be faster for big dataframes.... Now that I think of it. – Mike Wise May 31 '15 at 15:54
  • I added a grouping function. It works, but very wordy. If I could incorporate `difftime` somehow, I might be able to shorten. And yours doesn't return NAs for groups less than the rolling sum group. It just adds the days up to the ith element. I didn't consider that. – Pierre L May 31 '15 at 15:57
  • Note also that this answer is far more flexible than the "duplicate" link and requires no extra libraries. – Mike Wise Nov 10 '17 at 10:56
1

If there is one total per day, this function may help:

rollSums <- function(totals, roll) {
  res <- c()
  for(i in 1:(length(totals)-roll)) {
    res <- c(res, sum(totals[0:(roll-1)+i]))
  }
  res
}

df1
   Total       Date
1      3 2015-01-01
2      8 2015-01-01
3      4 2015-01-02
4      7 2015-01-03
5      6 2015-01-04
6      1 2015-01-04
7     10 2015-01-05
8      9 2015-01-06
9      2 2015-01-07
10     5 2015-01-08

rollSums(df1$Total, 3)
[1] 15 19 17 14 17 20 21

rollSums(df1$Total, 4)
[1] 22 25 18 24 26 22

It will take two arguments, the vector with the totals and how many days you'd like in each sum.

Data

dput(df1)
structure(list(Total = c(3L, 8L, 4L, 7L, 6L, 1L, 10L, 9L, 2L, 
5L), Date = structure(c(16436, 16436, 16437, 16438, 16439, 16439, 
16440, 16441, 16442, 16443), class = "Date")), .Names = c("Total", 
"Date"), row.names = c(NA, -10L), class = "data.frame")

Update

In case you run into a situation with multiple values on the same day, here's a solution. Surprisingly, @MikeWise has a one-liner that can do all of this. See other answer.

grouped.roll <- function(DF, Values, Group, roll) {
  totals <- eval(substitute(with(DF, tapply(Values, Group, sum))))
  newsums <- rollSums(totals, roll)
  data.frame(Group=names(totals), Sums=c(rep(NA, roll), newsums))
}

It uses the rollSums that I used earlier. It will spit out NAs until the desired day grouping begins. That may be the only advantage over the other answer. But they could easily edit that in, I'm sure. Just providing more options for reference.

grouped.roll(df1, Total, Date, 3)
       Group Sums
1 2015-01-01   NA
2 2015-01-02   NA
3 2015-01-03   NA
4 2015-01-04   22
5 2015-01-05   18
6 2015-01-06   24
7 2015-01-07   26
8 2015-01-08   21
Pierre L
  • 28,203
  • 6
  • 47
  • 69