0

This question comes from a previous one I posted a while ago:

rollsum with fixed dates

I can not make the given solution to work. I have a large data set, the interesting columns are:

id = c(145658, 145658, 145658, 145658, 145658, 145658, 145658, 145658, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)

week_number = c(24, 35, 44, 71, 82, 117, 127, 142, 4, 15, 20, 24, 30, 36, 42, 46, 59, 67, 68, 71, 75, 78, 79, 86, 93, 96)

amount = c(51.9, 51.9, 51.9, 51.9, 51.9, 103.8, 51.9, 51.9, 67.9, 67.9, 67.9, 67.9, 67.9, 67.9, 67.9, 67.9, 67.9, 67.9, 101.0, 168.9, 101.0, 101.0, 135.8, 168.9, 168.9, 67.9)

df = data.frame(id = id, week_number = week_number, amount = amount)

In reality, I have thousands of id's, and each has different week number. I want to calculate the rollsum on the "amount" column for n past weeks (including the present week) for each id.

An extreme example would be with the past 100 weeks. The results would look like:

past_100wk = c(NA, NA, NA, NA, NA, 363.3, 363.3, 363.8, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)

Again, this is an extreme case, but it shows the the results should give NA (or -1) when the row value is not included in the week_number window (100 weeks, in this case).

Thank you!

Community
  • 1
  • 1
Andres
  • 281
  • 2
  • 13
  • Your tags don't seem to fit the content. Maybe you meant to tag with [tag:zoo] instead of those? – Frank Jan 11 '16 at 20:47
  • @Frank Thanks! I added zoo, but if I am not mistaken, zoo accepts dates. I not sure, that is why I left it out. – Andres Jan 11 '16 at 20:52
  • You are asking about *adaptive moving sum*, adaptive because the moving period is non constant but comes from a vector parameter. You can find benchmark of few *adaptive moving functions* in [this SO](http://stackoverflow.com/q/21368245/2490497), keep in mind that Rcpp answer currently with the highest number of votes do not address the question. – jangorecki Jan 11 '16 at 22:23
  • If I misunderstood your question and the moving window width is a constant scalar parameter, then check `TTR::runSum`. – jangorecki Jan 11 '16 at 22:27
  • @jangorecki ok, thanks! Will check that out. – Andres Jan 12 '16 at 05:41
  • @jangorecki Yes, it is constant (either 10, 12...). The only thing is that the week_number variable is not the same for each user. – Andres Jan 12 '16 at 06:42
  • I found this post that applies to my case: http://stackoverflow.com/questions/19546182/r-sum-by-based-on-date-range. However, if my window period is larger than, say, the first two rows (in my example, week 24 and 35), it will give the sum of those two rows anyway, i.e., if the window is 40 weeks, for week 35 the sum should be NA (or -1 or 0), since there are not 40 weeks backwards, just 35. How could I achieve this? – Andres Jan 12 '16 at 09:01
  • @Andres I would try to change `na.rm` in the mentioned solution. The `runSum` seems to handle that as you expect: `TTR::runSum(1:5, 3) #[1] NA NA 6 9 12` – jangorecki Jan 12 '16 at 13:18

0 Answers0