6

I want to calculate the rolling mean for all variables in column "sp". This is a sample of my data:

the_date    sp  wins
01-06--2012 1   305
02-06--2012 1   276
03-06--2012 1   184
04-06--2012 1   248
05-06--2012 1   243
06-06--2012 1   363
07-06--2012 1   272
01-06--2012 2   432
02-06--2012 2   369
03-06--2012 2   302
04-06--2012 2   347
05-06--2012 2   357
06-06--2012 2   331
07-06--2012 2   380
01-06--2012 3   1
02-06--2012 3   2
03-06--2012 3   3
04-06--2012 3   2
05-06--2012 3   0
06-06--2012 3   2
07-06--2012 3   0

What I want, is to have a column added to data, that gives the moving average over 3 days for each sp. So the following output is what I desire:

the_date    sp  wins    SMA_wins
01-06--2012 1   305     305.00
02-06--2012 1   276     290.50
03-06--2012 1   184     255.00
04-06--2012 1   248     236.00
05-06--2012 1   243     225.00
06-06--2012 1   363     284.67
07-06--2012 1   272     292.67
01-06--2012 2   432     432.00
02-06--2012 2   369     400.50
03-06--2012 2   302     367.67
04-06--2012 2   347     339.33
05-06--2012 2   357     335.33
06-06--2012 2   331     345.00
07-06--2012 2   380     356.00
01-06--2012 3   1       1.00
02-06--2012 3   2       1.50
03-06--2012 3   3       2.00
04-06--2012 3   2       2.33
05-06--2012 3   0       1.67
06-06--2012 3   2       1.33
07-06--2012 3   0       0.67

I am using rollapply.

df <- group_by(df, sp)
df_zoo <- zoo(df$wins, df$the_date) 
mutate(df, SMA_wins=rollapplyr(df_zoo, 3, mean,  align="right", partial=TRUE))

If I filter my data on a specific sp, it works perfectly.

How can I make this work when I group by sp?

Thanks

asiehh
  • 553
  • 12
  • 22

1 Answers1

9

You can do it like this:

library(dplyr)
library(zoo)

df %>% group_by(sp) %>%
       mutate(SMA_wins=rollapplyr(wins, 3, mean, partial=TRUE))

It looks like your use of df and df_zoo in your mutate call was messing things up.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
jeremycg
  • 24,657
  • 5
  • 63
  • 74
  • Thank you @jeremycg. It gives the correct result. However, it functions independently of "the_date" column. Basically it takes the samples sequentially. It works if my input data is sorted by date but what if it is not? – asiehh Nov 18 '15 at 00:52
  • Add in an `arrange(the_date)` – jeremycg Nov 18 '15 at 01:03
  • @jeremycg, do you know how to apply this to my question here? My situation may be different since the dates are spaced irregularly: https://stackoverflow.com/questions/50023898/compute-a-rolling-weighted-sum-by-group – Bobby Apr 25 '18 at 22:25
  • @Bobby by the time I've seen this you had an answer in the comments, so I reopen voted it – jeremycg Apr 26 '18 at 20:24
  • @jeremycg Is it possible to use rollapply to apply a function to group ? I have Q [here](https://stackoverflow.com/questions/65998470/apply-a-custom-function-on-grouped-dataframe-n-rows-at-a-time) can you have a look at it ? – user5249203 Feb 01 '21 at 20:35