I am trying to write a code that creates a new column with moving averages based on 'year' where the number of rows for each year is variable where each year only has one unique value being repeated for each row of that year. I want to calculate moving averages based these unique values independent of the number of rows per year.
Just a FYI; I'm very new to R and programming so if I missed something for you to understand my problem please let me know.
For example, the type of data I'm working with looks like this:
df <- data.frame(year = c(1702, 1702, 1702, 1702, 1702, 1703, 1703, 1703, 1704, 1704, 1705, 1705, 1705, 1705, 1706, 1706, 1707, 1707, 1707, 1708, 1708, 1708, 1708, 1708, 1709, 1709, 1709, 1709, 1709), avgtemp = c(5.3, 5.3, 5.3, 5.3, 5.3, 3.9, 3.9, 3.9, 6.12, 6.12, 4.16, 4.16, 4.16, 4.16, 5.65, 5.65, 3.11, 3.11, 3.11, 5.17, 5.17, 5.17, 5.17, 5.17, 4.75, 4.75, 4.75, 4.75, 4.75))
I found this post, Moving Average by Unique Date with multiple observations per date, and tried the solution offered there by Mark Peterson but it doesnt work for me.
I've tried the following code.
rolledavg <-
df %>%
group_by(year) %>%
summarise(rollavg = mean(avgtemp)) %>%
ungroup() %>%
arrange(year) %>%
mutate( ma3temp = rollapply(rollavg
, 3
, mean
, align= "right"
, partial=T
, fill = NA))
I get the following error: "Error in order(year) : argument 1 is not a vector".
The expected output should be something like this:
I would appreciate any help I can get. Don't mind working with other packages/solutions than the one offered above.