I am calculating a rate that depends partly on a rolling sum of days, using dplyr with group_by/summarise/mutate operations.
However, the increments of the rolling sum of days differ by group. Ideally, I have a measurement every 30 days for a year, but sometimes the measurements come 60 or 90 days apart.
For example:
df <- data.frame( ID = "Subject A",
cumulative_days = c(30, 60, 90, 180, 270, 360),
rolling_percent = c(.8, .6, .6, .4, .3, .2))
I want to turn this group into something like:
result <- data.frame(ID = "Subject A",
month = seq(1,12),
rolling_percent = c(.8, .6, .6, NA, NA, .4, NA, NA, .3, NA, NA, .2))
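One possible way to get from df to that shape is sketched below. It assumes every cumulative_days value is an exact multiple of 30, so that dividing by 30 gives the month number, and it uses tidyr::complete() to add the missing months as NA rows. The name expanded is only illustrative, and this is a sketch under those assumptions rather than a definitive solution.

```r
library(dplyr)
library(tidyr)

df <- data.frame(ID = "Subject A",
                 cumulative_days = c(30, 60, 90, 180, 270, 360),
                 rolling_percent = c(.8, .6, .6, .4, .3, .2))

expanded <- df %>%
  # Assumption: measurements always land on multiples of 30 days
  mutate(month = as.integer(round(cumulative_days / 30))) %>%
  select(-cumulative_days) %>%
  # Add a row for every ID/month combination; missing months get NA
  complete(ID, month = 1:12)

print(expanded)
```

Because ID is passed to complete(), the same call should cover several subjects at once, each expanded to months 1 through 12.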
If I can reach the 'result' dataframe above, my plan is to use the dplyr/zoo solution described here: fill in NA based on the last non-NA value for each group in R
That would let me fill in the NAs with the last non-NA observation within each group.
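For reference, the fill step I have in mind would look roughly like this. It is a minimal sketch that assumes the zoo package is installed and uses zoo::na.locf() inside a grouped mutate; the name filled is only illustrative.

```r
library(dplyr)
library(zoo)

# The target shape from above, with NA for months that have no measurement
result <- data.frame(ID = "Subject A",
                     month = seq(1, 12),
                     rolling_percent = c(.8, .6, .6, NA, NA, .4, NA, NA, .3, NA, NA, .2))

filled <- result %>%
  group_by(ID) %>%
  arrange(month, .by_group = TRUE) %>%
  # Carry the last non-NA value forward; na.rm = FALSE keeps any leading NAs
  mutate(rolling_percent = zoo::na.locf(rolling_percent, na.rm = FALSE)) %>%
  ungroup()

print(filled)
```

tidyr::fill(rolling_percent, .direction = "down") on the grouped data should behave the same way if I want to avoid the zoo dependency.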
In other words, I want to turn N observations whose days cumulatively sum to 360 into 12 monthly observations. At that point, I believe I can apply the linked solution to solve my problem.
I'm struggling to clearly describe this situation, so any advice on clarifying my problem would be appreciated.