I'm a bit of an R newbie and am a little stuck on how to run a correlation on time-series data where the second vector is much longer than the first and I want to use a rolling time window.
My data looks something like this:
set.seed(1)
# "Target sample" (this is always of known fixed length N, e.g. 20)
target <- data.frame(Date = seq(Sys.Date(), by = "1 day", length.out = 20), Measurement = rnorm(20))
# "Potential sample" (this is always much longer and of unknown length, e.g. 730 in this example)
potential <- data.frame(Date = seq(Sys.Date() - 1095, by = "1 day", length.out = 730), Measurement = rnorm(730))
What I would like to do is take a rolling window of size N (i.e. matching the size of the target sample), advancing the window by one day at a time, and then print two columns for each window:
WindowStartDate and the result of cor(target,potentialWindow)
So in pseudo-code (using the generated example above):
- Start at Sys.Date()-1095, take window size N values
- Print (or, probably better, put into a new data frame) Sys.Date()-1095 and the result of cor(target,potentialWindow)
- Roll forward +1 day to Sys.Date()-1094, take window size N values
- Print (or, probably better, put into a new data frame) Sys.Date()-1094 and the result of cor(target,potentialWindow)
- etc. etc.
N.B. The +1 day roll-forward step is a variable that could be tweaked depending on the desired overlap.
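
For reference, the pseudo-code above could be sketched in base R roughly as follows. This is only a minimal sketch under a few assumptions: `rolling_cor`, its `step` argument, and the output column name `Correlation` are hypothetical names introduced here, the correlation is taken between the two `Measurement` columns, and both data frames are assumed to be sorted by `Date` with no gaps.

```r
# Hypothetical helper (not from any package): correlate the fixed-length
# target against every window of the same length in the longer series.
rolling_cor <- function(target, potential, step = 1) {
  n <- nrow(target)
  stopifnot(nrow(potential) >= n)
  # Row index at which each full window starts; `step` controls the overlap.
  starts <- seq(1, nrow(potential) - n + 1, by = step)
  data.frame(
    WindowStartDate = potential$Date[starts],
    Correlation = vapply(
      starts,
      function(i) cor(target$Measurement, potential$Measurement[i:(i + n - 1)]),
      numeric(1)
    )
  )
}

# Example data in the same shape as above (one random value per row).
set.seed(1)
target <- data.frame(Date = seq(Sys.Date(), by = "1 day", length.out = 20),
                     Measurement = rnorm(20))
potential <- data.frame(Date = seq(Sys.Date() - 1095, by = "1 day", length.out = 730),
                        Measurement = rnorm(730))

result <- rolling_cor(target, potential)   # one row per window start date
head(result)
```

Calling `rolling_cor(target, potential, step = 7)` would roll forward a week at a time instead of a day. Windows that would run past the end of `potential` are simply not generated, so the last start date is N - 1 days before the final row.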