I am trying to compute one-step-ahead forecasts using the so-called MIDAS (mixed-data sampling) approach, in which the forecast of a low-frequency variable depends on higher-frequency data. For example, the dependent variable y could be recorded yearly and explained by an independent variable x that is sampled quarterly.
There is a package called midasr that offers many functions for this. I can compute the one-step-ahead forecasts with its function select_and_forecast as follows (using simulated data, a simplified version of the example from the midasr user's guide):
Generation of the data:
library(midasr)
set.seed(1001)
n <- 250
trend <- c(1:n)
x <- rnorm(4 * n)
z <- rnorm(12 * n)
fn.x <- nealmon(p = c(1, -0.5), d = 8)
y <- 2 + 0.1 * trend + mls(x, 0:7, 4) %*% fn.x + rnorm(n)
Calculation of the forecasts (the out-of-sample forecast horizon is controlled by the argument outsample, so in my example the forecasts cover observations 240 to 250):
select_and_forecast(y~trend+mls(y,1,1,"*")+mls(x,0,4),
from=list(x=c(4)),
to=list(x=rbind(c(14,19))),
insample=1:250,outsample=240:250,
weights=list(x=c("nealmon","almonp")),
wstart=list(nealmon=rep(1,3),almonp=rep(1,3)),
IC="AIC",
seltype="restricted",
ftype="recursive",
measures=c("MSE"),
fweights=c("EW","BICW")
)$forecasts[[1]]$forecast
What I would like to do now is to simulate a situation where a new value of the higher-frequency variable becomes available, because, for example, a new month has passed and the value for this month can be used in the model. I would proceed as follows, but am very unsure if it is correct:
select_and_forecast(y~trend+mls(y,1,1,"*")+mls(x,0,4),
from=list(x=c(3)), # the only change: the lower bound of the regressor's lag range is reduced from 4 to 3
to=list(x=rbind(c(14,19))),
insample=1:250,outsample=240:250,
weights=list(x=c("nealmon","almonp")),
wstart=list(nealmon=rep(1,3),almonp=rep(1,3)),
IC="AIC",
seltype="restricted",
ftype="recursive",
measures=c("MSE"),
fweights=c("EW","BICW")
)$forecasts[[1]]$forecast
In theory, one includes the new observation of the higher-frequency variable by reducing the lag index, but I don't know whether using the function this way is correct.
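To make my reasoning about the lag index concrete, here is a minimal base-R sketch of how I understand the alignment. The helper hf_lag_matrix is hypothetical (I wrote it for illustration, it is not part of midasr), and the convention that lag 0 is the last high-frequency observation of low-frequency period t is my assumption about how mls aligns the data.

```r
# Hypothetical helper (not part of midasr): row t holds the high-frequency
# observations x[m * t - k] for each requested lag k.
# Assumption: lag 0 is the last high-frequency observation of period t.
hf_lag_matrix <- function(x, lags, m) {
  stopifnot(length(x) %% m == 0)
  n <- length(x) %/% m
  out <- matrix(NA_real_, nrow = n, ncol = length(lags),
                dimnames = list(NULL, paste0("lag", lags)))
  for (j in seq_along(lags)) {
    idx <- m * seq_len(n) - lags[j]  # position in x for each low-frequency period
    ok <- idx >= 1                   # positions before the sample start stay NA
    out[ok, j] <- x[idx[ok]]
  }
  out
}

# Three years of quarterly data, labelled 1..12 so positions are easy to read.
x_demo <- 1:12
hf_lag_matrix(x_demo, lags = 4:5, m = 4)  # most recent value: Q4 of the previous year
hf_lag_matrix(x_demo, lags = 3:5, m = 4)  # additionally uses Q1 of the current year
```

Under this assumed convention, lowering the smallest lag from 4 to 3 adds exactly one newer quarterly observation (the first quarter of the current year) to each row, which is the effect I am trying to achieve with the from argument.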
This question is for someone who is familiar with the package. Could someone comment on this?
The formula I have in mind is:
y_t=\beta_0 + \beta_1B(L^{1/m};\theta)x_{t-h+1/m}^{(m)} + \epsilon_t^{(m)}
with h=1 in my case, and adding 1/m to include one new high-frequency observation.
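Written out for my setting, this is how I read the formula. I am assuming (this is not stated above) that the lag polynomial has the usual form B(L^{1/m};\theta)=\sum_{k=0}^{K} b(k;\theta)L^{k/m} with L^{k/m}x_s^{(m)}=x_{s-k/m}^{(m)}, where K is a hypothetical maximum lag. With m=4 (quarterly x, yearly y) and h=1:

```latex
\begin{align*}
y_t &= \beta_0 + \beta_1 \sum_{k=0}^{K} b(k;\theta)\, x^{(4)}_{t-1+1/4-k/4} + \epsilon_t^{(4)} \\
    &= \beta_0 + \beta_1 \sum_{k=0}^{K} b(k;\theta)\, x^{(4)}_{t-(k+3)/4} + \epsilon_t^{(4)}
\end{align*}
```

So the most recent regressor is x_{t-3/4}, which lies three quarters before the end of year t. Without the 1/m term the sum would start at x_{t-1}, that is, four quarters back. If this reading is right, it corresponds to lowering the smallest lag from 4 to 3.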