I have the following R code:
library(lubridate)
library(vars)
library(tidyverse)
library(Metrics)
library(quantmod)
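## vars supplies VAR()/predict(), Metrics supplies mse(), lubridate handles the dates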
df <- read.csv("oil-data.csv")
## horizon is the number of months in advance we're trying to predict
horizon <- 3
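## shift each observation's date forward by horizon months to get the date each forecast refers to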
forecast_date <- ymd(df$date)
month(forecast_date) <- month(forecast_date) + horizon
forecast <- data.frame(
  forecastDate = forecast_date,
  actual = df$actual
)
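## start the out-of-sample exercise once the first 72 observations are available for estimation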
varStart <- 72
VarData <- df[, c("delta_production", "rea", "wti", "delta_inventories")]
## wti is the oil price deflated by US CPI; model it in logs
VarData$wti <- log(VarData$wti)
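## preallocate storage for the recursive forecasts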
var_forecast <- rep(NA, nrow(df) - varStart + 1)
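## expanding window: at each step, refit the VAR on rows 1:i and forecast horizon months ahead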
for (i in varStart:nrow(df)) {
  ## lag order p = 12 taken from Kilian and Murphy (2013)
  varModel <- VAR(VarData[1:i, ], p = 12, type = "const")
  temp_var_forecast <- predict(varModel, n.ahead = horizon)
  ## back-transform the point forecast for log(wti); [1, "fcst"] is the 1-step-ahead value
  oil_h <- exp(temp_var_forecast$fcst$wti[1, "fcst"])
  var_forecast[i - varStart + 1] <- oil_h
}
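## keep only the rows for which forecasts were produced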
forecast <- forecast[varStart:nrow(df), ]
forecast$predicted <- var_forecast
forecast$error <- mse(forecast$actual, forecast$predicted)
print(min(forecast$error))
print(mean(forecast$error))
print(max(forecast$error))
When I run this, the resulting forecast$error is identical for every row (~40.23398). This seems odd to me, because at each iteration I am fitting a new VAR model on a (slightly) different data set, so I would expect the errors to differ across rows.
Why is this? Am I doing something wrong here?
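To rule out a printing artifact, I also checked the column directly (a minimal check on the forecast data frame built above):
length(unique(forecast$error))  ## returns 1 -- every row holds the same number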
I am trying to replicate model 2.1 from this paper. I can provide more code/data if needed.
The oil-data.csv file can be downloaded here to reproduce the example.
Currently, the min, mean, and max are all equal to 40.22983.