
I am trying to fit VAR models across multiple variables, so I wrote a function that prepares the data, runs vars::VAR(), and passes the result to vars::irf() to generate an impulse response (IRF) object from which I subsequently extract the relevant metrics. The function has an argument max_lag, which sets the maximum number of lags to consider when fitting the model.

Hence I define a function like this:

library(vars)
data(EuStockMarkets)
f1 <- function(max_lag = 6){
  
  vars::irf(vars::VAR(EuStockMarkets, lag.max = max_lag))
  
}

I then call the function:

f1(max_lag = 12)

which results in

Error in VAR(y = ysampled, lag.max = max_lag) : 
  object 'max_lag' not found

Note that none of the following solves the issue: calling f1() with the default argument, first assigning m <- vars::VAR(EuStockMarkets, lag.max = max_lag) and then passing m to irf(), or pre-loading the package.

This seems to be an issue specific to the vars-package, as both

f2 <- function(test = "test"){
  
  print(paste(test))
  
}

f2(test = "TEST")
[1] "TEST"

and

library(ggplot2)
library(dplyr)  # provides the starwars data set
f3 <- function(col = "red"){
  
  ggplot(starwars, aes(x = height, y = mass)) +
    geom_point(color = col)
  
}

f3(col = "blue")

[Figure: output of f3(col = "blue")]

work.

I also moved out of my comfort zone and tried to use get('max_lag') with the arguments parent.frame(), environment(), parent.env(environment()), or calling a pre-defined environment (which is also not found), following this answer.

Suggestions are welcome on both the specific fix and the general issue. I suspect something fundamental about R environments is going on here.

DaveArmstrong

1 Answer


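The problem isn't that VAR() can't find max_lag inside your function. The object returned by VAR() stores the call that created it, with lag.max = max_lag as an unevaluated name. When irf() later re-runs that call, it does so from outside f1(), where max_lag doesn't exist, so the error actually arises during irf(). You can solve this by building a list of arguments that includes lag.max = max_lag and then calling VAR() on that list with do.call(), so the stored call carries the value rather than the name. Here's an example: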

library(vars)
data(EuStockMarkets)
f1 <- function(max_lag = 6){
  # Build the argument list first so lag.max holds the value, not the name max_lag
  args <- list(y = EuStockMarkets,
               lag.max = max_lag)
  m <- do.call(vars::VAR, args)
  vars::irf(m)
}
res <- f1()
names(res$irf)
#> [1] "DAX"  "SMI"  "CAC"  "FTSE"

res$irf$DAX
#>            DAX      SMI      CAC     FTSE
#>  [1,] 32.05836 29.44197 19.20616 20.35056
#>  [2,] 31.60745 32.37784 19.13026 21.64319
#>  [3,] 30.63930 32.70562 18.66182 21.25947
#>  [4,] 29.73668 31.81143 17.25273 20.64492
#>  [5,] 28.81030 29.08016 15.49562 19.60930
#>  [6,] 26.72902 26.68544 13.52721 18.63569
#>  [7,] 26.30296 26.08855 13.49620 18.25619
#>  [8,] 25.95382 25.70311 13.40278 17.88850
#>  [9,] 25.74752 25.64168 13.36974 17.49341
#> [10,] 25.72072 25.92978 13.38728 17.23849
#> [11,] 25.67485 26.12543 13.41721 17.01522

Created on 2023-04-05 with reprex v2.0.2
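
To see the mechanism without the vars package, here is a minimal base-R sketch. The functions toy_fit() and toy_refit() are hypothetical stand-ins, not part of vars. They only assume that the fitter records its own call and that the post-processing step refits through update(), which is roughly what happens between VAR() and irf().

```r
# Hypothetical stand-in for VAR(): records the call that created the object.
toy_fit <- function(y, lag.max = 1) {
  structure(list(n = length(y), lag.max = lag.max, call = match.call()),
            class = "toy")
}

# Hypothetical stand-in for irf()/.boot(): refits on new data via update(),
# which re-evaluates the stored call from inside this function.
toy_refit <- function(fit) {
  ysampled <- rev(seq_len(fit$n))
  update(fit, y = ysampled)
}

# The stored call contains the name max_lag, which toy_refit() cannot see.
wrapper_direct <- function(max_lag = 6) {
  toy_refit(toy_fit(y = 1:10, lag.max = max_lag))
}

# do.call() evaluates the list first, so the stored call contains the number.
wrapper_docall <- function(max_lag = 6) {
  toy_refit(do.call(toy_fit, list(y = 1:10, lag.max = max_lag)))
}

# Capture the error message instead of stopping the script.
direct_msg <- tryCatch(wrapper_direct(12),
                       error = function(e) conditionMessage(e))
docall_fit <- wrapper_docall(12)

print(direct_msg)          # the error message mentions max_lag
print(docall_fit$lag.max)  # 12
```

The only difference between the two wrappers is what ends up in the stored call: a name that has to be looked up later, or a value that travels with the object.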

DaveArmstrong
  • Amazing, thank you! I must admit I still don't fully understand - wouldn't irf() take 'max_lag' from the original function environment? what's the difference with do.call()/args? – Nicolai Berk Apr 05 '23 at 15:06
  • What's happening is that `irf()` calls `.boot()`, and inside that function it tries to do `update(VAR, y=ysampled)`. When you specify `VAR(y, lag.max=max_lag)`, R doesn't replace `max_lag` with a number; `VAR()` just goes to the parent environment to find it. The problem is that the parent environment for `.boot()` is not the global environment, but the environment created by `irf()`, so `max_lag` isn't there. – DaveArmstrong Apr 05 '23 at 16:33
  • When you make a list of arguments, the list doesn't have a pointer to `max_lag`, it has the actual numerical value in it. So, when `irf()` calls `.boot()` and `.boot()` uses `update(VAR, ...)`, the numerical value of `lag.max` is already there, not a pointer to an object that cannot be found. – DaveArmstrong Apr 05 '23 at 16:35
  • Thank you! I did not know that nested functions exist in different environments, this is quite interesting! – Nicolai Berk Apr 06 '23 at 15:36
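
The "pointer versus value" distinction from the comments above can be checked by inspecting the recorded call directly. The sketch below again uses a hypothetical toy_fit() stand-in rather than vars::VAR(), and it also shows bquote() as an alternative to do.call() for substituting the value into the call. That alternative is not from the answer; it is a common base-R idiom, and whether it behaves the same way with vars::VAR() is an assumption not verified here.

```r
# Hypothetical stand-in for a fitter that records its own call.
toy_fit <- function(y, lag.max = 1) {
  structure(list(lag.max = lag.max, call = match.call()), class = "toy")
}

# Ordinary call: the recorded call keeps the name max_lag.
wrapper_symbol <- function(max_lag = 6) {
  toy_fit(y = 1:10, lag.max = max_lag)$call
}

# bquote(): .(max_lag) is replaced by its value before the call is evaluated,
# so the recorded call keeps the number instead of the name.
wrapper_bquote <- function(max_lag = 6) {
  eval(bquote(toy_fit(y = 1:10, lag.max = .(max_lag))))$call
}

call_symbol <- wrapper_symbol(12)
call_value  <- wrapper_bquote(12)

print(is.symbol(call_symbol$lag.max))   # TRUE: a name to be looked up later
print(is.numeric(call_value$lag.max))   # TRUE: the value itself
```

Compared with do.call(), the bquote() variant keeps the function name readable in the recorded call, which may make printed model summaries easier to follow.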