I use speedglm to fit a GLM to data. When I call the function directly, the code works as expected, but when I wrap the call in my own function, I get an error that an object is not found.
The variable (w in the example below) clearly exists in the scope of my function, but it seems to be evaluated only later, inside speedglm, where w is no longer available (or so I think). This is where I start questioning my current understanding of R.
Did I make an error while creating the function, does speedglm use some unusual trick to scope the variable (source code here) that breaks the normal (?) logic, or do I misunderstand how R functions work?
I am trying to understand this behavior and also fix my train_glm function so that it works with speedglm and weights.
MWE
library(speedglm)
# works as expected
m1 <- speedglm(wt ~ cyl, data = mtcars, weights = mtcars$wt)
# define a small helper function that just forwards its arguments
train_glm <- function(f, d, w) {
  speedglm(formula = f, data = d, weights = w)
}
# does not work
m <- train_glm(wt ~ cyl, d = mtcars, w = mtcars$wt)
#> Error in eval(extras, data, env) : object 'w' not found
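As a cross-check, here is a minimal sketch of the same wrapper pattern using base R's glm instead of speedglm. The helper name train_base_glm is made up for illustration. The assumption being tested is that the weights are looked up through the formula's environment (here the top-level one) rather than in the wrapper's own frame. If that is what happens, the captured message should complain about 'w' in the same way, which would suggest the behavior is not specific to speedglm.

```r
# Hypothetical helper: same shape as train_glm, but calling stats::glm.
train_base_glm <- function(f, d, w) {
  glm(formula = f, data = d, weights = w)
}

# The formula is created outside the wrapper, so it carries the
# environment it was created in (the global one at top level).
f <- wt ~ cyl
identical(environment(f), globalenv())  # TRUE when run at top level

# Capture the error message instead of stopping the script.
res <- tryCatch(
  train_base_glm(f, d = mtcars, w = mtcars$wt),
  error = function(e) conditionMessage(e)
)
print(res)
```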
Even weirder, when I vary the code, I find the following:
# removing the weights as a base case -> WORKS
train_glm3 <- function(f, d) {
  speedglm(formula = f, data = d)
}
m3 <- train_glm3(wt ~ cyl, d = mtcars)
# works
# hardcoding the weights inside the function -> BREAKS
train_glm4 <- function(f, d) {
  speedglm(formula = f, data = d, weights = d$wt)
}
m4 <- train_glm4(wt ~ cyl, d = mtcars)
#> Error in eval(extras, data, env) : object 'd' not found
# creating a new dataset and hardcoding the weights inside the function
# but using the name of the dataset at the highest environment -> WORKS
train_glm5 <- function(f, d) {
  speedglm(formula = f, data = d, weights = mtcars2$wt)
}
mtcars2 <- mtcars
m5 <- train_glm5(wt ~ cyl, d = mtcars2)
# works
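The pattern in these variants (a name defined at top level works, a name that exists only inside the wrapper does not) would fit the weights expression being evaluated in the data and then in the formula's environment. Under that assumption, one possible workaround is to re-point the formula at the wrapper's frame before fitting. The sketch below uses base R's glm so that it runs without extra packages. The helper name train_glm_env is made up, and whether swapping in speedglm behaves the same way is an assumption I have not confirmed.

```r
# Hypothetical helper: make the wrapper's frame the formula's environment,
# so that a lookup via environment(f) can see `w`.
train_glm_env <- function(f, d, w) {
  # Assumption: the fitting function resolves `weights` through environment(f).
  environment(f) <- environment()
  glm(formula = f, data = d, weights = w)
}

# Direct call versus wrapped call on the same data and weights.
m_direct  <- glm(wt ~ cyl, data = mtcars, weights = mtcars$wt)
m_wrapped <- train_glm_env(wt ~ cyl, d = mtcars, w = mtcars$wt)

# Compare the fitted coefficients of the two models.
print(all.equal(coef(m_direct), coef(m_wrapped)))
```

One side effect of this choice is that the fitted model's formula then holds a reference to the wrapper's frame, which keeps d and w alive for as long as the model object exists.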