My reproducible example follows.

Please ignore the underlying meaning of the calculations (there is none); this is just an extract of my real dataset:

train <- structure(list(no2 = c(25.5, 31.2, 33.4, 29.9, 31.8),
                        vv_scal = c(1.3, 1.3, 0.8, 1.1, 0.9), 
                        temp = c(-0.7, -2, 1.5, 0.4, 1.1), 
                        prec = c(0, 11, 9, 3, 0), 
                        co = c(1.6, 2.9, 3.2, 2.6, 3)), 
                        row.names = c(NA, -5L), 
                        class = c("tbl_df", "tbl", "data.frame"))


test <- structure(list(no2 = c(41.6, 41.4, 46.6, 44.7, 43.2), 
                       vv_scal = c(1.2, 1.2, 1.2, 1, 1), 
                       temp = c(0.9, 1, 0.1, 1.6, 3.8), 
                       prec = c(0, 0, 0, 0, 0), 
                       co = c(4.3, 4.3, 4.9, 4.7, 4.5)), 
                       row.names = c(NA, -5L), 
                       class = c("tbl_df", "tbl", "data.frame"))
                       
                       

forest_ci <- function(B, train_df, test_df, var_rf){
  
  # Initialize a matrix to store the predicted values
  predictions <- matrix(nrow = B, ncol = nrow(test_df))
  
  # bootstrapping predictions
  for (b in 1:B) {
    
    # Fit a random forest model
    model <- randomForest::randomForest(var_rf~., data = train_df) # not working
    #model <- randomForest::randomForest(no2~., data = train_df)   # working
    
    # Store the predicted values from the resampled model
    predictions[b, ] <- predict(model, newdata = test_df)
    
  }
  
  predictions
  
}

predictions <- forest_ci(B=2, train_df=train, test_df=test, var_rf = no2)

I get the following error message:

Error in eval(predvars, data, env) : object 'no2' not found

I think the error has to do with the concepts of "non-standard evaluation" and "capturing expressions":

http://adv-r.had.co.nz/Computing-on-the-language.html

I followed the suggestions in several threads, including:

how do I pass a variable name to an argument in a function

Passing a variable name to a function in R

I have tried different combinations of substitute(), eval() and quote(), but without much success.

I know the subject has already been covered here, but I have not found a proper solution so far.

My objective is to pass a variable name as a function argument so that it is used as the response in the regression (and prediction) performed by the Random Forest model.

Thanks

maxbre

1 Answer

Try using ensym() and inject() from rlang:

forest_ci <- function(B, train_df, test_df, var_rf){
  
  # Capture the unquoted column name supplied as var_rf (e.g. no2) as a symbol
  y <- rlang::ensym(var_rf)
  
  # Initialize a matrix to store the predicted values
  predictions <- matrix(nrow = B, ncol = nrow(test_df))
  
  # bootstrapping predictions
  for (b in 1:B) {
    
    # Fit a random forest model
    # !!y splices the captured symbol into the formula before the call is evaluated
    model <- rlang::inject(randomForest::randomForest(!!y ~ ., data = train_df))
    # with var_rf = no2 this is equivalent to: randomForest::randomForest(no2 ~ ., data = train_df)
    
    # Store the predicted values from the resampled model
    predictions[b, ] <- predict(model, newdata = test_df)
    
  }
  
  predictions
  
}
langtang
  • Yes, it works, thank you! For me, there is still a lot to discover about this "new" subject. In fact, I was wondering about a base R solution; is there an equivalent? For example, why do combinations of eval() and substitute() apparently fail in this case? Can you point me to some good references to dig deeper into the subject? – maxbre Dec 11 '22 at 15:43
  • @maxbre: The base R equivalent would be `eval(bquote(…))`. You can use `.()` to force evaluation inside `bquote()`. – TimTeaFan Dec 11 '22 at 17:20
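
For illustration, here is a minimal sketch of the base R route suggested in the comment above: substitute() captures the unquoted argument as a symbol, and bquote() with .() splices it into the formula before eval() runs the call. The function name fit_with_response and the small data frames are hypothetical, and lm() is used only as a stand-in so the sketch runs without extra packages; the assumption is that the same pattern carries over to randomForest::randomForest().

```r
# Hypothetical helper: fit a model whose response column is passed unquoted.
fit_with_response <- function(train_df, test_df, var) {
  # substitute() returns the expression the caller typed for `var`, e.g. the symbol no2
  y <- substitute(var)

  # bquote() builds the call lm(no2 ~ ., data = train_df); .(y) inserts the symbol.
  # eval() then runs that call in this function's frame, where train_df is visible.
  model <- eval(bquote(lm(.(y) ~ ., data = train_df)))

  predict(model, newdata = test_df)
}

# Small illustrative data, loosely based on the question's extract
train <- data.frame(no2  = c(25.5, 31.2, 33.4, 29.9, 31.8),
                    temp = c(-0.7, -2, 1.5, 0.4, 1.1),
                    co   = c(1.6, 2.9, 3.2, 2.6, 3))
test <- data.frame(no2  = c(41.6, 41.4),
                   temp = c(0.9, 1),
                   co   = c(4.3, 4.3))

preds <- fit_with_response(train, test, no2)
```

The original attempt fails because a formula is not evaluated like an ordinary expression: in `var_rf ~ .` the name var_rf stays a literal symbol and is never replaced by the caller's no2, so the formula has to be constructed with the right symbol already in place.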