Problem
I get an error when running `predict()` in the tidymodels framework.
The error appears to be related to selecting variables in the recipe (see code below).
What I've tried
There are some related SO posts, such as this one, this one, or this one, but they seem to deal with different issues (such as manipulating the outcome in the recipe).
What I would like to understand is why my code throws the error in the first place.
Code that throws error
```r
library(tidyverse)
library(tidymodels)

data("mtcars")

d_train <- mtcars %>% slice(1:20)
d_test <- mtcars %>% slice(21:nrow(mtcars))

preds_chosen <- c("hp", "disp", "am")

rec1 <-
  recipe(~ ., data = d_train) %>%
  step_select(all_of(preds_chosen), mpg) %>%
  update_role(all_of(preds_chosen), new_role = "predictor") %>%
  update_role(mpg, new_role = "outcome")

model_lm <- linear_reg()

wf1 <-
  workflow() %>%
  add_model(model_lm) %>%
  add_recipe(rec1)

lm_fit1 <-
  wf1 %>%
  fit(d_train)

preds <-
  lm_fit1 %>%
  predict(d_test)
#> Error in `dplyr::select()`:
#> ! Can't subset columns that don't exist.
#> ✖ Column `mpg` doesn't exist.
```
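One way to narrow this down might be to take the workflow out of the picture and bake the prepped recipe directly. The sketch below rests on an assumption of mine that the error message itself does not confirm: at prediction time the workflow may hand only the predictor columns to the recipe, so `mpg` would be missing when `step_select()` runs. `try_bake()` is a hypothetical helper introduced only for this illustration.

```r
library(tidymodels)

data("mtcars")
d_train <- mtcars %>% slice(1:20)
d_test <- mtcars %>% slice(21:nrow(mtcars))
preds_chosen <- c("hp", "disp", "am")

# Same recipe definition as rec1
rec1 <-
  recipe(~ ., data = d_train) %>%
  step_select(all_of(preds_chosen), mpg) %>%
  update_role(all_of(preds_chosen), new_role = "predictor") %>%
  update_role(mpg, new_role = "outcome")

rec1_prepped <- prep(rec1, training = d_train)

# Hypothetical helper: bake and return the error object instead of stopping
try_bake <- function(new_data) {
  tryCatch(bake(rec1_prepped, new_data = new_data), error = function(e) e)
}

# New data that still contains the outcome: step_select() can find `mpg`
baked_with_outcome <- try_bake(d_test)

# Assumption: this mimics predict(), where the outcome column is not passed on
baked_without_outcome <- try_bake(d_test %>% select(-mpg))

names(baked_with_outcome)
inherits(baked_without_outcome, "error")
```

If the second call fails while the first succeeds, that would suggest the problem lies in naming the outcome inside `step_select()` rather than in the model or the workflow.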
Possible solution
If I change the recipe in either of the following ways, the whole code runs without an error:
```r
rec2 <- recipe(mpg ~ hp + disp + am, data = d_train)

rec3 <-
  recipe(mpg ~ ., data = d_train) %>%
  update_role(all_predictors(), new_role = "id") %>%
  update_role(all_of(preds_chosen), new_role = "predictor") %>%
  update_role(mpg, new_role = "outcome")
```
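For completeness, here is a minimal sketch of how both alternative recipes could be run through the same workflow steps and checked at prediction time. `fit_and_predict()` is a hypothetical helper that is not part of tidymodels. It only wraps the `workflow()`, `fit()`, and `predict()` calls used above.

```r
library(tidymodels)

data("mtcars")
d_train <- mtcars %>% slice(1:20)
d_test <- mtcars %>% slice(21:nrow(mtcars))
preds_chosen <- c("hp", "disp", "am")

# Alternative 1: roles come straight from the formula
rec2 <- recipe(mpg ~ hp + disp + am, data = d_train)

# Alternative 2: demote all predictors to "id", then promote the chosen ones
rec3 <-
  recipe(mpg ~ ., data = d_train) %>%
  update_role(all_predictors(), new_role = "id") %>%
  update_role(all_of(preds_chosen), new_role = "predictor") %>%
  update_role(mpg, new_role = "outcome")

# Hypothetical helper: fit a linear model with the given recipe and predict
fit_and_predict <- function(rec) {
  workflow() %>%
    add_model(linear_reg()) %>%
    add_recipe(rec) %>%
    fit(d_train) %>%
    predict(d_test)
}

preds2 <- fit_and_predict(rec2)
preds3 <- fit_and_predict(rec3)

# Each result should hold one prediction per test row in a `.pred` column
nrow(preds2) == nrow(d_test)
nrow(preds3) == nrow(d_test)
```

Neither recipe refers to `mpg` inside a step, which may be why prediction goes through here.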
SessionInfo
```r
sessionInfo()
#> R version 4.1.3 (2022-03-10)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur/Monterey 10.16
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.39 magrittr_2.0.3 rlang_1.0.2
#> [5] fastmap_1.1.0 fansi_1.0.3 stringr_1.4.0 styler_1.5.1
#> [9] highr_0.9 tools_4.1.3 xfun_0.30 utf8_1.2.2
#> [13] cli_3.3.0 withr_2.5.0 htmltools_0.5.2 ellipsis_0.3.2
#> [17] yaml_2.3.5 digest_0.6.29 tibble_3.1.7 lifecycle_1.0.1
#> [21] crayon_1.5.1 purrr_0.3.4 vctrs_0.4.1 fs_1.5.2
#> [25] glue_1.6.2 evaluate_0.15 rmarkdown_2.14 reprex_2.0.1
#> [29] stringi_1.7.6 compiler_4.1.3 pillar_1.7.0 backports_1.4.1
#> [33] pkgconfig_2.0.3
```
Created on 2022-05-21 by the reprex package (v2.0.1)