I am trying to use parsnip to specify a recipe to fit an xgboost Poisson regression model with a log offset. To set up a Poisson regression I can specify an option in set_engine, which works nicely:
library(tidymodels)

# Specify recipe
my_recipe <- recipe(Count ~ ., data = training_df) %>%
  # Remove predictors whose absolute pairwise correlation exceeds 0.8
  step_corr(all_predictors(), threshold = 0.8) %>%
  step_center(all_predictors(), -all_outcomes()) %>%
  step_scale(all_predictors(), -all_outcomes())
# Specify xgboost config
tune_spec <- boost_tree(trees = 100) %>%
  set_engine("xgboost", objective = "count:poisson") %>%
  set_mode("regression") %>%
  translate()
Looking at the documentation for xgboost and this example here, it seems that the following approach is recommended for specifying an offset:
setinfo(xgtrain, "base_margin", log(training_df$my_offset))
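For context, here is a minimal sketch of how that pattern seems to work with the native xgboost API, outside of parsnip. It assumes xgtrain is an xgb.DMatrix built from the predictor columns of the data frame; the simulated data and the column names x1 and x2 are made up for illustration, and only Count and my_offset correspond to names used above.

```r
library(xgboost)

# Simulated stand-in for training_df (hypothetical data, illustration only)
set.seed(1)
n <- 200
training_df <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  my_offset = runif(n, min = 1, max = 10)
)
training_df$Count <- rpois(n, lambda = training_df$my_offset * exp(0.3 * training_df$x1))

# xgtrain is a DMatrix built from the predictor matrix and the outcome
x <- as.matrix(training_df[, c("x1", "x2")])
xgtrain <- xgb.DMatrix(data = x, label = training_df$Count)

# Attach the log offset as the base margin (one value per row)
setinfo(xgtrain, "base_margin", log(training_df$my_offset))

# Fit with the Poisson objective; nrounds mirrors trees = 100 above
fit <- xgb.train(
  params = list(objective = "count:poisson"),
  data = xgtrain,
  nrounds = 100
)
```

As far as I can tell, the base margin is stored on the DMatrix rather than in the fitted model, so a DMatrix used for prediction would need its own base_margin set in the same way.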
I'm not sure how to incorporate this into the set_engine call above. Specifically, I'm not sure how xgtrain relates to the data frame training_df.