I'm trying to make predictions with my testing data using my finalized workflow. But whenever I try using the predict function, it gives me this error:
Error in `step_log()`:
! The following required column is missing from `new_data` in step 'log_79Q8u': shares.
The shares variable is present in my testing dataset.
Do I need to change my recipe and retune my model? This is for my final, and I really need to resolve this error. I would appreciate any advice!
My code for the recipe and the prediction is below:
# recipe
recipe_kc <- recipe(shares ~ ., data = articles_train) %>%
  step_log(shares) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_zv(all_predictors())
# selecting best model
best_workflow <- bt_tuned %>%
  extract_workflow_set_result("recipe3_bt") %>%
  select_best(metric = "rmse")  # select_best() ranks by a single metric
best_workflow
final_workflow <- bt_tuned %>%
  extract_workflow("recipe3_bt") %>%
  finalize_workflow(best_workflow)
final_fit <- fit(final_workflow, articles_train)
# using testing data
final_pred <- articles_test %>%
  select(shares) %>%
  bind_cols(predict(final_fit, new_data = articles_test)) %>%
  mutate(
    shares_log = log(shares),  # step_log() uses the natural log by default
    .pred_log = .pred,
    .pred = exp(.pred_log)     # so back-transform with exp(), not 10^
  ) %>%
  select(.pred, shares, shares_log, .pred_log)
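
From what I can tell from the recipes documentation, `predict()` on a workflow only hands the predictor columns to the recipe, so a step that acts on the outcome cannot find `shares` at prediction time. Below is a minimal sketch of one workaround I am considering: setting `skip = TRUE` on `step_log()` so it runs during training only. The toy data, the `toy_*` names, and the plain `linear_reg()` model are made up for illustration and are not my real data or my tuned boosted tree model.

```r
library(tidymodels)

set.seed(123)

# Hypothetical stand-in for the articles data (not the real dataset)
toy_articles <- tibble(
  n_tokens = runif(200, 100, 1000),
  num_imgs = rpois(200, 3),
  shares   = exp(rnorm(200, mean = 7, sd = 1))
)

toy_split <- initial_split(toy_articles)
toy_train <- training(toy_split)
toy_test  <- testing(toy_split)

# skip = TRUE: the outcome is logged when the recipe is prepped on the
# training data, but the step is skipped when new data is baked in predict()
toy_recipe <- recipe(shares ~ ., data = toy_train) %>%
  step_log(shares, skip = TRUE) %>%
  step_normalize(all_numeric_predictors()) %>%
  step_zv(all_predictors())

toy_fit <- workflow() %>%
  add_recipe(toy_recipe) %>%
  add_model(linear_reg()) %>%
  fit(data = toy_train)

# Predictions come back on the natural-log scale, so exp() undoes them
preds_skip <- toy_test %>%
  select(shares) %>%
  bind_cols(predict(toy_fit, new_data = toy_test)) %>%
  mutate(
    shares_log = log(shares),
    .pred_log  = .pred,
    .pred      = exp(.pred_log)
  )
```

If this is the right reading, the recipe would need `skip = TRUE` (or `shares` would need to be logged before splitting the data, leaving the recipe to handle predictors only), and the model would have to be refit with the changed recipe.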