I am trying to write a function that will spit out model diagnostic plots.
to_plot <- function(df, model, response_variable, indep_variable) {
  resp_plot <-
    df %>%
    mutate(model_resp = predict.glm(model, df, type = 'response')) %>%
    group_by(indep_variable) %>%
    summarize(actual_response = mean(response_variable),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes(indep_variable)) +
    geom_line(aes(x = indep_variable, y = actual_response, colour = "actual")) +
    geom_line(aes(x = indep_variable, y = predicted_response, colour = "predicted")) +
    ylab(label = 'Response')
}
When I run this over a dataset, dplyr throws an error that I don't understand:
fit <- glm(data = mtcars, mpg ~ wt + qsec + am, family = gaussian(link = 'identity'))
to_plot(mtcars, fit, mpg, wt)
Error in grouped_df_impl(data, unname(vars), drop) :
Column `indep_variable` is unknown
Based on some crude debugging, I found that the error happens in the group_by step, so it could be related to how I'm passing the column names into the function and referring to them inside it. Thanks!
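
For reference, here is a minimal sketch of one way this kind of error is commonly resolved. The likely cause is that `group_by(indep_variable)` looks for a column literally named `indep_variable` instead of the column passed in (`wt`). The sketch assumes dplyr with rlang 0.4.0 or later and ggplot2 3.0.0 or later, so that the embrace operator `{{ }}` is available in both `dplyr` verbs and `aes()`. It also returns the plot directly rather than assigning it to `resp_plot`, and calls the generic `predict()` instead of `predict.glm()`. Both of those are my own changes, not part of the original function.

```r
library(dplyr)
library(ggplot2)

# Sketch under the assumptions above: wrap each column argument in {{ }}
# so dplyr and ggplot2 resolve it to the column the caller supplied.
to_plot <- function(df, model, response_variable, indep_variable) {
  df %>%
    mutate(model_resp = predict(model, newdata = df, type = "response")) %>%
    group_by({{ indep_variable }}) %>%
    summarize(actual_response = mean({{ response_variable }}),
              predicted_response = mean(model_resp)) %>%
    ggplot(aes(x = {{ indep_variable }})) +
    geom_line(aes(y = actual_response, colour = "actual")) +
    geom_line(aes(y = predicted_response, colour = "predicted")) +
    ylab("Response")
}

fit <- glm(mpg ~ wt + qsec + am, data = mtcars,
           family = gaussian(link = "identity"))

# Column names are passed unquoted, as in the original call.
p <- to_plot(mtcars, fit, mpg, wt)
```

Printing `p` (or calling `to_plot(...)` at the console) should draw the two lines. On older dplyr versions without `{{ }}`, the usual equivalent is `enquo()` in the function body together with `!!` at each point of use.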