My question is similar to this one, but now I am trying to use a model with multiple predictors and I can't figure out how to get the newdata into the predict function.
library(dplyr)
library(lubridate)
library(purrr)
library(tidyr)
library(broom)
set.seed(1234)
First I create a seq of weeks
wks = seq(as.Date("2010-01-01"), Sys.Date(), by="1 week")
Then I grab the current year
cur_year <- year(Sys.Date())
Here I create a data frame with dummy data
my_data <- data.frame(
week_ending = wks
) %>%
mutate(
ref_period = week(week_ending),
yr = year(week_ending),
PCT.EXCELLENT = round(runif(length(wks), 0, 100),0),
PCT.GOOD = round(runif(length(wks), 0, 100),0),
PCT.FAIR = round(runif(length(wks), 0, 100),0),
PCT.POOR = round(runif(length(wks), 0, 100),0),
PCT.VERY.POOR = round(runif(length(wks), 0, 100),0),
pct_trend = round(runif(length(wks), 75, 125),0)
)
Next I create a nested dataframe that has the data for each week of the year as one group.
cond_model <- my_data %>%
filter(yr != cur_year) %>%
group_by(ref_period) %>%
nest(.key=cond_data)
Here I join this year's data back into the previous years' data by week of the year.
cond_model <- left_join(
cond_model,
my_data %>%
filter(yr==cur_year) %>%
select(week_ending,
ref_period,
PCT.EXCELLENT,
PCT.FAIR,
PCT.GOOD,
PCT.POOR,
PCT.VERY.POOR),
by = c("ref_period")
)
And this adds the linear model to the data frame for each week of the year
cond_model <-
cond_model %>%
mutate(model = map(cond_data,
~lm(pct_trend ~ PCT.EXCELLENT + PCT.GOOD + PCT.FAIR + PCT.POOR + PCT.VERY.POOR, data = .x)))
now I would like to use the model for each week to predict using this year's data. I tried the following:
cond_model <-
cond_model %>%
mutate(
pred_pct_trend = map2_dbl(model, PCT.EXCELLENT + PCT.GOOD + PCT.FAIR + PCT.POOR + PCT.VERY.POOR,
~predict(.x, newdata = data.frame(.y)))
)
That gives the following error:
Error in mutate_impl(.data, dots) : object 'PCT.EXCELLENT' not found
I then tried nesting my predictors in my data frame...
create a data frame with just this year's data and nest the predictors
cur_cond <- my_data %>%
filter(yr==cur_year) %>%
select(week_ending, PCT.EXCELLENT,
PCT.GOOD, PCT.FAIR, PCT.POOR, PCT.VERY.POOR) %>%
group_by(week_ending) %>%
nest(.key=new_data) %>%
mutate(new_data=map(new_data, ~data.frame(.x)))
join this into my main data frame
cond_model <- left_join(cond_model, cur_cond)
Now I try the prediction again:
cond_model <-
cond_model %>%
mutate(
pred_pct_trend = map2_dbl(model, new_data,
~predict(.x, newdata = data.frame(.y)))
)
I get the same error as before:
Error in mutate_impl(.data, dots) : object 'PCT.EXCELLENT' not found
I think that the answer could involve performing a flatten() on the predictors, but I can't figure out where that goes in my workflow.
cond_model$new_data[1]
vs.
flatten_df(cond_model$new_data[1])
and at this point I have run out of ideas.