My task is to create many models, choose model that predict best and pass data to this model for prediction. Example inspired from R for data science book
library(modelr)
library(tidyverse)
library(gapminder)
gapminder
country_model1 <- function(df) {lm(lifeExp ~ year, data = df)}
country_model2 <- function(df) {lm(lifeExp ~ year+gdpPercap, data = df)}
country_model3 <- function(df) {lm(lifeExp ~ year+gdpPercap+pop, data = df)}
by_country <- gapminder %>%
group_by(country, continent) %>%
nest() %>%
mutate(model1 = map(data, country_model1),
model2 = map(data, country_model2),
model3 = map(data, country_model3))
So I have 3 models for each country. I can find r squared for each model, but stopped here :(
r_sq <- by_country %>%
mutate(glance1 = map(model1, broom::glance),
glance2 = map(model2, broom::glance),
glance3 = map(model3, broom::glance)) %>%
unnest(glance1:glance3, .drop = TRUE) %>%
select(country, continent, starts_with('r.sq'))
How to in tidy way:
- select which of 3 make better prediction for each particular country?
- pass new data to chosen model and have prediction back?