I'm performing predictive analysis where I train a model on one portion of my data and test it on the remaining portion. I'm familiar with the mice package and its imputation procedure using predictive mean matching.
My understanding is that the proper way to utilize imputation is to create numerous imputed data sets, fit a model to each of those imputed data sets, then combine the coefficients across all of those fitted models into one single model. I know how to do this and view the summary of the coefficients with which I can perform inference on the variables. However, that is not my objective; I need to end up with a single model that I can use to predict new values.
Simply put, when I try to use the predict function with the pooled model I get from MICE, it fails with an error.
Any suggestions? I am coding this in R.
Edit: using the airquality data set as an example, my code looks like this:
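library(mice)
# Impute all 6 columns with predictive mean matching, producing 5 completed data sets
imputed_data <- mice(airquality, method = rep("pmm", 6), m = 5, maxit = 5)
# Fit the same linear model to each completed data set, then pool the coefficients
model <- with(imputed_data, lm(Ozone ~ Solar.R + Wind + Temp + Month + Day))
pooled_model <- pool(model)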
This gives me a pooled model across my 5 imputed data sets. However, I am unable to use the predict function with this model. When I then execute:
predict(pooled_model, newdata = airquality)
I get this error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "c('mira', 'matrix')"
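A possible workaround, sketched under a few assumptions rather than offered as the definitive answer: the object returned by with() stores the individual lm fits in its analyses element, so predict can be called on each fit and the predictions averaged. For a linear model this should give the same point predictions as predicting from the averaged coefficients. The seed value, the name new_obs, and the use of the first completed data set as stand-in "new" data are illustrative choices, not part of my original code.

```r
library(mice)

# Impute as above; the seed is an arbitrary choice for reproducibility.
imputed_data <- mice(airquality, method = "pmm", m = 5, maxit = 5,
                     seed = 123, printFlag = FALSE)

# Fit the same linear model on each completed data set.
model <- with(imputed_data, lm(Ozone ~ Solar.R + Wind + Temp + Month + Day))

# The new data must have no missing predictors. Here the first completed
# data set stands in for new observations (a hypothetical placeholder).
new_obs <- complete(imputed_data, 1)

# model$analyses is a list of ordinary lm objects, one per imputation.
# sapply() yields one column of predictions per imputation.
preds <- sapply(model$analyses, predict, newdata = new_obs)

# Average across the imputations to get a single prediction per row.
avg_pred <- rowMeans(preds)
head(avg_pred)
```

This only produces point predictions; it does not carry the between-imputation uncertainty into prediction intervals.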