I'm working with a team and some people are using SPSS to replace missing case (multiple imputation) and, then, analyze the data. When SPSS impute new values, it reports every dataset result and a pooled result, that is different than the mean of all results.
Now, I'm using R to work on this "multiple imputation dataset" created on SPSS. I'm trying to obtain the pooled estimates from a regression in the same way SPSS reports. Grace to this post here, I can use broom package to run several regression models and show each estimate. The problem: some statistics are way different. For example, t value are higher when using broom than what was reported by SPSS. Please, take a look to this SPSS output.
In order to make this job easy, let's say I want to regress price on stars for each group and, after, display a row with pooled estimates (t test result and p-value).
code:
library(broom)
d <- data.frame(group=rep(1:5, each = 20),
price=rnorm(100,1000,10),
stars=rnorm(100,3,1))
fitted_models <- d %>%
group_by(group) %>%
do(model = lm(price ~ stars, data = .))
fitted_models %>% tidy(model)
fitted_models %>% glance(model)
In case you want to better understand what SPSS is doing, please, check this real output. The pooled is not the mean of all results.
Please, feel free to say this question is irrelevant, but don't negative this post. Other people can have the same question and I provide all codes to you run again the analysis.
Thanks much