
My panel dataset contains 7 variables and 1,452 observations, covering 6 years. I would like to regress y on x while controlling for the other variables. The data contain a lot of missing values: 35% for the independent variable, x, and 23% for the dependent variable, y.

The variables a, b, and c also contain missing values, but to a lesser extent.

A toy dataset looks like this:

Name  Year  id  y     x     a  b  c
A     2015  1   6     n.a.  9  4  1
A     2016  1   n.a.  2     9  3  n.a.

I used multiple imputation as provided by the mice function, which worked well. Diagnostics of the distributions of the imputed datasets also seem to be okay. Here is my code (excluding the diagnostics):

predictormatrix<-quickpred(data, 
                           include=c("a", "b", "c", "x", "y"),
                           exclude=c("Name", "Year", "id"),
                           mincor = 0.1)

imp <- mice(data, 
            predictorMatrix = predictormatrix,
            m=5,
            maxit=5,
            meth='pmm')


I can fit a simple pooled regression:

fitimp <- with(imp,
               lm(y ~ x + a + b + c)) 

summary(pool(fitimp))

However, as a pooled OLS does not take into account the structure of the panel data, I would like to fit a fixed effects and a random effects model and decide on a model on the basis of the Hausman test. I tried using the with function like this:

fitimp.fe <- with(imp,
                  plm(y ~ x + a + b + c),
                  data = imp,
                  index = c("Name", "Year"),
                  effect = "individual", model = "within") 

summary(pool(fitimp.fe))

But it gives me an error: No tidy method for objects of class mids. Plus a warning: Infinite sample size assumed.

Apart from fitting a fixed and random effects model to the imputed datasets, I do not know how to compare them (as mentioned, e.g., on the basis of a Hausman test). Can this be done with the with function?

I've been trying to solve this for quite some time now and would be very grateful if someone could help me. I found a lot about imputation for multilevel data, but if I understood it correctly, this does not apply to my dataset. Last but not least, I've read multiple times that I should install broom.mixed, but that didn't help.

  • It looks like you have a few errors in your `with()` function call, including misplaced parentheses and commas in the `plm()` call. One of those errors is a bit tricky, because you need to capture the data from mice into a data.frame and then pass it into `plm()`. You can see an example of how to do that in the first part of [this answer](https://stackoverflow.com/questions/72514230/is-it-possible-to-use-lqmm-with-a-mira-object/72521307#72521307). After you fix those errors it should work. – DavidLukeThiessen Jun 09 '22 at 04:48
  • You don't need to worry about installing `broom.mixed` or loading `broom`, since the default `mice` package already loads the `tidy.plm()` and `glance.plm()` methods. – DavidLukeThiessen Jun 09 '22 at 04:48
  • @DavidLukeThiessen Thank you, that worked perfectly fine! Do you have any idea regarding my second question, which was how to compare fixed and random effects models that are stored as mipo objects? – user19153338 Jun 09 '22 at 13:11
  • I'm not familiar with that sort of comparison. I usually use one of the `D1() D2() D3()` functions for model comparison, but don't know about the theory for random effect models. Depending on what you want to know it might be better to ask on Cross Validated for that. – DavidLukeThiessen Jun 09 '22 at 18:34
  • Do read the documentation for package `plm`, esp. the introductory vignette https://cran.r-project.org/web/packages/plm/vignettes/A_plmPackage.html . Or search the reference manual for "Hausman". You will find `plm::phtest` for plm models. – Helix123 Jun 10 '22 at 12:05
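
For readers landing here, the suggestions in the comments can be put together roughly as follows. This is only a sketch under stated assumptions, not a verified answer. The simulated `toy` panel, the helper `fit_panel()`, and all object names are hypothetical, and the panel is indexed by `id` and `Year` rather than `Name` and `Year`. Instead of building the data.frame inside `with()`, the sketch loops over the completed datasets with `mice::complete()` and wraps the fits with `as.mira()` so that `pool()` can combine them. The Hausman test (`plm::phtest`) is run separately on each completed dataset. That gives one p-value per imputation and is an informal check only, because the sketch does not attempt to pool the test statistic.

```r
library(mice)
library(plm)

set.seed(123)

# Hypothetical toy panel (not the asker's data): 40 units observed over 6 years
n_id  <- 40
years <- 2015:2020
toy <- data.frame(
  id   = rep(seq_len(n_id), each = length(years)),
  Year = rep(years, times = n_id)
)
N <- nrow(toy)
u <- rep(rnorm(n_id), each = length(years))  # unit-specific effect
toy$a <- rnorm(N)
toy$b <- rnorm(N)
toy$c <- rnorm(N)
toy$x <- 0.5 * toy$a + u + rnorm(N)          # x is correlated with the unit effect
toy$y <- 1 + 2 * toy$x + toy$a - toy$b + 0.5 * toy$c + u + rnorm(N)
toy$x[sample(N, 60)] <- NA                   # introduce missing values
toy$y[sample(N, 40)] <- NA

# Impute as in the question, keeping the identifiers out of the predictor set.
# No method is given: pmm is the mice default for numeric columns.
pred <- quickpred(toy, exclude = c("id", "Year"), mincor = 0.1)
imp  <- mice(toy, predictorMatrix = pred, m = 5, maxit = 5, printFlag = FALSE)

# Fit one panel model per completed dataset (hypothetical helper)
fit_panel <- function(model) {
  lapply(seq_len(imp$m), function(i) {
    plm(y ~ x + a + b + c,
        data   = mice::complete(imp, i),
        index  = c("id", "Year"),
        effect = "individual",
        model  = model)
  })
}
fits_fe <- fit_panel("within")   # fixed effects
fits_re <- fit_panel("random")   # random effects

# Pool the coefficients across imputations with Rubin's rules
pooled_fe <- pool(as.mira(fits_fe))
pooled_re <- pool(as.mira(fits_re))
print(summary(pooled_fe))
print(summary(pooled_re))

# Hausman test on each completed dataset: one p-value per imputation
hausman_p <- mapply(function(fe, re) phtest(fe, re)$p.value, fits_fe, fits_re)
print(hausman_p)
```

If the per-imputation Hausman tests all point the same way, that is at least suggestive. If they disagree, how to combine them is a statistical question, and Cross Validated is probably the better place to ask.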
