I am analyzing a repeated-measures data set: a continuous variable "y" assessed at 4 time points, the factor "time" (4 levels), the covariates "cov1", "cov2", and "cov3" assessed at baseline, and "ID" as the subject identifier. Only "y" has missing data (~14%).
I have used the "mice" and "miceadds" packages to impute the data, which works. However, I am not sure whether I have specified the predictor matrix correctly.
The linear mixed model that will be tested after imputation is:
lmer(y ~ time + cov1 + cov2 + cov3 + (1|ID))
I therefore tried to set up a predictor matrix that uses the same predictors for imputation: only the dependent variable "y" should be imputed, using "time" and all covariates as predictors and "ID" as the cluster variable.
Is this captured by the following predictor matrix?
     ID cov1 cov2 cov3 time y
ID    0    0    0    0    0 0
cov1  0    0    0    0    0 0
cov2  0    0    0    0    0 0
cov3  0    0    0    0    0 0
time  0    0    0    0    0 0
y    -2    1    1    1    1 0
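For reference, here is a minimal sketch of how a matrix like the one above could be built in base R. In the two-level methods of "mice", -2 marks the cluster variable and 1 marks a fixed-effect predictor. The object names (vars, pred, meth) and the data frame name "dat" in the commented call are assumptions for illustration, not my actual script.

```r
# Variables in the same order as the columns of the (assumed) long-format data
vars <- c("ID", "cov1", "cov2", "cov3", "time", "y")

# Start from an all-zero matrix: rows = variables to impute, columns = predictors
pred <- matrix(0, nrow = length(vars), ncol = length(vars),
               dimnames = list(vars, vars))

# Only "y" is imputed: covariates and time enter as fixed effects (code 1)
pred["y", c("cov1", "cov2", "cov3", "time")] <- 1

# ID identifies the cluster (code -2 for the 2l.* methods)
pred["y", "ID"] <- -2

# Method vector: empty string means "do not impute this variable"
meth <- setNames(rep("", length(vars)), vars)
meth["y"] <- "2l.pan"   # or "2l.lmer"

print(pred)

# Passing these to mice (not run here; assumes a data frame `dat` with these
# six columns and an integer-coded ID):
# imp <- mice::mice(dat, method = meth, predictorMatrix = pred, m = 5, seed = 123)
```

All rows other than "y" stay at zero, so no other variable is imputed or modelled.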
Further, I have read about the different imputation methods "2l.lmer" and "2l.pan" (for instance here) but I am not sure which of these to use. The results differ only very slightly.
Do you have any recommendations regarding the predictor matrix or "2l.lmer"/"2l.pan"? Any help is appreciated!
Thanks a lot!
Katarina