I am doing a simulation study, and one of the datasets I am imputing is very small (n = 10). My dataset and the mice code are as follows:
> dat
            y       X1        X2
11 -155.04185       NA 10.464688
12   69.02116       NA  8.245312
13  -89.18124 21.69072  4.717425
14  115.52205       NA 15.666802
15   94.09654       NA  6.977855
16   65.44607       NA 16.608755
17 -246.09192       NA  3.208590
18  118.99815 25.68459  4.727989
19  214.84858       NA  6.065670
20  293.19425       NA  6.647658
> pred1 <- matrix(data = c(0, 0, 0,
                           1, 0, 1,
                           0, 0, 0), nrow = 3, ncol = 3, byrow = TRUE)
> mice(dat, m = 25, method = "norm", predictorMatrix = pred1, maxit = 5)
iter imp variable
1 1 X1_missing
Error in cor(xobs[, keep, drop = FALSE], use = "all.obs") : 'x' is empty
For another dataset which has 3 observed values for X1, the mice command worked fine with no errors.
I have looked up the error and found these two links, neither of which helped: https://stat.ethz.ch/pipermail/r-help/2015-December/434914.html and the question "Unclear error with mice package".
I have also looked at the following code on GitHub: https://github.com/stefvanbuuren/mice/blob/master/R/internal.R
I have determined that 'x' is the design matrix which is used to impute the variable with missing observations. (found the definitions in this link: https://stat.ethz.ch/pipermail/r-help/2015-December/434914.html)
In my case the design matrix should consist of 'y' and 'X2', which I have specified in pred1 to help impute 'X1'. Given that 'y' and 'X2' are fully observed in the data, I am not sure why mice thinks the design matrix is empty.
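A quick way to double-check how pred1 is read is to attach the variable names to it. This is only a sketch in base R. It assumes the columns of dat are ordered y, X1, X2 as printed above, and it relies on the mice convention that a 1 in a given row marks the column variable as a predictor for imputing the row variable. The names vars and x1_predictors are made up for illustration.

```r
# Same predictor matrix as in the question, with variable names attached.
# Assumption: the columns of dat are ordered y, X1, X2.
vars <- c("y", "X1", "X2")
pred1 <- matrix(c(0, 0, 0,
                  1, 0, 1,
                  0, 0, 0),
                nrow = 3, ncol = 3, byrow = TRUE,
                dimnames = list(vars, vars))

# Predictors used when imputing X1: the columns with a 1 in the X1 row
x1_predictors <- colnames(pred1)[pred1["X1", ] == 1]
print(x1_predictors)
```

If this prints y and X2, the predictor matrix itself says what was intended, so the empty design matrix must arise somewhere else.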
Would anyone have any ideas as to what is going wrong?
UPDATE:
After updating the mice package to version 3.4.0, the imputations ran for the data fold, but mice logged a number of events and output the following:
  it im        dep meth                                                         out
1  1  1 X1_missing norm          df set to 1. # observed cases: 2 # predictors: 3
2  1  1 X1_missing norm All predictors are constant or have too high correlation.
3  1  2 X1_missing norm          df set to 1. # observed cases: 2 # predictors: 3
4  1  2 X1_missing norm All predictors are constant or have too high correlation.
5  1  3 X1_missing norm          df set to 1. # observed cases: 2 # predictors: 3
6  1  3 X1_missing norm All predictors are constant or have too high correlation.
So the issue is the small number of observed cases relative to the number of predictors I am using: with 2 observed cases and 3 predictors, the degrees of freedom would be negative (2 - 3 = -1). According to the following link (https://stefvanbuuren.name/fimd/sec-toomany.html#finding-problems-loggedevents), the degrees of freedom are then set to 1, implying that predictors are being dropped.
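The same shortage can be seen outside mice by fitting an ordinary regression to the two rows where X1 is observed. This is a rough illustration only: lm is a stand-in, not the Bayesian linear regression that method = "norm" uses, and it assumes that the 3 predictors counted in the log are the intercept, y and X2.

```r
# The two rows of dat where X1 is observed (rows 13 and 18 in the question)
obs <- data.frame(
  y  = c(-89.18124, 118.99815),
  X1 = c(21.69072, 25.68459),
  X2 = c(4.717425, 4.727989)
)

# Ordinary least squares with an intercept, y and X2: 3 coefficients requested
# (assumption: these are the 3 predictors counted in the logged events)
fit <- lm(X1 ~ y + X2, data = obs)

n_obs  <- nrow(obs)           # number of observed cases
n_coef <- length(coef(fit))   # number of coefficients requested
print(n_obs - n_coef)         # naive degrees of freedom

print(coef(fit))              # coefficients that cannot be estimated are NA
print(df.residual(fit))       # residual degrees of freedom left after the fit
```

With only two rows, at most two coefficients can be estimated, so lm has to drop one term and no residual degrees of freedom remain.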
Therefore, I may need to tweak my simulated data to get around this.
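One way to spot problematic datasets in the simulation before calling mice is to compare, for each incomplete variable, the number of observed cases with the number of predictors. The helper below is hypothetical (check_imputation_df is not a mice function), uses only base R, and assumes one extra predictor for the intercept, which matches the count of 3 in the logged events above.

```r
# Hypothetical helper (not part of mice): for each incomplete variable, count
# the observed cases and the predictors flagged in the predictor matrix.
check_imputation_df <- function(data, pred) {
  dimnames(pred) <- list(names(data), names(data))
  incomplete <- names(data)[colSums(is.na(data)) > 0]
  n_obs  <- colSums(!is.na(data[incomplete]))
  # +1 for the intercept (assumption about how predictors are counted)
  n_pred <- rowSums(pred[incomplete, , drop = FALSE]) + 1
  data.frame(variable = incomplete,
             n_obs    = unname(n_obs),
             n_pred   = unname(n_pred),
             df       = unname(n_obs - n_pred))
}

# The dataset and predictor matrix from the question
dat <- data.frame(
  y  = c(-155.04185, 69.02116, -89.18124, 115.52205, 94.09654,
         65.44607, -246.09192, 118.99815, 214.84858, 293.19425),
  X1 = c(NA, NA, 21.69072, NA, NA, NA, NA, 25.68459, NA, NA),
  X2 = c(10.464688, 8.245312, 4.717425, 15.666802, 6.977855,
         16.608755, 3.208590, 4.727989, 6.065670, 6.647658)
)
pred1 <- matrix(c(0, 0, 0,
                  1, 0, 1,
                  0, 0, 0), nrow = 3, ncol = 3, byrow = TRUE)

res <- check_imputation_df(dat, pred1)
print(res)
```

Datasets for which df is below 1 could then be regenerated, or imputed with fewer predictors, depending on what the simulation design allows.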