I'm not sure what the reason behind this is.
I have a data set with 107 variables (a mix of numeric and factor types), some of which contain missing values. I use mice to impute the data.
MICE imputed most of the variables. However, some variables were not imputed at all.
It is strange that some variables are imputed successfully while others are not. I also tried running MICE on only the variables that had not been imputed, and this time it was successful.
What is the reason behind this? Does it have anything to do with the number of variables in my data set? How can I fix this, or do I need to run mice separately for each variable?
Many thanks,
Edit: here is code that reproduces the problem.
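> library(missForest)  # provides prodNA()
> library(VIM)         # provides aggr()
> library(mice)        # provides mice() and complete()
> #create data set with NAs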
> iris.fake = prodNA(iris, noNA = 0.9)
> iris.fake.miss <- aggr(iris.fake)
> iris.fake.miss$missings
                 Variable Count
Sepal.Length Sepal.Length   138
Sepal.Width   Sepal.Width   137
Petal.Length Petal.Length   138
Petal.Width   Petal.Width   131
Species           Species   131
>
> #run mice
> imp = mice(iris.fake, m = 5, maxit = 5)
iter imp variable
1 1 Sepal.Width Petal.Length Petal.Width Species
1 2 Sepal.Width Petal.Length Petal.Width Species
1 3 Sepal.Width Petal.Length Petal.Width Species
1 4 Sepal.Width Petal.Length Petal.Width Species
1 5 Sepal.Width Petal.Length Petal.Width Species
2 1 Sepal.Width Petal.Length Petal.Width Species
2 2 Sepal.Width Petal.Length Petal.Width Species
2 3 Sepal.Width Petal.Length Petal.Width Species
2 4 Sepal.Width Petal.Length Petal.Width Species
2 5 Sepal.Width Petal.Length Petal.Width Species
3 1 Sepal.Width Petal.Length Petal.Width Species
3 2 Sepal.Width Petal.Length Petal.Width Species
3 3 Sepal.Width Petal.Length Petal.Width Species
3 4 Sepal.Width Petal.Length Petal.Width Species
3 5 Sepal.Width Petal.Length Petal.Width Species
4 1 Sepal.Width Petal.Length Petal.Width Species
4 2 Sepal.Width Petal.Length Petal.Width Species
4 3 Sepal.Width Petal.Length Petal.Width Species
4 4 Sepal.Width Petal.Length Petal.Width Species
4 5 Sepal.Width Petal.Length Petal.Width Species
5 1 Sepal.Width Petal.Length Petal.Width Species
5 2 Sepal.Width Petal.Length Petal.Width Species
5 3 Sepal.Width Petal.Length Petal.Width Species
5 4 Sepal.Width Petal.Length Petal.Width Species
5 5 Sepal.Width Petal.Length Petal.Width Species
> summary(imp)
Multiply imputed data set
Call:
mice(data = iris.fake, m = 5, maxit = 5)
Number of multiple imputations: 5
Missing cells per column:
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
         138          137          138          131          131
Imputation methods:
Sepal.Length  Sepal.Width Petal.Length  Petal.Width      Species
       "pmm"        "pmm"        "pmm"        "pmm"    "polyreg"
VisitSequence:
 Sepal.Width Petal.Length  Petal.Width      Species
           2            3            4            5
PredictorMatrix:
             Sepal.Length Sepal.Width Petal.Length Petal.Width Species
Sepal.Length            0           0            0           0       0
Sepal.Width             0           0            1           1       1
Petal.Length            0           1            0           1       1
Petal.Width             0           1            1           0       1
Species                 0           1            1           1       0
Random generator seed value: NA
>
> com = complete(imp,2)
> iris.imp.miss <- aggr(com)
> iris.imp.miss$missings
                 Variable Count
Sepal.Length Sepal.Length   138
Sepal.Width   Sepal.Width     0
Petal.Length Petal.Length     0
Petal.Width   Petal.Width     0
Species           Species     0