I am trying to impute missing values with a linear model using mice. My understanding of mice is that it iterates over the columns: for each column with NAs it uses all other columns as predictors, fits the model, and then samples from this model to fill in the NAs. Here is an example where I generate some data and then introduce missing values using ampute.
library(mice)
n <- 100
xx <- data.frame(x = 1:n + rnorm(n, 0, 0.1), y = (1:n) * 2 + rnorm(n, 0, 1))
head(xx)
res <- ampute(xx)  # introduce missing values
head(res$amp)
The data with missing values looks like this:
         x         y
1       NA  3.887147
2 2.157168        NA
3 2.965164  6.639856
4 3.848165  8.720441
5       NA 11.167439
6       NA 12.835415
Then I am trying to impute the missing data:
mic <- mice(res$amp,diagnostics = FALSE )
I would expect no NAs to remain after imputation, but one of the two columns always still contains NAs:
colSums(is.na(complete(mic,1)))
Which of the two columns keeps its NAs seems to be random. Running the code above, I sometimes get:
> colSums(is.na(complete(mic,1)))
x y
0 30
but also:
> colSums(is.na(complete(mic,1)))
x y
33 0
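To narrow this down, one option is to look at what mice recorded during the run. The following is only a diagnostic sketch under my own assumptions, not a fix: it repeats the setup above and then prints the `loggedEvents`, `method`, and `predictorMatrix` components of the `mids` object returned by `mice()`. I use `printFlag = FALSE` here just to suppress the iteration output.

```r
library(mice)

# Same setup as above: two strongly related columns, then amputation.
n <- 100
xx <- data.frame(x = 1:n + rnorm(n, 0, 0.1), y = (1:n) * 2 + rnorm(n, 0, 1))
res <- ampute(xx)

mic <- mice(res$amp, printFlag = FALSE)

# Events mice logged while running (NULL if nothing was logged).
print(mic$loggedEvents)

# Imputation method chosen per column; an empty string means the column is not imputed.
print(mic$method)

# Rows are the variables being imputed, columns the predictors used for them.
print(mic$predictorMatrix)

# Remaining NAs per column in the first completed data set.
print(colSums(is.na(complete(mic, 1))))
```

If one of the columns shows up in `loggedEvents`, or has an empty entry in `method`, that would at least indicate which column mice skipped in a given run.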