
I am using the mice package to impute data, and I have read about post-processing to restrict imputed values. One of the variables I am imputing is a categorical variable with 10 levels (a, b, c, d, e, f, g, h, i, j). The missing values can take any level except a and d. I need to make it so that people with category a or d have values of NA after the imputation, because at the moment missing values are imputed from all available levels, which is wrong.

I have also tried creating an additional binary (0/1) variable to make this work, but the values were still imputed incorrectly.

Any ideas about how to do this with post-processing in mice in R?

Alexia S
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Sep 24 '19 at 17:52
  • I don't know that I understand the question completely but here is my suggestion. Use indicator (dummy) codes for each level of the categorical variable (9 binary variables with one reference category). Then exclude the two variables `a` and `d` from the imputation. You can see how to do this in the tutorial here (https://datascienceplus.com/handling-missing-data-with-mice-package-a-simple-approach/) [example is `meth[c("Age")]=""`]. – TJ87 Sep 24 '19 at 17:52
  • @TJ87 With dummy variables, one record can end up belonging to more than one category once imputation is finished. – Wietze314 Sep 25 '19 at 07:47
  • I think the issue here is that the missing data mechanism is MNAR, i.e. category `a` and `d` are never missing, so missingness is based on the value of the variable itself. With post processing you could say that whenever a missing variable is imputed to level `a` or `d`, another default category, for example `b` will be chosen. More info on conditional imputation: https://stefvanbuuren.name/fimd/sec-knowledge.html#conditional-imputation – Wietze314 Sep 25 '19 at 08:03
  • Also similar issue discussed on cross-validated: https://stats.stackexchange.com/questions/32179/how-to-impute-an-ordinal-variable-with-mice-but-prevent-it-from-taking-one-value – Wietze314 Sep 25 '19 at 08:11
  • Hi all, thanks for trying to help me. I am not sure how to produce an example, but the dataset looks like this – Alexia S Sep 25 '19 at 09:33
  • The issue is that missing values in that categorical variable are imputed as a and d but I want them to accept everything except those two categories – Alexia S Sep 25 '19 at 09:50
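
The post-processing idea from the comments (replace any imputed `a` or `d` with a default level such as `b`) could be sketched roughly as below. This is only a minimal illustration on made-up data: the data frame `dat`, the column names `grp` and `x`, and the choice of `b` as the fallback level are all assumptions, not taken from the question. It relies on the `post` argument of `mice()`, where each entry is an R expression evaluated on `imp[[j]][, i]` after the variable has been imputed.

```r
library(mice)

set.seed(1)
n <- 300

# Hypothetical toy data: a 10-level factor plus one numeric predictor
dat <- data.frame(
  grp = factor(sample(letters[1:10], n, replace = TRUE), levels = letters[1:10]),
  x   = rnorm(n)
)

# Introduce missing values only in rows whose level is neither a nor d,
# mimicking the situation where a and d are never missing
miss <- sample(which(!dat$grp %in% c("a", "d")), 60)
dat$grp[miss] <- NA

# Dry run to obtain the default post-processing vector
ini  <- mice(dat, maxit = 0)
post <- ini$post

# After each imputation of grp, overwrite any imputed a or d with b
# (b is an arbitrary fallback level chosen for this sketch)
post["grp"] <- "imp[[j]][imp[[j]][, i] %in% c('a', 'd'), i] <- 'b'"

imp <- mice(dat, m = 5, post = post, printFlag = FALSE, seed = 1)

# Collect the imputed values from all m imputations as character
imputed <- unlist(lapply(imp$imp$grp, as.character))
table(imputed)
```

Forcing everything to a single level like `b` will inflate that category, so this is probably only reasonable when few imputations land on `a` or `d`. Observed `a` and `d` values are left untouched, since `post` only acts on the imputed cells.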

0 Answers