I have a dataset with 69 columns and over 50,000 rows, structured as follows:
Some of the columns are binary and can only take the values 0 or 1, for example 'isFemale', 'isChild', etc.
Other columns are also binary (0 or 1) but are mutually exclusive within a group. For example, I have 3 columns called 'Primary.Language.ENGLISH', 'Primary.Language.SPANISH', and 'Primary.Language.OTHER'. Only one of them can be 1 (True) in any given row.
For example, these rows are valid:
Primary.Language.ENGLISH Primary.Language.SPANISH Primary.Language.OTHER
1 0 0
0 1 0
But I cannot have a row like this, because more than one column in the group is 1:
Primary.Language.ENGLISH Primary.Language.SPANISH Primary.Language.OTHER
1 1 0
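To make the constraint concrete, here is a minimal sketch in R of how it could be checked. The toy data frame `df` and the names `lang_cols`, `ones_per_row`, and `violates` are made up for illustration; only the three column names come from my data.

```r
# Hypothetical toy data: two valid rows, one invalid row, one row of NAs
lang_cols <- c("Primary.Language.ENGLISH",
               "Primary.Language.SPANISH",
               "Primary.Language.OTHER")

df <- data.frame(
  Primary.Language.ENGLISH = c(1, 0, 1, NA),
  Primary.Language.SPANISH = c(0, 1, 1, NA),
  Primary.Language.OTHER   = c(0, 0, 0, NA)
)

# Count the 1s in each row of the group, ignoring NAs
ones_per_row <- rowSums(df[, lang_cols] == 1, na.rm = TRUE)

# A row breaks the constraint when more than one column in the group is 1
violates <- ones_per_row > 1

which(violates)  # row 3 is flagged
```

The same check could be run on the imputed data to see whether any row ends up with more than one 1 in the group.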
Both types of columns contain NAs (about 4-5%), and I was thinking of imputing them with the mice package in R. However, I am afraid that for the second type the imputation might not respect the constraint described above (no more than one '1' per row within each group of mutually exclusive columns). Do you have any suggestions on how I could impute these columns while preserving that constraint?