
When I try to impute missing values in my data, the mice function does not pick up the columns that have missing values. I am trying to replace the missing values in BusinessTravel with the code below, but it is not working.

    library(mice)

    # Impute with random forests, leaving four columns out of the imputation model
    mice_mod <- mice(my_ca_dataset[, !names(my_ca_dataset) %in%
        c('EmployeeCount', 'JobInvolvement', 'NumCompaniesWorked', 'TrainingTimesLastYear')],
        method = 'rf')

    iter imp variable
    1 1
    1 2
    1 3
    1 4
    1 5
    2 1
    2 2
    2 3
    2 4
    2 5
    3 1
    3 2
    3 3
    3 4
    3 5
    4 1
    4 2
    4 3
    4 4
    4 5
    5 1
    5 2
    5 3
    5 4
    5 5
    Warning message:
    Number of logged events: 4

    mice_output <- complete(mice_mod)
    my_ca_dataset$BusinessTravel <- mice_output$BusinessTravel

    sapply(my_ca_dataset, function(x) sum(is.na(x)))


                        Age               Attrition          BusinessTravel               DailyRate
                          0                       0                      31                       0
                 Department           EmployeeCount EnvironmentSatisfaction                  Gender
                          0                      36                       0                       0
                 HourlyRate          JobInvolvement                JobLevel             MonthlyRate
                          0                       0                       0                       0
         NumCompaniesWorked                  Over18           StandardHours        StockOptionLevel
                         45                       0                       0                       0
          TotalWorkingYears   TrainingTimesLastYear          YearsAtCompany    YearsWithCurrManager
                          0                       0                       0                       0

The output I expect is:

    iter imp variable
    1 1 BusinessTravel
    1 2 BusinessTravel
    1 3 BusinessTravel
    1 4 BusinessTravel
    1 5 BusinessTravel
    2 1 BusinessTravel
    2 2 BusinessTravel
    2 3 BusinessTravel
    2 4 BusinessTravel
    2 5 BusinessTravel
    3 1 BusinessTravel
    3 2 BusinessTravel
    3 3 BusinessTravel
    3 4 BusinessTravel
    3 5 BusinessTravel
    4 1 BusinessTravel
    4 2 BusinessTravel
    4 3 BusinessTravel
    4 4 BusinessTravel
    4 5 BusinessTravel
    5 1 BusinessTravel
    5 2 BusinessTravel
    5 3 BusinessTravel
    5 4 BusinessTravel
    5 5 BusinessTravel

No errors are displayed. Hopefully someone here can help.

A brief subset of the data is below.

Carl O'Beirne
  • Have you checked the logged events? Please provide more information about your data, preferably a reproducible subset. – kath Oct 31 '19 at 12:59
  • I have added a screenshot of a subset of the data. I am unsure how to check the logged events; I am still very new to R and RStudio. – Carl O'Beirne Oct 31 '19 at 13:10
  • Unfortunately, a screenshot is not a reproducible format. It tells us nothing about the column types of the data. See [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). You should do the following: check what's in `mice_mod$loggedEvents` and check whether BusinessTravel is a character column. You should also read more about mice, e.g. here: https://stefvanbuuren.name/fimd/ – kath Oct 31 '19 at 13:21
  • I was able to figure this out. It turned out that the variable was stored as a character rather than a factor, and that is why mice would not work on it. – Carl O'Beirne Nov 01 '19 at 19:31

1 Answer


The columns with missing values were stored as characters instead of factors, so mice did not impute them. Converting them to factors before calling mice resolved the problem.
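A minimal sketch of that fix, assuming a small made-up data frame (`toy`, with hypothetical values) in place of `my_ca_dataset`: it finds the character columns, converts them to factors, and only then runs mice. It uses the default imputation methods rather than `method = 'rf'` to avoid extra package dependencies, and the imputation step is skipped if mice is not installed.

```r
# Toy stand-in for my_ca_dataset; the values are invented for illustration.
set.seed(1)
n <- 60
toy <- data.frame(
  Age            = sample(20:60, n, replace = TRUE),
  DailyRate      = sample(100:1500, n, replace = TRUE),
  BusinessTravel = sample(c("Non-Travel", "Travel_Rarely", "Travel_Frequently"),
                          n, replace = TRUE),
  stringsAsFactors = FALSE
)
toy$BusinessTravel[sample(n, 8)] <- NA   # introduce 8 missing values

# 1. Diagnose: find the columns stored as character.
col_classes <- sapply(toy, class)
char_cols <- names(col_classes)[col_classes == "character"]
print(char_cols)

# 2. Fix: convert every character column to a factor.
toy[char_cols] <- lapply(toy[char_cols], factor)

# 3. Impute, then inspect the logged events mentioned in the warning.
if (requireNamespace("mice", quietly = TRUE)) {
  imp <- mice::mice(toy, printFlag = FALSE, seed = 1)
  print(imp$loggedEvents)                # NULL when nothing was logged
  completed <- mice::complete(imp)
  toy$BusinessTravel <- completed$BusinessTravel
}

# Count the remaining missing values per column.
na_counts <- sapply(toy, function(x) sum(is.na(x)))
print(na_counts)
```

On the real data, the same two checks (`sapply(my_ca_dataset, class)` and `mice_mod$loggedEvents`) should show whether BusinessTravel is a character column and what mice logged about it.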

Carl O'Beirne