As a new R user, I'm having trouble understanding why the NA values in my dataframe keep changing. I'm running my code on Kaggle; maybe that's where my problem is coming from?
Original dataframe titled "abc"
There are multiple columns that have NA values, so I decided to try multiple imputation to handle them.
So I created a new dataframe with just the columns that had NA values and began the imputation. This is the new dataframe, titled "abc1":
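library(dplyr)  # provides select()
library(mice)   # provides mice() and complete()

# keep only the columns that contain NA values
abc1 <- select(abc, c(9, 10, 15, 16, 17, 18, 19, 25, 26))

# mice imputation
input_data <- abc1
my_imp <- mice(input_data, m = 5, method = "pmm", maxit = 20)
summary(input_data$m_0_9)  # summary of the column before imputation
my_imp$imp$m_0_9           # imputed values for m_0_9, one column per imputation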
When the imputation runs, it creates 5 columns of candidate values to fill in the NA values of column m_0_9, and I choose which of those columns to use.
Then I run this code:
final_clean_abc1 <- complete(my_imp,5)
This assigns the values from column 5 of the last image to the NA values in my "abc1" dataframe and saves the result as "final_clean_abc1".
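To illustrate what I mean by "choosing column 5", here is a minimal sketch using the `nhanes` example dataset that ships with the mice package instead of my real data. The names `imp` and `filled` are just placeholders for this illustration, and I lowered `maxit` only to keep it quick. As far as I understand it, `complete(imp, 5)` fills each NA with the value from the fifth imputation:

```r
library(mice)

# nhanes is a small example dataset bundled with mice; it has NAs in bmi
imp <- mice(nhanes, m = 5, method = "pmm", maxit = 5, printFlag = FALSE)

# One row per missing bmi value, one column per imputation (1 to 5)
imp$imp$bmi

# Fill the NAs using the 5th imputation
filled <- complete(imp, 5)

# The filled-in bmi values should be exactly the 5th column shown above
was_missing <- is.na(nhanes$bmi)
all(filled$bmi[was_missing] == imp$imp$bmi[[5]])
```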
Lastly, I replace the columns from the original "abc" dataframe that had missing values with the new columns in "final_clean_abc1".
I know this probably isn't the cleanest way to do it:
abc$m_0_9 <- final_clean_abc1$m_0_9
abc$m_10_12 <- final_clean_abc1$m_10_12
abc$f_0_9 <- final_clean_abc1$f_0_9
abc$f_10_12 <- final_clean_abc1$f_10_12
abc$f_13_14 <- final_clean_abc1$f_13_14
abc$f_15 <- final_clean_abc1$f_15
abc$f_16 <- final_clean_abc1$f_16
abc$asian_pacific_islander <- final_clean_abc1$asian_pacific_islander
abc$american_indian <- final_clean_abc1$american_indian
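A shorter way to do the same replacement might be to assign all the imputed columns by name in one step. This is only a sketch on made-up toy data (the values and the `id` column are hypothetical stand-ins, not my real data), and it assumes every column name in final_clean_abc1 also exists in abc:

```r
# Toy stand-ins for the real dataframes (values are made up for illustration)
abc <- data.frame(
  id    = 1:4,
  m_0_9 = c(12, NA, 30, NA),
  f_0_9 = c(NA, 5, 8, 11)
)
final_clean_abc1 <- data.frame(
  m_0_9 = c(12, 21, 30, 18),
  f_0_9 = c(7, 5, 8, 11)
)

# Overwrite every column of abc that also appears in final_clean_abc1,
# matching by column name instead of writing one assignment per column
imputed_cols <- names(final_clean_abc1)
abc[imputed_cols] <- final_clean_abc1

# Count the NAs left in the replaced columns
sum(is.na(abc[imputed_cols]))
```

Columns that are not in final_clean_abc1 (here `id`) are left untouched.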
Now that I have a dataframe 'abc' with no missing values, this is where my problem arises. I should be seeing '162' in row 10 of the m_0_9 column, but when I save my code and view it on Kaggle I get the value '7' for that row and column, as shown in the photo below.
"abc" dataframe with no NA values
Hopefully this makes sense. I tried to be as specific as I could.