I'm working on imputing some data in R. I found code online that performs the imputation and then models both the imputed data and the original data. The code is this:
# Using airquality dataset
data <- airquality
data[4:10,3] <- rep(NA,7)
data[1:5,4] <- NA
# Removing categorical variables
data <- data[-c(5,6)]
summary(data)
# Impute missing data using mice
library(mice)
tempData <- mice(data,m=5,maxit=50,meth='pmm',seed=500)
summary(tempData)
# Get completed datasets (observed and imputed)
completedData <- complete(tempData,1)
summary(completedData)
# Plots
# Density plot original vs imputed dataset
densityplot(tempData)
This is my code:
library(readr)
input_preg<- read_csv("datasurvey.csv")
summary(input_preg)
imput<- input_preg
#Imputation
library(mice)
temporal <- mice(imput,m=5,maxit=50,meth='pmm',seed=500)
# Example: imputed values for one variable
temporal$imp$`52bcalif`
# Extract the first completed dataset
completos<-complete(temporal,1)
# Plotting
densityplot(temporal)
So I'm doing almost exactly what the original code does, but when I call densityplot it fails with:
Error in `[.data.frame`(r, , xvar) : undefined columns selected
The original code produces the density plot without any problems. I don't know whether the cause is the large number of imputations, or the fact that the original data had 4 variables while mine has 29.
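One way to narrow this down might be to compare the column names and types in the two datasets. The sketch below uses only base R and a hypothetical toy data frame (`toy`, with invented columns `edad` and `sexo` next to `52bcalif`), since the real survey columns are not shown. It lists names that are not syntactically valid in R, and numeric columns that contain missing values. Whether either of these actually causes the error is an assumption, not something confirmed here.

```r
# Hypothetical stand-in for the survey data; only `52bcalif` comes from the question,
# the other columns and all values are invented for illustration.
toy <- data.frame(
  `52bcalif` = c(1, NA, 3, 4),
  edad = c(20, 31, NA, 45),
  sexo = c("F", "M", NA, "F"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Column names that are not syntactically valid R names (e.g. starting with a digit)
non_syntactic <- names(toy)[names(toy) != make.names(names(toy))]

# Numeric columns that contain at least one missing value
numeric_with_na <- names(toy)[vapply(toy, function(x) is.numeric(x) && anyNA(x), logical(1))]

print(non_syntactic)
print(numeric_with_na)
```

With the real data, `toy` would be replaced by `input_preg`. If non-syntactic names do show up, one thing to try (again, as an assumption) is renaming them with `names(imput) <- make.names(names(imput))` before calling `mice`, or passing a one-variable formula to `densityplot`, such as `densityplot(temporal, ~ some_numeric_variable)`, to see whether a single column plots correctly.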