I'm using the House Prices: Advanced Regression Techniques dataset, which includes several factor variables that contain NAs. Consider the columns PoolQC, Alley and MiscFeature. I want to replace all of these NAs with 'None' in a single function, but I haven't managed to. This is what I've tried so far:
    MissingLevels <- function(x){
      for(i in names(x)){
        levels <- levels(x[i])
        levels[length(levels) + 1] <- 'None'
        x[i] <- factor(x[i], levels = levels)
        x[i][is.na(x[i])] <- 'None'
        return(x)
      }
    }

    MissingLevels(df[,c('Alley', 'Fence')])
    apply(df[,c('Alley', 'Fence')], 2, MissingLevels)
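
To reproduce this without downloading the data, a small stand-in for df along these lines should show the same structure. The values are made up for illustration; only the column names Alley and Fence come from the real data.

```r
# Toy stand-in for the Kaggle data frame (hypothetical values, real column names)
df <- data.frame(
  Alley = factor(c("Grvl", NA, "Pave", NA)),
  Fence = factor(c(NA, "MnPrv", "GdWo", NA))
)

# NA is a missing value here, not one of the factor levels
levels(df$Alley)

# Both columns contain missing values
sapply(df, anyNA)
```

The goal is for every NA in these columns to become the level 'None'.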
Dataset: https://www.kaggle.com/c/house-prices-advanced-regression-techniques/data