Change Factor Levels for a Data Frame

Asked Mar 04 '15 at 11:17

I am trying to predict on new data that may, in some cases, contain factor levels that were not present in the data used to fit the model. I therefore want to change the factor levels in the new data to match those of the old data, setting any values that don't match to NA, as described here. I can do this manually, column by column, but I want to generalize the replacement to all the columns in my data frame. Could someone please give some insight into how to do this, presumably with apply?

I've tried using the function below:

    lapply(newDta, function(x) {
        newFactorVector <- which(!(newDta[, x] %in% levels(oldDta[, x])))
        newDta[newFactorVector, x] <- NA
        levels(newDta[, x]) <- levels(oldDta[, x])
    })

but it throws the following error:

Error in Summary.factor(c(2L, 1L, 7L, 1L, 7L, 2L, 2L, 2L, 2L, 7L, 1L,  :
min not meaningful for factors

Thanks.

edited May 23 '17 at 10:09

Community

asked Mar 04 '15 at 11:17

TSW

Could you provide some example data? – akrun Mar 04 '15 at 11:43
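
Since no example data was posted, here is a minimal sketch of one possible approach using made-up data. The column names `colour` and `size` and the helper `match_levels` are hypothetical and not from the question. Two things in the attempt above are worth noting: `lapply(newDta, ...)` passes each column itself as `x`, not its name, so `newDta[, x]` ends up using a factor as a column index, which is probably what triggers the error; and assignments made inside the anonymous function only change a local copy, so `newDta` would not be modified anyway. The sketch assumes both data frames use the same column names and loops over those names instead.

```r
# Hypothetical example data; the column names are made up for illustration.
oldDta <- data.frame(
  colour = factor(c("red", "blue", "red")),
  size   = factor(c("S", "M", "L"))
)
newDta <- data.frame(
  colour = factor(c("red", "green", "blue")),  # "green" is unseen in oldDta
  size   = factor(c("XL", "M", "S"))           # "XL" is unseen in oldDta
)

# Hypothetical helper: for every column that is a factor in both data frames,
# rebuild the column in `new` with the levels of the same-named column in `old`.
# Values that are not among the old levels become NA.
match_levels <- function(new, old) {
  shared <- intersect(names(new), names(old))
  for (col in shared) {
    if (is.factor(new[[col]]) && is.factor(old[[col]])) {
      new[[col]] <- factor(as.character(new[[col]]),
                           levels = levels(old[[col]]))
    }
  }
  new  # return the modified copy; the caller's data frame is untouched
}

fixedDta <- match_levels(newDta, oldDta)
```

Calling `factor()` with an explicit `levels` argument does the NA replacement and the relevelling in one step, so no separate `which()` / `%in%` pass is needed. The result has to be assigned (here to `fixedDta`) because the function works on a copy.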
