I have a column of type factor. Some of the values in the column are NA. How do I convert all of these NA values to a new level, say 0 or "OriginallyNA"?
I was able to convert NAs to 0 for columns of class numeric, but haven't been able to do it for columns of class factor.
My data:
> col1 = c(1,2,3,4,NA)
> col2 = c(6,7,NA,NA,8)
> df = data.frame(col1,col2)
> df
  col1 col2
1    1    6
2    2    7
3    3   NA
4    4   NA
5   NA    8
> df$col2 = as.factor(df$col2)
> class(df$col1)
[1] "numeric"
> class(df$col2)
[1] "factor"
Here is my attempt to convert the NA values to another level, say 0:
> df[is.na(df)] = 0
Warning message:
In `[<-.factor`(`*tmp*`, thisvar, value = 0) :
invalid factor level, NA generated
> df
  col1 col2
1    1    6
2    2    7
3    3 <NA>
4    4 <NA>
5    0    8
> levels(df$col2)
[1] "6" "7" "8"
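The warning seems to come from the fact that 0 is not one of the existing levels ("6", "7", "8"), so the assignment is rejected and NA is kept. A minimal sketch of one possible workaround, assuming the new level should be the string "0": extend the levels first, then assign. It rebuilds the same data as above so it can be run on its own.

```r
# Rebuild the example data from the question
col1 <- c(1, 2, 3, 4, NA)
col2 <- c(6, 7, NA, NA, 8)
df <- data.frame(col1, col2)
df$col2 <- as.factor(df$col2)

# Add "0" to the set of valid levels before assigning it
levels(df$col2) <- c(levels(df$col2), "0")

# Now the NA entries can be replaced without a warning
df$col2[is.na(df$col2)] <- "0"

levels(df$col2)
as.character(df$col2)
```

The same pattern should work with any label, such as "OriginallyNA", in place of "0". Note that the new level is appended after the existing ones rather than sorted.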
Do I have to convert the factor column to numeric, change the NA values to 0, and then convert it back to factor, as shown below? Or is there a better way?
> df$col2 = as.numeric(df$col2)
> df
  col1 col2
1    1    1
2    2    2
3    3   NA
4    4   NA
5    0    3
> df[is.na(df)] = 0
> df
  col1 col2
1    1    1
2    2    2
3    3    0
4    4    0
5    0    3
> df$col2 = as.factor(df$col2)
> df
  col1 col2
1    1    1
2    2    2
3    3    0
4    4    0
5    0    3
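One thing the output above shows is that `as.numeric` on a factor returns the internal level codes, so 6, 7, 8 became 1, 2, 3. A minimal sketch of the same round trip that keeps the original values, assuming the levels are all numeric strings, goes through `as.character` first. The names `vals` and `col2_fixed` are made up for illustration.

```r
# Same factor column as in the question
col2 <- factor(c(6, 7, NA, NA, 8))

# as.numeric() on a factor gives level codes, not the labels
codes <- as.numeric(col2)

# Converting to character first recovers the original values
vals <- as.numeric(as.character(col2))

# Replace NA with 0, then turn the result back into a factor
vals[is.na(vals)] <- 0
col2_fixed <- factor(vals)

levels(col2_fixed)
```

Because the factor is rebuilt from numbers here, the levels come out sorted, with "0" first.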