What you show works, but you can simplify your code to a single call by doing the comparison with the %in% binary operator. Here is an example using some dummy data:
set.seed(1)
var <- factor(sample(c("missing","unknown","uncoded", 1:4), 100, replace = TRUE))
This gives us a factor vector like this:
> head(var)
[1] unknown uncoded 2 4 unknown 4
Levels: 1 2 3 4 missing uncoded unknown
> table(var)
var
1 2 3 4 missing uncoded unknown
14 15 17 13 10 18 13
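Before replacing anything, it may help to see what %in% actually returns. The following is a minimal sketch on a small made-up vector (x, bad and mask are illustrative names, not part of the example data above): %in% yields one logical value per element, which is the same result you would get by chaining several == comparisons with |.

```r
# Small hypothetical factor, separate from the dummy data above
x <- factor(c("1", "missing", "2", "unknown", "uncoded"))
bad <- c("missing", "unknown", "uncoded")

# %in% returns one TRUE/FALSE per element of x
mask <- x %in% bad
mask

# Positions of the elements that match any of the codes
which(mask)

# Same result as chaining == comparisons with |
identical(mask, x == "missing" | x == "unknown" | x == "uncoded")
```

A logical vector like mask can be used directly as a subscript, which is what makes the single-call replacement possible.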
To set all those values coded as any of c("missing","unknown","uncoded") to NA, we do it in a single shot:
var2 <- var ## copy for demo purposes, but you can overwrite var if you wish
var2[var2 %in% c("missing","unknown","uncoded")] <- NA
which gives
> var2[var2 %in% c("missing","unknown","uncoded")] <- NA
> head(var2)
[1] <NA> <NA> 2 4 <NA> 4
Levels: 1 2 3 4 missing uncoded unknown
> table(var2)
var2
1 2 3 4 missing uncoded unknown
14 15 17 13 0 0 0
Notice how the original levels are preserved. If you want to remove those levels then we can apply the droplevels() function to var2:
var2 <- droplevels(var2)
which gives
> head(var2)
[1] <NA> <NA> 2 4 <NA> 4
Levels: 1 2 3 4
> table(var2)
var2
1 2 3 4
14 15 17 13
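If the unwanted levels are never needed, the two steps can probably be collapsed into one by assigning NA to the levels themselves. This is only a sketch on a small made-up factor (f and bad are illustrative names); it relies on the behaviour of the levels<- replacement function for factors, where a level set to NA is dropped and its observations become NA.

```r
# Small hypothetical factor, separate from the dummy data above
f <- factor(c("1", "missing", "2", "unknown", "2"))
bad <- c("missing", "unknown", "uncoded")

# Setting a level to NA turns its observations into NA and drops the level
levels(f)[levels(f) %in% bad] <- NA

levels(f)   # only the remaining levels
is.na(f)    # TRUE where an observation had one of the unwanted codes
```

The trade-off is that the original levels are gone immediately, so the approach with a subscript plus droplevels() is the safer choice if you want to inspect the intermediate result first.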
Also note that by default the NA values are not shown in the tabular output, but we can rectify that with the useNA argument to show that they are still there:
> table(var2, useNA = "ifany")
var2
1 2 3 4 <NA>
14 15 17 13 41
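To double check the NA count without relying on the table, you can count the missing values directly. A minimal sketch on a small made-up factor (v, n_na and tab are illustrative names): sum(is.na()) gives the number of NA observations, and useNA = "always" adds an NA column to the table even when that count is zero.

```r
# Small hypothetical factor with two missing values
v <- factor(c("1", NA, "2", NA, "2"))

# Count the NA observations directly
n_na <- sum(is.na(v))
n_na

# "always" reports the NA column even if there are no NAs;
# "ifany" only adds it when at least one NA is present
tab <- table(v, useNA = "always")
tab
```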