Note that the solutions below return factor columns as factors and do not re-order the factor levels.
The approach in the question could re-order factor levels because that
code converts the factor columns to character and then back to factor
using the default (sorted) level ordering, which may not have been the initial ordering.
For example, this shows what can happen if tolower is applied directly to a factor and the result is converted back with factor:
fac <- factor(c("One", "None", "Two"), levels = c("One", "Two", "None"))
fac
## [1] One None Two
## Levels: One Two None
factor(tolower(fac)) # order of levels has changed!
## [1] one none two
## Levels: none one two
In particular, that implies that the order in sort(fac) does not correspond to the order in sort(factor(tolower(fac))).
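As a minimal sketch of that point, the following compares the two sort orders and then shows that renaming the levels in place keeps their positions. The name fac_lc is introduced here only for illustration:

```r
fac <- factor(c("One", "None", "Two"), levels = c("One", "Two", "None"))

# sort() on a factor follows the level order, not alphabetical order
as.character(sort(fac))
## [1] "One"  "Two"  "None"

# Rebuilding the factor from lower-cased strings sorts the levels alphabetically
as.character(sort(factor(tolower(fac))))
## [1] "none" "one"  "two"

# Renaming the levels in place changes only the labels, not their positions
fac_lc <- fac
levels(fac_lc) <- tolower(levels(fac_lc))
as.character(sort(fac_lc))
## [1] "one"  "two"  "none"
```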
We now discuss some alternative solutions to the question which do not have the reordering problem.
1) Create a function lc_lev that lower-cases the levels of a factor and passes any non-factor input through unchanged. Then lapply it over the columns of the input (here we use the built-in CO2 data set) and use replace to put the results back into a data.frame of the same shape:
lc_lev <- function(x) {
  if (is.factor(x)) levels(x) <- tolower(levels(x))
  x
}
replace(CO2, TRUE, lapply(CO2, lc_lev))
1a) This would also work:
CO2[] <- lapply(CO2, lc_lev)
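One way to check the result is sketched below. It works on a copy (co2_lc, a name made up for this example) so that the built-in CO2 stays untouched, and it compares the levels of the Type column, whose original order is not alphabetical:

```r
lc_lev <- function(x) {
  if (is.factor(x)) levels(x) <- tolower(levels(x))
  x
}

co2_lc <- CO2                        # copy, so the original is left as is
co2_lc[] <- lapply(co2_lc, lc_lev)

levels(CO2$Type)
## [1] "Quebec"      "Mississippi"
levels(co2_lc$Type)                  # same positions, lower-cased labels
## [1] "quebec"      "mississippi"
levels(factor(tolower(CO2$Type)))    # the round trip through character re-sorts
## [1] "mississippi" "quebec"

is.ordered(co2_lc$Plant)             # ordered factors stay ordered
## [1] TRUE
identical(dim(CO2), dim(co2_lc))     # same shape as the input
## [1] TRUE
```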
2) Another approach is to use S3. The generic (first line) dispatches to the factor method if the input is a factor and to the default method otherwise:
lc_lev2 <- function(x, ...) UseMethod("lc_lev2")
lc_lev2.factor <- function(x, ...) { levels(x) <- tolower(levels(x)); x }
lc_lev2.default <- identity
CO2[] <- lapply(CO2, lc_lev2)
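To see the dispatch in isolation, here is a small sketch that calls the generic on a few inputs. The objects f, ord and the sample values are made up for the example, and the default method is written out as an explicit function rather than identity so that it also accepts the generic's ... argument:

```r
lc_lev2 <- function(x, ...) UseMethod("lc_lev2")
lc_lev2.factor <- function(x, ...) { levels(x) <- tolower(levels(x)); x }
lc_lev2.default <- function(x, ...) x

# A factor goes to lc_lev2.factor; level positions are kept
f <- factor(c("B", "A"), levels = c("B", "A"))
levels(lc_lev2(f))
## [1] "b" "a"

# Anything else goes to lc_lev2.default and comes back unchanged
lc_lev2(c("B", "A"))
## [1] "B" "A"
lc_lev2(1:3)
## [1] 1 2 3

# Ordered factors have class c("ordered", "factor"), so they also
# reach the factor method and remain ordered
ord <- factor(c("Lo", "Hi"), levels = c("Lo", "Hi"), ordered = TRUE)
is.ordered(lc_lev2(ord))
## [1] TRUE
```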