I have some results in a data frame with a few factor columns and many numeric columns. I can easily convert the numeric columns to numeric with indexing, as per the answer to this question.
#create example data
df = data.frame(replicate(1000, sample(1:10, 1000, replace=TRUE)))
df$X1 = LETTERS[df$X1]
df$X2 = LETTERS[df$X2]
df$X3 = LETTERS[df$X3]
df[-1] <- sapply(df[-1], function(x) ifelse(runif(length(x)) < 0.1, NA, x))
#find columns that are factors
factornames = c("X1", "X2", "X3")
factorfilt = names(df) %in% factornames
#convert non-factor columns to numeric
df[, !factorfilt] = as.numeric(as.character(unlist(df[, !factorfilt])))
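To make explicit what that line does, here is a minimal sketch on a small made-up data frame (small, flat and the column names a and b are hypothetical, not part of my real data). As far as I can tell, unlist() flattens the selected columns into one long vector, and the indexed assignment then spreads that vector back over the columns:

```r
# hypothetical toy data, only for illustration
small <- data.frame(a = c("1", "2"), b = c("3", NA), stringsAsFactors = FALSE)

# unlist() flattens both columns into a single vector of length nrow * ncol
flat <- as.numeric(unlist(small))
length(flat)

# assigning the flat vector back fills the selected columns in order
small[, c("a", "b")] <- flat
sapply(small, class)
```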
But when I want to do the same for my factor columns, I can't get the same indexing to work:
#convert factor columns to factor
df[, factorfilt] = as.factor(as.character(unlist(df[, factorfilt])))
class(df$X1)
[1] "character"
df[, factorfilt] = as.factor(as.character(df[, factorfilt]))
class(df$X1)
[1] "character"
df[, factorfilt] = as.factor(unlist(df[, factorfilt]))
class(df$X1)
[1] "character"
df[, factorfilt] = as.factor(df[, factorfilt])
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?
The first three attempts all leave class(df$X1) as "character" (the fourth throws the error shown), while if I run df$X1 = as.factor(df$X1) it returns "factor".
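For comparison, converting column by column does seem to give factors. This is only a sketch on a toy data frame (toy and cols are hypothetical names, not my real data), and it assumes that looping over the selected columns with lapply() is an acceptable workaround:

```r
# hypothetical toy data, only for illustration
toy <- data.frame(X1 = c("A", "B", "A"),
                  X2 = c("C", "C", "D"),
                  X4 = c(1, 2, 3),
                  stringsAsFactors = FALSE)
cols <- c("X1", "X2")

# lapply() returns a list with one factor per column,
# so each column keeps its own class and levels on assignment
toy[cols] <- lapply(toy[cols], as.factor)
sapply(toy, class)
```

Here each column gets its own set of levels, which is what the single-column df$X1 = as.factor(df$X1) call does as well.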
Why does indexing this way not work when I call as.factor, but does if I call as.numeric?