0

I wrote the following function to convert every character column of a data frame (x) into a factor, but I get the error message "Error in if (e[i]) { : argument is not interpretable as logical". Any help would be appreciated.

f<-function(x){
e<-lapply(x, is.character)
i <- 1
while (i >= 1) {
if(e[i]) {as.factor(x[[i]])}
else {x[i]}
}
x
}
Guess Gucci
  • Not quite a duplicate; more like the inverse of [this](http://stackoverflow.com/q/2851015/324364). The accepted answer probably isn't what you're looking for, but several of the others will be helpful. – joran Feb 13 '13 at 20:15
  • @Joran I hope you don't mind that I borrowed this and used it as a response. Maybe we should back the trolley up altogether and check whether the data was read in as character or factor; we might have been able to accomplish this at read-in. – Tyler Rinker Feb 13 '13 at 20:19

3 Answers

4

You can use:

char2factor <- function(df) {
  data.frame(lapply(df, function (v) {
    if (is.character(v)) factor(v)
    else v
  }))
}

For example, if you had the following data:

df <- data.frame(v1=LETTERS[1:5],v2=1:5,stringsAsFactors=FALSE)
df
#   v1 v2
# 1  A  1
# 2  B  2
# 3  C  3
# 4  D  4
# 5  E  5
lapply(df, class)
# $v1
# [1] "character"
# 
# $v2
# [1] "integer"

You would get:

char2factor(df)
#   v1 v2
# 1  A  1
# 2  B  2
# 3  C  3
# 4  D  4
# 5  E  5
lapply(char2factor(df), class)
# $v1
# [1] "factor"
# 
# $v2
# [1] "integer"
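
As for the error in the question itself: `lapply` returns a list, so `e[i]` is a one-element list rather than a logical value, and `if ()` cannot interpret it. The loop also never increments `i`, and the result of `as.factor` is never assigned back. The following is a minimal sketch of one way around this, assuming the same example `df` as above; the name `is_chr` is only illustrative.

```r
df <- data.frame(v1 = LETTERS[1:5], v2 = 1:5, stringsAsFactors = FALSE)

e <- lapply(df, is.character)
is.list(e[1])       # single brackets keep the list wrapper, so if (e[1]) fails
is.logical(e[[1]])  # double brackets extract the logical value itself

# vapply() returns a plain logical vector, one element per column
is_chr <- vapply(df, is.character, logical(1))

# convert only the character columns and assign the result back
df[is_chr] <- lapply(df[is_chr], factor)
str(df)
```

Using `vapply` with `logical(1)` makes the expected result type explicit, so an unexpected return value raises an error instead of silently producing a list.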
juba
  • Why would R know the variable "v" actually refers to the columns? Thanks. – Guess Gucci Feb 13 '13 at 21:49
  • A data frame is a list of columns. Using lapply on a data frame applies the function given as an argument to each column in turn, and the column is passed as the argument to this function. So v here is, successively, each column of your data frame. Hmm, not sure that is very clear... – juba Feb 13 '13 at 21:53
  • The example given by R below regarding lapply is easy to understand since the function "mean" contains no variable at all. Thanks anyway. `x <- list(a = 1:10, beta = exp(-3:3), logic = c(TRUE,FALSE,FALSE,TRUE)); lapply(x, mean)` – Guess Gucci Feb 13 '13 at 22:09
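
To illustrate the point made in the comments, here is a small sketch showing that the argument `v` of the anonymous function receives each column in turn. It assumes the same example data as the answer; `col_classes` and `col_classes_loop` are illustrative names only.

```r
df <- data.frame(v1 = LETTERS[1:5], v2 = 1:5, stringsAsFactors = FALSE)

# lapply() calls the function once per column; v is that column's vector
col_classes <- lapply(df, function(v) class(v))

# the same computation written as an explicit loop over the columns
col_classes_loop <- list()
for (name in names(df)) {
  v <- df[[name]]                      # v is one column, as in the lapply() call
  col_classes_loop[[name]] <- class(v)
}

identical(col_classes, col_classes_loop)
```

The name `v` is arbitrary: it is simply the parameter of the function passed to `lapply`, in the same way that `mean` receives each list element as its first argument.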
3

EDIT: Per joran's comment (this can be done in one succinct line):

Use:

data.frame(lapply(dat, "["), stringsAsFactors = TRUE)

In context:

# make fake data
dat <- data.frame(w = state.abb[1:10], x = LETTERS[1:10], y = rnorm(10),
   z = 1:10, stringsAsFactors = FALSE)
str(dat)

dat2 <- data.frame(lapply(dat, "["), stringsAsFactors = TRUE)
str(dat2)

This is the approach I think I would take (EDIT: not anymore):

FUN <- function(x) {
    if (is.character(x)) {
        x <- as.factor(x)
    }
    x
}

# apply FUN to each column of dat in turn
for(i in seq_along(dat)) {
    dat[, i] <- FUN(dat[, i])
}

str(dat)
Tyler Rinker
1

Using colwise from plyr, you can do:

    library(plyr)
    dat <- colwise(function(x) {
                        if(is.character(x)) as.factor(x) else x
                    })(dat)
agstudy