
I just want to convert two columns of a data frame to factors. I used the apply function, but the result is a character matrix, not factors. Any idea what I am doing wrong?

aa <- c(1,2,3,4)
bb <- c(6,7,8,9)
xx <- data.frame(aa, bb)
xx

yy <- apply(xx, 2, function(xx) as.factor(xx))
#      aa  bb 
# [1,] "1" "6"
# [2,] "2" "7"
# [3,] "3" "8"
# [4,] "4" "9"

When I apply the same conversion to a standalone vector, it works:

nn <- c(1,2,3,4)
mm <- as.factor(nn)
mm
Developer
Apostolos
  • `apply` is returning a matrix, which requires that all elements be the same type. Try using `as.data.frame(lapply(xx, factor))`. – Benjamin Oct 26 '15 at 13:51
  • yes, it works. Please make it an answer. It is not obvious that this must be done via a list. – Apostolos Oct 26 '15 at 13:57
  • @akrun reopening this dupe is just childish behaviour. I can't close all the dupes on SO. If you can, you are welcome to do so. – David Arenburg Oct 26 '15 at 15:36

3 Answers


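apply is usually not suitable for data frames: it converts its input to a matrix and returns a matrix, and because a matrix cannot hold factors, the factor results are coerced to character. You could use lapply instead, which works column by column and returns a list: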

yy <- data.frame(lapply(xx, as.factor))
str(yy)
#'data.frame':  4 obs. of  2 variables:
# $ aa: Factor w/ 4 levels "1","2","3","4": 1 2 3 4
# $ bb: Factor w/ 4 levels "6","7","8","9": 1 2 3 4
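
To make the difference concrete, here is a minimal sketch that compares the two calls on the question's data. The names `via_apply` and `via_lapply` are only illustrative and do not appear in the original post.

```r
# Same toy data as in the question
xx <- data.frame(aa = c(1, 2, 3, 4), bb = c(6, 7, 8, 9))

via_apply  <- apply(xx, 2, as.factor)   # illustrative name
via_lapply <- lapply(xx, as.factor)     # illustrative name

is.matrix(via_apply)        # apply() hands back a matrix, not a data frame
typeof(via_apply)           # the factor results were coerced to character
sapply(via_lapply, class)   # lapply() keeps each column as a factor
```

Wrapping the `lapply` result in `data.frame()` (or `as.data.frame()`) then gives back a data frame whose columns are still factors.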

If you still have the original vectors, you could also convert them when building the data frame:

xx <- data.frame(aa = as.factor(aa), bb = as.factor(bb))
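
If only some columns should become factors, the same `lapply` idea can be applied to a subset and assigned back in place. This is a rough sketch, assuming a hypothetical extra column `cc` and a hypothetical selection vector `cols`, neither of which is in the original question.

```r
# Toy data: aa and bb from the question, plus a hypothetical numeric column cc
xx <- data.frame(aa = c(1, 2, 3, 4),
                 bb = c(6, 7, 8, 9),
                 cc = c(0.5, 1.5, 2.5, 3.5))

cols <- c("aa", "bb")                     # hypothetical: columns to convert
xx[cols] <- lapply(xx[cols], as.factor)   # replace only those columns

str(xx)   # aa and bb are now factors; cc is left numeric
```

Assigning into `xx[cols]` keeps `xx` a data frame, so no `data.frame()` wrapper is needed here.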
Molx

I would do something like:

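library(dplyr)
# mutate_each() and funs() are deprecated in current dplyr; across() is the replacement
yy <- xx %>% mutate(across(everything(), as.factor))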

This applies as.factor to each column in xx.

Paul Hiemstra
  • You could also do `xx %<>% mutate_each(funs(as.factor))` using the `magrittr` package in order to update `xx` without creating `yy` – David Arenburg Oct 26 '15 at 14:56

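Or, with data.table, you can do the following (it returns a new data.table, so assign the result if you want to keep it):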

library(data.table)
setDT(xx)[, lapply(.SD, as.factor)]
akrun