
I have a data frame that contains about 100 factor variables that I would like to convert to numeric type. How can I do this for the whole data frame? I know that I can do it for each variable individually, for example with `dat$Var2 <- as.numeric(dat$Var2)`, but I would like to do it for many variables at once. Here is an example data frame.

   dat <- read.table(text = " TargetVar  Tar_Var1    Var2       Var3
     0        0        0         7
     0        0        1         1
     0        1        0         3
     0        1        1         7
     1        0        0         5
     1        0        1         1
     1        1        0         0
     1        1        1         6
     0        0        0         8
     0        0        1         5
     1        1        1         4
     0        0        1         2
     1        0        0         9
     1        1        1         2  ", header = TRUE)
Arun
mql4beginner

1 Answer


You can use lapply:

dat2 <- data.frame(lapply(dat, function(x) as.numeric(as.character(x))))

   TargetVar Tar_Var1 Var2 Var3
1          0        0    0    7
2          0        0    1    1
3          0        1    0    3
4          0        1    1    7
5          1        0    0    5
6          1        0    1    1
7          1        1    0    0
8          1        1    1    6
9          0        0    0    8
10         0        0    1    5
11         1        1    1    4
12         0        0    1    2
13         1        0    0    9
14         1        1    1    2


str(dat2)
'data.frame':   14 obs. of  4 variables:
 $ TargetVar: num  0 0 0 0 1 1 1 1 0 0 ...
 $ Tar_Var1 : num  0 0 1 1 0 0 1 1 0 0 ...
 $ Var2     : num  0 1 0 1 0 1 0 1 0 1 ...
 $ Var3     : num  7 1 3 7 5 1 0 6 8 5 ...
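The nested `as.numeric(as.character(x))` call matters when the columns really are factors. A minimal sketch of why, using a small factor `f` that is made up for illustration and is not part of the question's data:

```r
# A factor stores integer codes that index into its sorted levels.
f <- factor(c(7, 1))

lv    <- levels(f)                    # "1" "7"
codes <- as.numeric(f)                # 2 1  (internal codes, not the values)
vals  <- as.numeric(as.character(f))  # 7 1  (the original values)

print(lv)
print(codes)
print(vals)
```

Converting through `as.character()` first recovers the labels, which are then parsed as numbers; calling `as.numeric()` directly returns the level codes instead.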
Sven Hohenstein
  • Thanks Sven, it works. Can I use sapply for this task? – mql4beginner May 05 '14 at 06:46
  • Just be careful using `as.numeric()`: see what happens with `as.numeric(factor(c(7, 1)))`. You might need `as.numeric(as.character(x))` instead. – alexwhan May 05 '14 at 06:51
  • +1 @alexwhan, was going to say the same, see: http://stackoverflow.com/a/14717814/1036500 http://stackoverflow.com/a/6328860/1036500 & http://stackoverflow.com/a/2293313/1036500 – Ben May 05 '14 at 06:54
  • @user1024441, why do you want to use `sapply` for this? `lapply` is more appropriate. If you want to overwrite the values in `dat` (not create a new `data.frame`), you can also just use `dat[] <- lapply(dat, function(x) as.numeric(as.character(x)))` – A5C1D2H2I1M1N2O1R2T1 May 05 '14 at 07:04
  • Thanks Ananda, can you please explain why lapply is more appropriate than sapply? – mql4beginner May 05 '14 at 07:10
  • @alexwhan Good point. I modified the code accordingly. – Sven Hohenstein May 05 '14 at 08:40
  • @user1024441, a `data.frame` is essentially a special type of `list`. If you use the approach I describe, you are essentially directly replacing the columns. Also, `lapply` is generally faster than `sapply` because `sapply` calls `lapply` anyway, and then checks to see whether the output can be simplified to an array (using the `simplify2array` function). You can do a few benchmarks to check on your own, but my quick test shows that even with this small dataset, `lapply` is considerably faster. – A5C1D2H2I1M1N2O1R2T1 May 05 '14 at 10:20
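The `lapply` versus `sapply` difference discussed in the comments above can be seen in a small sketch. The data frame `df` and the helper `to_num` are hypothetical names chosen for illustration; they do not appear in the question or the answer.

```r
# Hypothetical data frame with two factor columns (not the question's data).
df <- data.frame(a = factor(c("3", "10", "7")),
                 b = factor(c("0", "1", "1")))

# Helper: convert a factor to the numbers its labels represent.
to_num <- function(x) as.numeric(as.character(x))

res_l <- lapply(df, to_num)  # list with one numeric vector per column
res_s <- sapply(df, to_num)  # simplified into a numeric matrix

print(is.list(res_l))    # TRUE
print(is.matrix(res_s))  # TRUE

# Assigning into df[] replaces the columns in place and keeps the
# data.frame structure and column names.
df[] <- res_l
str(df)
```

Because a data frame is a list of columns, the list returned by `lapply` slots straight back in, whereas the matrix from `sapply` would need to be converted back to a data frame first.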