
I have read a lot of posts on this forum about converting the columns of a data.frame from character to numeric, but I have not managed to solve my problem.

So..

My data.frame is loaded from my personal folder using

a <- read.table("pe.txt", header=T, sep="", dec=",")

The numbers in pe.txt use "," as the decimal separator. The resulting "a" is:

                    PHYLA  Ti_01  T2_01  T4_01  Ti_02
1           Acidobacteria  0.000  0.000  0.000  0.000
2          Actinobacteria  0.506  0.055  0.187  0.261
3 Archaea - Euryarchaeota  0.000  0.000  0.000  0.000
4           Bacteroidetes 48.902 57.823 28.495 53.450
5           Cyanobacteria  0.011  0.011  0.712  0.000

Afterwards, I removed the "PHYLA" column and used its values as the row names.

Results:

                         Ti_01  T2_01  T4_01  Ti_02
Acidobacteria            0.000  0.000  0.000  0.000
Actinobacteria           0.506  0.055  0.187  0.261
Archaea - Euryarchaeota  0.000  0.000  0.000  0.000
Bacteroidetes           48.902 57.823 28.495 53.450
Cyanobacteria            0.011  0.011  0.712  0.000

Now the whole table consists of numbers, so I tried to convert everything to numeric. No luck! I used the code below to check whether the columns are numeric or character, and this is the result:

    sapply(a,mode)
    PHYLA     Ti_01     T2_01     T4_01     Ti_02 
"numeric" "numeric" "numeric" "numeric" "numeric"

My goal in the end is to sum the values across the rows...

apply(a,2,sum)
Error in FUN(newX[, i], ...) : invalid 'type' (character) of argument
rowSums(a)
Error in rowSums(a) : 'x' must be numeric

I think the problem arises in "read.table". From that point on, the R output looks suspicious to me...

Giffredo
  • Please check here: http://stackoverflow.com/questions/2288485/how-to-convert-a-data-frame-column-to-numeric-type. It may be better to show the `dput` output, as copy/pasting the example you showed didn't produce any error with `rowSums` or `apply` – akrun May 02 '15 at 16:30
  • I suspect that the columns are of class factor. There must be some non-numeric elements in the columns that turn them into factors, given that `stringsAsFactors = FALSE` was not specified. `mode` returns "numeric" for a factor: after `df1[] <- lapply(df1, factor)`, `sapply(df1, mode)` gives `"numeric"` for Ti_01, T2_01, T4_01 and Ti_02. So instead, try `sapply(df1, class)` or just `str(df1)`. If the columns are factors, convert them to numeric with `df1[] <- lapply(df1, function(x) as.numeric(as.character(x)))` – akrun May 02 '15 at 16:35
  • Thanks for the replies, so fast! Yes, you are right, the columns are of class factor. Now I have to try to understand your solution, `df1[] <- lapply(df1, function(x) as.numeric(as.character(x)))`. I have been using R for only 3 days and it is not clear to me why I should use this code instead of `df1[] <- lapply(df1, as.numeric(as.character(x)))`. And at which point should I specify `stringsAsFactors = FALSE`? – Giffredo May 03 '15 at 11:24
  • In the modified code, `x` is not defined. You need to wrap the call in an anonymous function, i.e. `function(x)`. You can specify `stringsAsFactors=FALSE` within `read.csv`/`read.table` so that the non-numeric columns will be of class character, which you can later convert with `df1[] <- lapply(df1, as.numeric)` – akrun May 03 '15 at 12:09
  • OK! It is clearer now! Thanks akrun, you have been very helpful! – Giffredo May 03 '15 at 17:01
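
The diagnosis from the comments can be reproduced with a small sketch. The inline text below is a hypothetical stand-in for pe.txt, and the stray "n.d." entry is an assumption made only to force one column to be read as a factor; the real file may contain something different. `stringsAsFactors = TRUE` is set explicitly because it was the default when this thread was written, while R 4.0.0 and later default to `FALSE`. The `sub()` step is also an addition: `as.numeric()` does not honour `dec = ","`, so comma decimals left inside a factor column have to be rewritten before conversion.

```r
# Hypothetical stand-in for pe.txt; "n.d." is an assumed stray entry
# that keeps read.table from parsing T2_01 as numeric.
txt <- "PHYLA Ti_01 T2_01
Acidobacteria 0,000 0,000
Actinobacteria 0,506 n.d.
Bacteroidetes 48,902 57,823"

df1 <- read.table(text = txt, header = TRUE, dec = ",",
                  row.names = 1, stringsAsFactors = TRUE)

modes   <- sapply(df1, mode)   # "numeric" for both columns, even the factor
classes <- sapply(df1, class)  # shows that T2_01 is a factor

# Convert factor columns via character. as.numeric() on the factor itself
# would return the internal level codes, not the values.
df1[] <- lapply(df1, function(x) {
  if (is.factor(x)) {
    # Replace the decimal comma first; "n.d." becomes NA (warning suppressed).
    suppressWarnings(as.numeric(sub(",", ".", as.character(x), fixed = TRUE)))
  } else {
    x
  }
})

sums <- rowSums(df1, na.rm = TRUE)
print(classes)
print(sums)
```

Running `str(df1)` before the conversion is usually the quickest way to spot which columns need this treatment.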
