I am trying to convert one column (X.2) of my data frame (frost) from factor to numeric. When I refer to X.2 alone instead of frost$X.2 it seems to work, but str(frost) still shows the column as a factor.

frost=read.csv2("Database_REL_Umea_aktuell.csv")
frost

as.numeric(as.character(frost$X.2))
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [28] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [55] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
 [82] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[109] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[136] NA NA
Warning message:
NAs introduced by coercion

str(frost)

'data.frame':   137 obs. of  6 variables:
 $ Database.frost.damage.Umea: Factor w/ 7 levels "","Hylocomium splendens (HS)",..: 1 5 3 3 3 3 3 3 3 3 ...
 $ X                         : Factor w/ 5 levels "","C","SR1","SR10",..: 1 5 4 4 4 4 4 4 4 4 ...
 $ X.1                       : Factor w/ 11 levels "","C-1","C-2",..: 1 5 9 9 9 10 10 10 11 11 ...
 $ X.2                       : Factor w/ 136 levels "","0,012573",..: 1 136 110 99 129 105 82 112 94 69 ...
 $ X.3                       : Factor w/ 5 levels "","a","b","c",..: 1 5 2 3 4 2 3 4 2 3 ...
 $ X.4                       : logi  NA NA NA NA NA NA ...

Does anyone know why it doesn't work? Thanks for the help!

Andrew Savinykh
  • Welcome to StackOverflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to produce a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). This will make it much easier for others to help you. – Jaap Jun 25 '14 at 16:07
  • Why not use `stringsAsFactors = FALSE` in your call to `read.csv2`? Furthermore, if the column were really numeric, R would have read it as numeric. This leads me to believe you have non-numeric values in that column. There is no reason to have to do this if the data is read into R correctly. – Rich Scriven Jun 25 '14 at 16:20
  • Use `dec = ","` when you read the data. – Roman Luštrik Jun 25 '14 at 17:32
  • @RomanLuštrik , `read.csv2()` should have taken care of that. – Ben Bolker Jun 25 '14 at 17:36
  • @BenBolker good point. Something funky is going on then. – Roman Luštrik Jun 25 '14 at 18:07
  • use the answers from http://stackoverflow.com/questions/15236440/as-numeric-with-comma-decimal-separators to convert to numeric, then take a look at the original values of the elements that got converted to `NA`. – Ben Bolker Jun 25 '14 at 18:19
  • Can you please provide the output of `frost$X.2` ? – David Arenburg Jun 27 '14 at 06:00

1 Answer

When you type this,

as.numeric(as.character(frost$X.2))

it does not change anything in the data frame; it only prints the converted values to the console. To keep the result, you have to assign it back to the column.
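A minimal sketch of that difference, using a small made-up data frame `df` (a hypothetical stand-in, not the asker's data):

```r
# Hypothetical stand-in for the real data: a factor column holding numbers as text
df <- data.frame(x = factor(c("1.5", "2.5", "10")))

# Calling the conversion without assignment only prints the result
as.numeric(as.character(df$x))
class(df$x)   # still "factor"

# Assigning the result back replaces the column in the data frame
df$x <- as.numeric(as.character(df$x))
class(df$x)   # now "numeric"
```

The same pattern applies to `frost$X.2`, provided its values can actually be parsed as numbers.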

As for why you are getting NAs: it looks like the column contains blank entries and values of the form "X,XXXX" (with a decimal comma), and as.numeric() turns both into NA.

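So first replace the comma with a decimal point, then convert to numeric and assign the result back:

# as.character() gives the labels; sub() swaps the decimal comma for a point
frost$X.2 <- as.numeric(sub(",", ".", as.character(frost$X.2), fixed = TRUE))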
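To see which entries still fail after the comma is replaced, something like the following may help. This is only a sketch: the vector `x` is invented to mimic the kind of levels that `str(frost)` hints at (a blank, comma-decimal numbers, and a stray text entry such as a second header row), and is not the asker's actual data.

```r
# Hypothetical values: a blank, two comma-decimal numbers, and a text entry
x <- factor(c("", "0,012573", "1,5", "some header"))

# Swap the decimal comma for a point, then convert;
# the coercion warning is silenced because the NAs are inspected explicitly below
chr <- as.character(x)
num <- suppressWarnings(as.numeric(sub(",", ".", chr, fixed = TRUE)))

# Original strings that could not be converted
chr[is.na(num)]
```

Whatever shows up in that last vector was never numeric in the first place, which would also explain why the column was read in as a factor.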
kng229
  • This will not solve his problem. He is getting `NA`s for all of his values when running `as.numeric(as.character(frost$X.2))` – David Arenburg Jun 27 '14 at 06:01
  • Agreed, it's difficult with the limited info, but I went ahead and edited my answer to suggest how to fix that, since the output of str() provides some clues! – kng229 Jun 27 '14 at 15:59