11

I have a data set that contains a column of class factor, and when I convert it to numeric I get the warning message below. This is the R code I wrote to convert the factor to numeric:

class(usedcars$Price)
[1] "factor"

e <- paste(usedcars$Price)
e <- as.numeric(paste(usedcars$Price))
Warning message:
NAs introduced by coercion 

The result has class numeric, but every value has been converted to NA. Could anyone help me convert the factor to numeric in R without getting these NAs and the warning?

sam
  • Is your question: How do I convert a number stored as a factor to numeric? If so, you don't just want to get rid of the warning, presumably... does `e <- as.numeric(as.character(usedcars$Price))` help? – alexwhan Oct 01 '13 at 12:43
  • @alexwhan - yes, it converts the data to numeric, but all of my data is changed to NA – sam Oct 01 '13 at 12:50
  • OK, perhaps posting your data would be a good idea...? – alexwhan Oct 01 '13 at 12:58
  • @alexwhan - friend it contains 72000 rows :( – sam Oct 01 '13 at 13:01
  • Have a look at http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. Post `str(usedcars)`, `head(usedcars)` etc – alexwhan Oct 01 '13 at 13:04
  • `str(usedcars)` $ RefId : int 1 2 3 4 5 6 7 8 9 10 ... $ IsBadBuy : int 0 0 0 0 0 0 0 0 0 0 ... $ Price : Factor w/ 10316 levels "0","10000","10001",..: 7882 7887 $ PurchDate : Factor w/ 518 levels "01-04... – sam Oct 01 '13 at 13:09
  • Why is `paste` here when you just want to convert factors into numeric? – Metrics Oct 01 '13 at 14:04
  • @Metrics `> all(paste(1:10)==as.character(1:10)) [1] TRUE`, even though it is not 'best practice' – Michele Oct 01 '13 at 14:48
  • @sam instead of writing a comment, please edit the question and post the result of `dput(head(usedcars))` – Michele Oct 01 '13 at 14:51

4 Answers

19

This happens when you use as.numeric on values that cannot be parsed as numbers.

My guess is that your numbers have "," in them (for example 1,285), so first make your factor values "clean" with db <- gsub(",","",db) and then run as.numeric(db).
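
A minimal sketch of that cleanup, assuming the prices use commas as thousands separators. The `price` vector here is made-up illustration data, not the asker's `usedcars$Price`:

```r
# Hypothetical prices stored as a factor, some with thousands separators
price <- factor(c("1,285", "950", "12,400"))

# Direct conversion: entries containing "," cannot be parsed and become NA
# (without suppressWarnings this emits "NAs introduced by coercion")
raw <- suppressWarnings(as.numeric(as.character(price)))
print(raw)

# Strip the commas from the labels first, then convert
clean <- gsub(",", "", as.character(price), fixed = TRUE)
num <- as.numeric(clean)
print(num)
```

Note that gsub returns a character vector, so the factor is converted to character along the way.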

Sandler
5

I know this was asked a long time ago but since it doesn't have an accepted answer I would like to add this:

e <- as.numeric(as.factor(usedcars$Price))

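When paste is used, the price is converted to character and then to numeric, which gives NA for every value that is not a clean number. Be aware that as.numeric applied directly to a factor returns the integer level codes rather than the original values, so check that the result is what you expect.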
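
A small sketch of the difference between a factor's level codes and its labels, using made-up prices (the `price` vector is an assumption for illustration only):

```r
# Hypothetical prices stored as a factor
price <- factor(c("5000", "12000", "5000", "300"))

# as.numeric() on a factor returns the internal level codes,
# not the numbers shown in the labels
codes <- as.numeric(price)
print(codes)

# Converting through character recovers the labels as numbers
values <- as.numeric(as.character(price))
print(values)

# Equivalent approach that converts each level only once
values2 <- as.numeric(levels(price))[price]
print(values2)
```

No warning appears in the first call because the codes are always valid integers, which is why the missing warning alone does not show that the conversion worked.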

Leocode
4

I'll try to replicate your problem:

set.seed(1)
a <- factor(sample(1:100, 10))
> a
 [1] 27 37 57 89 20 86 97 62 58 6 
Levels: 6 20 27 37 57 58 62 86 89 97

The suggestion in alexwhan's comment actually works fine:

> as.numeric(as.character(a))
 [1] 27 37 57 89 20 86 97 62 58  6

Even if your data has leading or trailing spaces that would need to be trim()ed, it still works:

> paste( " ", a, " ")
 [1] "  27  " "  37  " "  57  " "  89  " "  20  " "  86  " "  97  " "  62  " "  58  " "  6  " 
> as.numeric(paste( " ", a, " "))
 [1] 27 37 57 89 20 86 97 62 58  6

So the only explanation is that you have some (unexpected) character in all your numbers:

> as.numeric(paste(a, "a"))
 [1] NA NA NA NA NA NA NA NA NA NA
Warning message:
NAs introduced by coercion 
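
One way to find the offending values is to test the factor levels rather than all 72000 rows. This is only a sketch on made-up data; `x` stands in for a column like `usedcars$Price`:

```r
# Hypothetical factor with a few labels that are not clean numbers
x <- factor(c("100", "2,500", "n/a", "300"))

# Try to convert each distinct level once and keep the ones that fail
lv <- levels(x)
bad <- lv[is.na(suppressWarnings(as.numeric(lv)))]
print(bad)
```

Looking at `bad` usually makes the unexpected character obvious (a comma, a currency symbol, text such as "n/a").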

If you can't see any unexpected character, it may be an invisible one. The following happened to me:

> paste( intToUtf8(160), a, intToUtf8(160))
 [1] "  27  " "  37  " "  57  " "  89  " "  20  " "  86  " "  97  " "  62  " "  58  " "  6  " 
> as.numeric(paste( intToUtf8(160), a, intToUtf8(160)))
 [1] NA NA NA NA NA NA NA NA NA NA

intToUtf8(32) is the usual white space from the keyboard (as used a few lines above), but code point 160 is the non-breaking space: it looks the same but is a different character, which as.numeric (and also trim from gdata) doesn't recognise as white space, so the result is NA.
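
A sketch of how such a hidden character could be detected and removed, assuming the culprit is the non-breaking space (code point 160) and a UTF-8 session. The data here are invented for the example:

```r
# Hypothetical values wrapped in non-breaking spaces (code point 160)
nbsp <- intToUtf8(160)
x <- factor(paste0(nbsp, c("27", "37"), nbsp))
chars <- as.character(x)

# Inspect the code points of one value to reveal invisible characters
points <- utf8ToInt(chars[1])
print(points)

# Remove the non-breaking spaces, then convert
cleaned <- gsub(nbsp, "", chars, fixed = TRUE)
num <- as.numeric(cleaned)
print(num)
```

Any code point in `points` outside the digits (48 to 57), the decimal point (46) and the minus sign (45) is a candidate for removal.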

Michele
2

You could try retype from the hablar package. If the problem is that the values use commas instead of dots as the decimal mark, it replaces the commas with dots before converting. Example:

library(hablar)
library(dplyr)

df <- tibble(a = as.factor(c("1,56", "5,87")))

df %>% retype()

gives you:

# A tibble: 2 x 1
      a
  <dbl>
1  1.56
2  5.87
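
For comparison, a base R sketch of the same decimal-comma conversion without any packages. It assumes each value contains at most one comma and that the comma is the decimal mark:

```r
# Same illustrative values as above, stored as a factor
a <- factor(c("1,56", "5,87"))

# Replace the decimal comma with a dot, then convert
num <- as.numeric(sub(",", ".", as.character(a), fixed = TRUE))
print(num)
```
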
davsjob