I have a data frame and I need to convert two variables from factor to numeric. I tried the following:

        df$QTY.SHIPPED=as.numeric(df$QTY.SHIPPED)
        df$PRE.TAX.TOTAL.=as.numeric(df$PRE.TAX.TOTAL.)

The quantity shipped converts correctly because it is already in integer format. However, PRE.TAX.TOTAL. does not convert correctly.

    PRE.TAX.TOTAL.(Factor)  PRE.TAX.TOTAL.(Numerical)
       57.8                     3856
       210                      2159

Does anybody have an idea why it is converting this way?

Thank you

admdrew
user1783504

2 Answers

Convert to character first and then to numeric. Otherwise, as.numeric returns the underlying integer codes that encode the factor levels, not the values the levels display.

> v<-factor(c("57.8","82.9"))
> as.numeric(v)
[1] 1 2
> as.numeric(as.character(v))
[1] 57.8 82.9
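
Applied to a data frame column, the same idea might look like the sketch below. The data frame here is a small hypothetical stand-in for the one in the question (only the column name is taken from it; the values are illustrative).

```r
# Hypothetical stand-in for the asker's data frame; values are illustrative.
df <- data.frame(PRE.TAX.TOTAL. = factor(c("57.8", "210", "57.8")))

# as.numeric() on a factor returns the integer level codes.
# Levels are sorted as strings, so "210" is level 1 and "57.8" is level 2.
codes <- as.numeric(df$PRE.TAX.TOTAL.)

# Going through character first parses the displayed labels instead.
df$PRE.TAX.TOTAL. <- as.numeric(as.character(df$PRE.TAX.TOTAL.))
```

With thousands of distinct values in a column, the level codes can be large, which is consistent with results such as 3856 and 2159 in the question.
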
JPC

You could actually read the documentation. Typing ?factor in the console produces:

Warning

The interpretation of a factor depends on both the codes and the "levels" attribute. Be careful only to compare factors with the same set of levels (in the same order). In particular, as.numeric applied to a factor is meaningless, and may happen by implicit coercion. To transform a factor f to approximately its original numeric values, as.numeric(levels(f))[f] is recommended and slightly more efficient than as.numeric(as.character(f)).

Thus, the more proper way would probably be as.numeric(levels(f))[f].
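
A minimal sketch of that idiom, using an illustrative factor f that is not from the question: the levels are parsed once, and the factor is then used as an index, which R treats as its integer codes.

```r
# Illustrative factor; the values are made up for this sketch.
f <- factor(c("57.8", "210", "57.8", "82.9"))

# Parse each distinct level once, then expand back by indexing with the factor.
level_values <- as.numeric(levels(f))
x <- level_values[f]

# The character route gives the same result but parses every element.
y <- as.numeric(as.character(f))
```

Both x and y hold the original numbers; the levels-based form only converts the distinct levels, which is why the documentation calls it slightly more efficient.
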

David Arenburg