
column1 is a large column in a dataset. Suppose that column1 has the values 1, 2, 3, 4 (datatype factor). Then I define:

column2 <- as.numeric(column1)

column2 shows the values 2, 3, 4, 5 (each column1 value plus 1). Converting through character first gives the values I expect:


column3 <- as.numeric(as.character(column1))  # column3 now shows the correct values, i.e. 1, 2, 3, 4

Cettt
Raj

Check `levels(column1)`. The first level will be something other than 1. `as.numeric` returns the positions in the levels. – Roland Oct 18 '19 at 11:35

1 Answer


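The reason is that calling as.numeric directly on a factor does not return the values you see printed. It returns the internal integer codes, i.e. the position of each value in the factor's levels. Check this example: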

> x <- factor(0:3)
> x
[1] 0 1 2 3
Levels: 0 1 2 3
> as.numeric(x)
[1] 1 2 3 4
> as.character(x)
[1] "0" "1" "2" "3"
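
The +1 shift from the question can be reproduced the same way. This is only a sketch with made-up data: it assumes that column1 has a level (here "0") that sorts before "1", which the question does not confirm, so check `levels(column1)` on the real data.

```r
# Hypothetical data: the values 1..4, plus an extra level "0" that comes first
column1 <- factor(c(1, 2, 3, 4), levels = 0:4)

levels(column1)                                # "0" "1" "2" "3" "4"

# Direct conversion returns the position of each value in the levels
column2 <- as.numeric(column1)                 # 2 3 4 5

# Going through character parses the labels themselves
column3 <- as.numeric(as.character(column1))   # 1 2 3 4
```

Because "1" is the second level here, every value comes back one higher than its label.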

In order to properly convert x to numeric you can do either this:

as.numeric(as.character(x))

or any other possibility suggested here.
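
One alternative that is often recommended (I am assuming it is among the possibilities meant above) is to convert the levels once and then index them with the factor. A minimal sketch:

```r
x <- factor(0:3)

# Convert the level labels to numbers once, then pick them by the factor's codes
y <- as.numeric(levels(x))[x]
y   # 0 1 2 3
```

On a large column this only has to parse each distinct level once instead of every element, which may matter for the dataset in the question.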

In general, when creating data.frames I would suggest working without factors by setting stringsAsFactors = FALSE.
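
As a small illustration, with a made-up data.frame (the column name `id` is hypothetical): the strings stay character, so a direct as.numeric gives the expected values. Note that since R 4.0.0 this is already the default for data.frame() and read.csv(), so the argument mainly matters on older versions.

```r
# Hypothetical example data; stringsAsFactors = FALSE keeps the column as character
df <- data.frame(id = c("1", "2", "3", "4"), stringsAsFactors = FALSE)

class(df$id)        # "character"
as.numeric(df$id)   # 1 2 3 4
```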

Cettt