I want to convert the column Annual.income, stored as character in my data frame, to numeric. The code I use produces NA values, although the new class is numeric.

I have looked for answers on forums, but none of the questions addresses my problem: there are no NAs in the column Annual.income, only numbers. All the data is formatted to use "." instead of "," as the decimal separator. Here is the code I use:

data$Annual.income <- as.numeric(as.character(data$Annual.income))

******************************UPDATE********************************************

Here is the dput of the column Annual.income.

dput(data$Annual.income)
c("34 500", "51 400", "43 200", "40 100", "36 400", "39 100", 
"41 900", "48 700", "45 500", "45 500", "49 100", "35 100", "34 500", 
"29 200", "32 200", "36 300", "35 800", "31 500", "33 000", "34 600", 
"32 100", "32 000", "31 400", "33 200", "42 600", "29 200", "34 600", 
"29 200", "34 100", "30 600", "34 034", "33 600", "31 000", "35 500", 
"30 600", "30 600", "30 600", "30 800", "34 034", "33 200", "32 900"
)

The following still gives me NAs.

data$Annual.income <- as.numeric(data$Annual.income)

I imported the data using the Import Dataset command in the Environment pane, with stringsAsFactors unchecked, Heading = Yes, Separator = Semicolon, and Decimal = Period. Thanks ...

Lucas Snow
  • Are you sure it did not miss any `,` when converting to `.`? Also do you have any missing values where the `NA` indicating them is a string? – Sotos May 17 '19 at 13:16
  • Just to note `read.table` has `dec=` argument. – zx8754 May 17 '19 at 13:17
  • Please provide an example input file and the code used to import it. – zx8754 May 17 '19 at 13:18
  • ... or put the output of `dput(data$Annual.income)` in your question! Please read https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – jogo May 17 '19 at 13:47

1 Answer

The whitespace causes the problem here: as.numeric() cannot parse a string such as "34 500" and returns NA for it. Simply remove all whitespace characters with gsub() before converting, e.g.

Annual.income <- c("34 500", "51 400", "43 200", "40 100", "36 400", "39 100", 
  "41 900", "48 700", "45 500", "45 500", "49 100", "35 100", "34 500", 
  "29 200", "32 200", "36 300", "35 800", "31 500", "33 000", "34 600", 
  "32 100", "32 000", "31 400", "33 200", "42 600", "29 200", "34 600", 
  "29 200", "34 100", "30 600", "34 034", "33 600", "31 000", "35 500", 
  "30 600", "30 600", "30 600", "30 800", "34 034", "33 200", "32 900"
)

as.numeric(gsub("\\s", "", Annual.income))
#>  [1] 34500 51400 43200 40100 36400 39100 41900 48700 45500 45500 49100
#> [12] 35100 34500 29200 32200 36300 35800 31500 33000 34600 32100 32000
#> [23] 31400 33200 42600 29200 34600 29200 34100 30600 34034 33600 31000
#> [34] 35500 30600 30600 30600 30800 34034 33200 32900

Created on 2019-05-17 by the reprex package (v0.2.1)
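
To apply the same idea to the data frame column from the question, something along these lines should work. This is only a sketch: the small `data` frame is a hypothetical stand-in for the imported data, and `cleaned` and `bad` are illustrative names. It checks for values that still fail to convert before overwriting the column.

```r
# Hypothetical stand-in for the imported data frame from the question
data <- data.frame(
  Annual.income = c("34 500", "51 400", "34 034"),
  stringsAsFactors = FALSE
)

# Strip whitespace first, then convert to numeric
cleaned <- as.numeric(gsub("\\s", "", data$Annual.income))

# Entries that are NA after cleaning but were not NA before still have a problem
bad <- data$Annual.income[is.na(cleaned) & !is.na(data$Annual.income)]
print(bad)

# Overwrite the column only once every value converts cleanly
if (length(bad) == 0) {
  data$Annual.income <- cleaned
}
str(data)
```

If `bad` is not empty, printing it shows exactly which strings need further cleaning, which is easier to debug than a column of NAs.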

Daniel