Thank you in advance for your help!

I need to convert the x.1 column to numeric so that it holds double-precision floating-point numbers.

What I have done:

  1. I imported the data with:

    training <- read.csv("training_grover.csv", stringsAsFactors = FALSE, sep = ";")

  2. I inspected the structure with str(training). Result:

    'data.frame': 2671 obs. of 22 variables:
     $ X   : int 0 1 2 3 4 5 6 7 8 9 ...
     $ x.0 : chr "b" "a" "a" "b" ...
     $ x.1 : chr "30,83" "58,67" "24,5" "27,83" ...
     $ x.2 : chr "f" "4.46" "0.5" "1.54" ...
     $ x.3 : chr "u" "u" "u" "u" ...
     $ x.4 : chr "g" "g" "g" "g" ...
     $ x.5 : chr "w" "q" "q" "w" ...
     $ x.6 : chr "v" "h" "h" "v" ...
     $ x.7 : chr "1.25" "3.04" "1.5" "3.75" ...
     $ x.8 : chr "t" "t" "t" "t" ...
     $ x.9 : chr "t" "t" "f" "t" ...
     $ x.10: chr "t" "6" "f" "5" ...
     $ x.11: chr "f" "f" "f" "t" ...
     $ x.12: chr "g" "g" "g" "g" ...
     $ x.13: chr "202.0" "43.0" "280.0" "100.0" ...
     $ x.14: chr "f" "560" "824" "3" ...
     $ x.20: chr "t" "t" "t" "t" ...
     $ x.17: chr "116,94256980957068" "225,60625307204938" "92,08407670672422" "104,16291777029285" ...
     $ x.18: chr "0,5787085579422866" "25,409645364400404" "2,3173371593153314" "8,04533772976642" ...
     $ x.19: chr "202000.0" "43000.0" "280000.0" "100000.0" ...
     $ x.16: chr "f" "f" "f" "f" ...
     $ y   : chr "good" "good" "good" "good" ...

  3. I tried to convert the x.1 column to numeric:

    training$x.1=as.numeric(training$x.1)

As a result, x.1 was full of NAs.

Actions:

a. I imported the file again.

b. I replaced the "," with "." in x.1: str_replace_all(training$x.1, ",", ".")

c. I tried again to convert the x.1 column: training$x.1=as.numeric(training$x.1). As a result, x.1 was still full of NAs.

d. I imported the file again.

e. I replaced the "," with "." in x.1 again: str_replace_all(training$x.1, ",", ".")

f. I tried once more to convert the x.1 column: training$x.1= as.numeric(as.factor(training$x.1)). Result: the x.1 column is still full of NAs.

What am I doing wrong here? Thank you!

myhy

2 Answers

There are several ways to post-process the data after importing, but it is simpler to fix the first step and import the data correctly. Use dec = "," to specify the character that represents the decimal point, so that values such as "30,83" are read as numbers instead of strings.

training <- read.csv("training_grover.csv", stringsAsFactors = FALSE, sep = ";", dec = ",")

These settings (sep = ";" and dec = ",") are the defaults in read.csv2, so the call can be shortened to:

training <- read.csv2("training_grover.csv", stringsAsFactors = FALSE)
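
For anyone who wants to check this without the original file, here is a minimal sketch that reads a small inline sample through the `text` argument instead of a file path. The sample rows are hypothetical; they only mimic the layout shown in the question (";" between fields, "," as the decimal mark).

```r
# Hypothetical sample mimicking the file layout: ";" separates fields,
# "," marks decimals. These rows are NOT the real training_grover.csv.
csv_text <- "X;x.0;x.1\n0;b;30,83\n1;a;58,67\n2;a;24,5"

# read.csv with an explicit decimal mark
training <- read.csv(text = csv_text, sep = ";", dec = ",",
                     stringsAsFactors = FALSE)

# read.csv2 already defaults to sep = ";" and dec = ","
training2 <- read.csv2(text = csv_text, stringsAsFactors = FALSE)

# x.1 should now be a numeric column in both data frames
str(training$x.1)
str(training2$x.1)
```

If x.1 still comes back as chr on the real file, some values in that column are probably not valid numbers even with "," as the decimal mark.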
Ronak Shah

Maybe you can try the code below for type conversion after importing:

training$x.1 <- as.numeric(gsub(",","\\.",training$x.1))
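
One likely reason the attempts in the question kept producing NAs is that the result of str_replace_all was never assigned back, so the column still contained "," when as.numeric ran. The sketch below illustrates this with base R only, on a hypothetical vector `x` built from the values visible in the question's str() output. It uses `sub` with `fixed = TRUE` as one alternative to the regex form above.

```r
# Hypothetical values copied from the str() output in the question
x <- c("30,83", "58,67", "24,5", "27,83")

# A replacement call whose result is not stored leaves x unchanged
sub(",", ".", x, fixed = TRUE)

# x still contains ",", so the conversion yields NA for every element
still_na <- suppressWarnings(as.numeric(x))

# Store the replaced strings (here by nesting the call), then convert
x_num <- as.numeric(sub(",", ".", x, fixed = TRUE))
print(x_num)
```

With `fixed = TRUE` the pattern is matched literally, so no escaping is needed. `sub` is enough here only because each value is assumed to contain a single comma.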
ThomasIsCoding