
I'm trying to convert 29 columns in a data frame 'df' to numeric variables. Each column currently contains strings that all look like this:

    two            three
    "-2.5346346"  "-4.2342342"
    "-3.645735"   "-2.23434542"
    "-4.235234"   "-1.23422" 



    as.character(two) 

works fine.

    as.numeric(as.character(two)) 

does not: as.numeric() returns NA for every observation, not just for certain ones.

As far as I can tell, there are no extraneous commas, letters, etc. I can't think what could be causing the problem and have run out of ideas. If it's at all relevant, I constructed the columns from vectors of strings (e.g. c("-3.23423", "-2.34532")), where each string became a new column, and now I'm wondering whether the 'str_extract_all' function I used for that does something I'm not aware of. Thank you.

Edited to include sample data.

    head(df)

         one               two             three              four         five       six
       1 c("-3.19474987"   "-3.9386188"   "-5.3585024"   "-7.3370402"  "-4.65656894"  "-5.37296894"
       2 c("-3.86805776"  "-2.57038981"  "-4.88910112"  "-3.82336021"  "-1.51641245"  "-4.19533412"
       3 c("-4.64324462"  "-3.51131105"  "-5.81064472"  "-6.63382723"  "-4.47048461"  "-7.08557932"
       4 c("-4.88484732"  "-3.48084998"  "-4.97011221"  "-5.36993391"  "-3.14765309"  "-4.60799153"
       5 c("-4.99299683"  "-3.26320573"   "-4.5861881"   "-5.3340004"  "-2.14507341"  "-3.30230272"
       6 c("-5.15376815"  "-4.08624463"  "-6.50014523"  "-5.49561174"  "-4.14988788"  "-6.57583067"
MeC
  • What is your data.frame's name? Are you calling `as.numeric(as.character(DATAFRAMENAME$V1))`? Or do you have a separate vector `V1` that is causing confusion, so that you aren't accessing the column in the data.frame at all? (Note that you should provide a reproducible example, https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example, so I don't have to check whether something simple like this is causing the problem.) – Sarah Sep 25 '18 at 22:47
  • No, I'm calling a column in a dataframe. – MeC Sep 25 '18 at 23:50
  • Can you please provide a reproducible example? I can run `as.numeric(as.character("-2.5346346"))` just fine to get output of `[1] -2.534635`, so there's something about your setup that is unusual. Can you make a subset of your data that's not working and use `dput(your_subset_of_data)` so we can see in what form it's stored? – Jon Spring Sep 26 '18 at 00:01
  • edited to include data. – MeC Sep 26 '18 at 00:40
  • I figured it out. I'm not sure why it's necessary, since these are already strings, but stripping the punctuation takes care of it: `str_extract_all(df$two, "[-0-9\\.]+")` – MeC Sep 26 '18 at 00:44
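
For anyone landing here later, this is a minimal sketch of why the fix in the last comment probably works. It assumes, as the `head(df)` output suggests, that each cell contains literal quote characters as part of the string, so `as.numeric()` cannot parse any of the values. The numbers are copied from the question, but the way the data frame is built and the helper `to_number` are hypothetical, and base R `gsub` stands in for `str_extract_all` so the example has no package dependencies.

```r
# Hypothetical reconstruction: every value carries literal double quotes
df <- data.frame(
  two   = c('"-3.9386188"', '"-2.57038981"', '"-3.51131105"'),
  three = c('"-5.3585024"', '"-4.88910112"', '"-5.81064472"'),
  stringsAsFactors = FALSE
)

# dput() shows the escaped quotes that print() makes easy to overlook
dput(df$two[1])

# The embedded quotes make every value unparseable, so this is all NA
suppressWarnings(as.numeric(df$two))

# Hypothetical helper: drop everything except digits, "." and "-",
# then convert what is left to numeric
to_number <- function(x) {
  as.numeric(gsub("[^0-9.-]", "", as.character(x)))
}

# Apply to every column; df[] keeps the data.frame structure
df[] <- lapply(df, to_number)
str(df)
```

With 29 columns, the same `lapply` call can be restricted to a subset, e.g. `df[cols] <- lapply(df[cols], to_number)` where `cols` is a vector of column names. Note that this pattern would also strip the `e` from values in scientific notation, so it only suits plain signed decimals like the ones shown in the question.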

0 Answers