I am using a work machine that runs Windows 7, with R version 3.5.1 (2018-07-02). This is my first post to Stack Exchange and I am not an experienced programmer.
I have a .csv file that has many columns, so I am trying to read in only a few specific columns. I run into trouble when I try to read in some of the columns as numeric.
I have a work-around (specify all of the columns as character, and then convert the ones I need to numeric later), but I am very curious why my first way doesn't work.
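For reference, here is a minimal sketch of that work-around on a made-up four-column sample. The sample rows, the column names and the `csv_text` variable are invented for illustration and are not taken from the real file; only the pattern (read as character, convert afterwards) is what I actually do.

```r
# Made-up sample in which every field is quoted, as in the real file.
csv_text <- '"id","name","state","rate"
"010001","HOSPITAL A","AL","14.3"
"010002","HOSPITAL B","AK","12.9"'

# Step 1: skip unwanted columns with "NULL" and read the rest as character.
col_to_read <- rep("NULL", 4)
col_to_read[c(2, 3, 4)] <- "character"
data <- read.csv(text = csv_text, colClasses = col_to_read)

# Step 2: convert the columns that need another type after reading.
data$state <- factor(data$state)
data$rate <- as.numeric(data$rate)

str(data)
```

On the real file the same pattern would use rep("NULL", 46) with columns 2, 5, 11, 17 and 23 set to "character", followed by the same conversions.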
If I use my original code
col_to_read<-rep("NULL",46)
col_to_read[c(11,17,23)]<-"numeric"
col_to_read[2]<-"character"
col_to_read[5]<-"factor"
data<-read.csv("outcome-of-care-measures.csv",colClasses=col_to_read)
I get
Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec, :
scan() expected 'a real', got '"14.3"'
I have looked for similar questions on Stack Exchange and Google, but the proposed solutions didn't work for me. This may be because my error is slightly different from the others. Usually they report something like
scan() expected 'a real', got '14.3'
So in those cases the number doesn't have the additional set of quotes.
There are many columns in this data set, and the column names are very long, so it's hard to post what the data looks like in Notepad, but the first row goes something like this
"010001","SOUTHEAST ALABAMA MEDICAL CENTER","1108 ROSS CLARK CIRCLE","","","DOTHAN","AL","36301","HOUSTON","3347938701","14.3",
This isn't the full row of data; I stopped at the 14.3, which is the first column I want to specify as numeric.
I have tried a number of read.csv and read.table permutations, one of which includes setting dec=",", but I just get the same error. I do not live in a locale where commas are used for decimals. If I do not specify anything for colClasses, the fields I want to be numeric are read as factor by default.
The output of sessionInfo() is
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] swirl_2.4.3
loaded via a namespace (and not attached):
[1] httr_1.3.1 compiler_3.5.1 magrittr_1.5 R6_2.2.2 tools_3.5.1 RCurl_1.95-4.11
[7] yaml_2.2.0 stringi_1.1.7 stringr_1.3.1 digest_0.6.17 testthat_2.0.0 rlang_0.2.2
[13] bitops_1.0-6