I am attempting to read data from the National Health Interview Survey in R: http://www.cdc.gov/nchs/nhis/nhis_2011_data_release.htm . The data is Sample Adult. The SAScii
library actually has a function read.SAScii
whose documentation has an example for the same data set I would like to use. The issue is it "doesn't work":
NHIS.11.samadult.SAS.read.in.instructions <-
"ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Program_Code/NHIS/2011/SAMADULT.sas"
NHIS.11.samadult.file.location <-
"ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Datasets/NHIS/2011/samadult.zip"
#store the NHIS file as an R data frame!
NHIS.11.samadult.df <-
read.SAScii (
NHIS.11.samadult.file.location ,
NHIS.11.samadult.SAS.read.in.instructions ,
zipped = T, )
#or store the NHIS SAS import instructions for use in a
#read.fwf function call outside of the read.SAScii function
NHIS.11.samadult.sas <- parse.SAScii( NHIS.11.samadult.SAS.read.in.instructions )
#save the data frame now for instantaneous loading later
save( NHIS.11.samadult.df , file = "NHIS.11.samadult.data.rda" )
However, when running it I get the error Error in toupper(SASinput) : invalid multibyte string 533
.
Others on Stack Overflow with a similar error, but for functions such as read.delim
and read.csv
, have recommended to try changing the argument to fileEncoding="latin1"
for example. The problem with read.SAScii
is it has no such parameter fileEncoding
.
See: R: invalid multibyte string and Invalid multibyte string in read.csv