I am trying to read several SPSS files that contain Cyrillic text into R. When I read most of them, the console says "re-encoding from CP1251". However, for some of the files, which also contain Cyrillic text, it says "re-encoding from CP1252", which I believe is a Latin (Western European) code page. The CP1251 files read into R with no problem, but the text in the CP1252 files becomes gibberish. I've tried the foreign, haven and Hmisc packages for reading the SPSS files, and none of them has worked. I've also tried including reencode='utf-8'; when I do this, all of the Cyrillic text becomes NA. The problem occurs whether I'm working in R or RStudio.
library(foreign)
x1 <- read.spss("cp1251_file.sav", to.data.frame = TRUE)  # CP1251 file reads in fine
x2 <- read.spss("cp1252_file.sav", to.data.frame = TRUE)  # CP1252 file becomes gibberish
x2 <- read.spss("cp1252_file.sav", to.data.frame = TRUE, reencode = "utf-8")  # Cyrillic text in CP1252 file becomes NA
Thanks for your help.