
I am working with some special characters in RStudio. It converts them into plain letters.

print("Safarzyńska2013")
[1] "Safarzynska2013"

x <- "Māori"
x
[1] "Maori"

Is there any way to read in the exact original characters? The following info might be helpful: the RStudio default encoding is UTF-8.

sessionInfo()
R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
[1] tools_3.1.1
M.Qasim
  • Maybe this link helps: `http://stackoverflow.com/questions/23324872/rstudio-not-picking-the-encoding-im-telling-it-to-use-when-reading-a-file` – akrun Sep 08 '14 at 10:48
  • Thanks akrun. I have seen this page. It did not work for me. – M.Qasim Sep 08 '14 at 12:10

2 Answers


This is not exclusively an RStudio problem.

Typing print("Safarzyńska2013") on the console of RGui also converts the special characters to plain letters. Running the same code from a UTF-8 encoded script in RGui returns [1] "Safarzy?ska2013".

I don't think it is a good idea to type such special characters directly on the console. x <- "SomeString"; Encoding(x) returns "unknown", and that is probably the problem: R has no idea what encoding you are using on the console and probably has no way to recover your original encoding.
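As a rough way to see what R knows about a string and about the session, the following sketch uses only base R. Note that a pure ASCII string is always reported as "unknown", so the interesting cases are strings with non-ASCII characters; the variable name `ascii` is just for illustration.

```r
# A pure ASCII string is never marked with a declared encoding.
ascii <- "SomeString"
Encoding(ascii)            # "unknown"

# What the current session assumes about character data.
info <- l10n_info()        # list with MBCS, UTF-8 and Latin-1 flags
info[["UTF-8"]]            # TRUE only if the locale is a UTF-8 locale
Sys.getlocale("LC_CTYPE")  # locale used for character handling
```

In a locale such as English_United States.1252, the UTF-8 flag would be expected to be FALSE, which fits the behaviour described in the question.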

I put "Safarzyńska2013\nMāori\n" in a text file encoded as UTF-8. Then the following works fine:

tbl <- read.table('c:/test1.txt', encoding = 'UTF-8', stringsAsFactors = FALSE)
tbl[1,1]
tbl[2,1]
Encoding(tbl[1,1])  # returns "UTF-8"
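To reproduce this without creating the file by hand, a minimal sketch could write the two lines as raw UTF-8 bytes and read them back with readLines. The temporary path is a stand-in for c:/test1.txt, and the \u escapes are an assumption used only so that the script itself stays pure ASCII.

```r
# Strings built from \u escapes, so this script can stay pure ASCII.
original <- c("Safarzy\u0144ska2013", "M\u0101ori")

# Write the raw UTF-8 bytes without re-encoding to the native locale.
path <- tempfile(fileext = ".txt")   # hypothetical stand-in for c:/test1.txt
writeLines(enc2utf8(original), path, useBytes = TRUE)

# Read the lines back and mark them as UTF-8.
lines <- readLines(path, encoding = "UTF-8")
Encoding(lines)
```

How the strings are displayed afterwards still depends on the console and the locale, but the code points themselves are preserved.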

If you really want to use the console, you will probably have to escape the special characters. In ?Encoding we find the following example, which creates a word containing a special character:

x <- "fa\xE7ile"
Encoding(x)

Actually, I don't know at the moment how to get these codes for your special characters, and ?Encoding gives no hints either...
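One possible way to get such codes, sketched here as a suggestion rather than as part of the original answer, is to use Unicode \u escapes, which R accepts in string literals. The code points U+0144 and U+0101 are the Unicode values for ń and ā; the helper `to_escapes` is hypothetical and not part of base R.

```r
# Unicode escapes: \uXXXX takes the code point in hexadecimal.
x <- "Safarzy\u0144ska2013"   # U+0144 is "n with acute"
y <- "M\u0101ori"             # U+0101 is "a with macron"

Encoding(c(x, y))             # both strings are marked "UTF-8"

# Hypothetical helper (not part of base R): build the \u escape for each
# character of a string, to look up the codes of other characters.
to_escapes <- function(s) sprintf("\\u%04X", utf8ToInt(enc2utf8(s)))
to_escapes(y)
```

The helper only works if the input string already contains the correct characters, for example after reading it from a UTF-8 file. Whether the console then displays the characters correctly still depends on the locale and the R version.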

Patrick Roocks

In RStudio, open the File menu, click Save with Encoding..., choose UTF-8, tick "Set as default encoding for source files", and save.

Hope this helps