Is there a knitr option to force UTF-8 encoding in included R files?

Question

I am using Windows 7, R 2.15.3 and RStudio 0.97.320 with knitr 1.1.6 (downloaded after Yihui fixed the 'Encoding: knitr and child files' issue on March 12).

> sessionInfo()
R version 2.15.3 (2013-03-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Spanish_Argentina.1252  LC_CTYPE=Spanish_Argentina.1252    LC_MONETARY=Spanish_Argentina.1252
[4] LC_NUMERIC=C                       LC_TIME=Spanish_Argentina.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lattice_0.20-13    pixmap_0.4-11      RColorBrewer_1.0-5 ade4_1.5-1         pander_0.3.1      
[6] xtable_1.7-1      

loaded via a namespace (and not attached):
[1] digest_0.6.3   evaluate_0.4.3 formatR_0.7    grid_2.15.3    knitr_1.1.6    stringr_0.6.2  tools_2.15.3

I have my R code in a file like this one:

## @knitr RunMyCode 
print('Called from .R file: á é í ó ú ñ')  

# Workaround
my.text <- 'á é í ó ú ñ'
Encoding(my.text) <- "UTF-8"
print(my.text)

I call it from a Rmd file such as this:

Title
========================================================
Spanish text: á é í ó ú ñ

Use it from .Rmd code: it comes out right...
```{r}
print('á é í ó ú ñ')
```

```{r ReadFunctions, cache=FALSE, echo=TRUE, eval=TRUE}
read_chunk('TestSpanishText.R')
```

Spanish text comes out garbled here:
```{r RunMyCode, echo=TRUE, eval=TRUE, cache=TRUE, dependson=c('ReadFunctions')}
```

My problem is with Spanish characters typed in the .R file (which is UTF-8 encoded in RStudio). These characters are OK if typed in Rmd files (both parent and child files work well), but not in R files. As you can see below, Encoding() does provide a workaround, but I wonder whether there is another way, such as a global option. If I use Encoding(), I get the inverse problem in the RStudio console...

Title

Spanish text: á é í ó ú ñ

Use it from .Rmd code: it comes out right...

print("á é í ó ú ñ")
## [1] "á é í ó ú ñ"
read_chunk("TestSpanishText.R")

Spanish text comes out garbled here:    
print("Called from .R file: Ã¡ Ã© Ã Ã³ Ãº Ã±")
## [1] "Called from .R file: Ã¡ Ã© Ã Ã³ Ãº Ã±"

# Workaround
my.text <- "Ã¡ Ã© Ã Ã³ Ãº Ã±"
Encoding(my.text) <- "UTF-8"
print(my.text)
## [1] "á é í ó ú ñ"
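
For reference, the garbling above is consistent with UTF-8 bytes being interpreted one byte at a time as a single-byte encoding. The following is only a minimal base R sketch of that effect and of what the `Encoding()` workaround does; it uses the single character "ñ" rather than the full string, and it is not taken from knitr itself:

```r
# "ñ" is stored as the two bytes C3 B1 in UTF-8.
s <- rawToChar(as.raw(c(0xC3, 0xB1)))

# Misreading: declare the bytes as latin1, then convert to UTF-8.
# Each byte is treated as its own character.
wrong <- s
Encoding(wrong) <- "latin1"
wrong_utf8 <- enc2utf8(wrong)
utf8ToInt(wrong_utf8)   # two code points: 195 (U+00C3) and 177 (U+00B1)

# Workaround from the question: declare the same bytes as UTF-8.
right <- s
Encoding(right) <- "UTF-8"
utf8ToInt(right)        # one code point: 241 (U+00F1, "ñ")

# Encoding<- only changes the declared encoding; the bytes are untouched.
identical(charToRaw(wrong), charToRaw(right))
```

The two code points in the misread case correspond to the "Ã±" seen in the garbled output.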

Thank you!

score 5 · Accepted Answer · answered Mar 30 '13 at 03:24


Ideally I should have an encoding argument in read_chunk(), but since you were using UTF-8, this probably works:

read_chunk(lines = readLines("TestSpanishText.R", encoding = "UTF-8"))

Please try this first. If it does not work, I will add an encoding argument. Anyway, I am sure the following works (it is just slightly longer):

con = file("TestSpanishText.R", encoding = "UTF-8")
read_chunk(con)
close(con)
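
The difference between the two approaches can be checked without knitr. The sketch below is a base R illustration only: the temporary file is a hypothetical stand-in for `TestSpanishText.R`, and the accented characters are written as `\u` escapes so that the script itself does not depend on the encoding of the file it is saved in.

```r
# Hypothetical stand-in for TestSpanishText.R: write the script as UTF-8
# bytes to a temporary file, independent of the current locale.
script <- "print('\u00e1 \u00e9 \u00f1')"   # á é ñ written as escapes
path <- tempfile(fileext = ".R")
writeBin(charToRaw(enc2utf8(paste0(script, "\n"))), path)

# Approach 1: readLines() keeps the bytes and marks the strings as UTF-8.
lines1 <- readLines(path, encoding = "UTF-8")
Encoding(lines1)                            # "UTF-8"

# Approach 2: the connection re-encodes from UTF-8 to the native encoding
# while reading, so the characters must be representable in that encoding.
con <- file(path, encoding = "UTF-8")
lines2 <- readLines(con)
close(con)

# Either vector could then be handed to knitr, e.g. read_chunk(lines = lines1).
unlink(path)
```

In the first approach the strings are only marked as UTF-8, whereas in the second they are converted on input, which matters when the native encoding cannot represent some of the characters.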


Yihui Xie


I tried this approach (`con = file("TestSpanishText.R", encoding = "UTF-8"); read_chunk(con);close(con)`) for my problem: https://stackoverflow.com/questions/48307007/printing-utf-8-russian-characters-in-r-rmd-knitr, but could not make it work. ??? – IVIM Jan 19 '18 at 15:19
