What?
An .Rmd file renders without errors via knitr (or rmarkdown) on Linux. All related material (i.e. child R scripts and CSV input data) is encoded in UTF-8.
Rendering the same document on Windows (from a clone of the same git repository) does not render all characters cleanly, since the system encoding there is Windows-1252.
Examples
For example, the string "sans réserves", read from a CSV file into a data.frame column, is typeset as "sans rÃ©serves". To read it correctly, it suffices to add encoding='UTF-8' to the read.csv call that reads in the data.
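A minimal sketch of the CSV case, assuming a throwaway file with a single hypothetical column named label (neither the file nor the column name comes from the real project). It also shows fileEncoding, a different read.csv argument that converts the input while reading instead of only marking the strings; whether that suits a given setup is an assumption to verify.

```r
# Write a small UTF-8 encoded CSV to a temporary file (hypothetical data).
tmp <- tempfile(fileext = ".csv")
writeLines(enc2utf8(c("label", "sans r\u00e9serves")), tmp, useBytes = TRUE)

# encoding = "UTF-8" does not convert anything: it declares that the bytes
# read from the file are UTF-8 and marks the resulting strings accordingly.
df <- read.csv(tmp, encoding = "UTF-8", stringsAsFactors = FALSE)
Encoding(df$label)

# Alternative: fileEncoding = "UTF-8" re-encodes the input from UTF-8
# to the native encoding of the session while the file is being read.
df2 <- read.csv(tmp, fileEncoding = "UTF-8", stringsAsFactors = FALSE)

print(df$label)
print(df2$label)
```

Without either argument, a Windows-1252 session treats the two UTF-8 bytes of "é" as two separate characters, which is where the garbled output comes from.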
Another example concerns a string literal inside the R code itself, "Trésorier Général". It is typeset as "TrÃ©sorier GÃ©nÃ©ral". Fortunately, the following advice
read_chunk(lines = readLines("TestSpanishText.R", encoding = "UTF-8"))
taken from https://stackoverflow.com/a/15714617/1172302, works and the string is rendered as expected.
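To see what that call relies on, here is a small sketch that assumes a hypothetical child script with a single chunk labelled titles, written to a temporary file. readLines does not convert the file; encoding = "UTF-8" only marks the non-ASCII lines as UTF-8, and those marked lines are what knitr receives.

```r
# Hypothetical child script saved as UTF-8, standing in for the real one.
script <- tempfile(fileext = ".R")
writeLines(
  enc2utf8(c("## ---- titles", "x <- \"Tr\u00e9sorier G\u00e9n\u00e9ral\"")),
  script,
  useBytes = TRUE
)

# Read the raw lines and mark them as UTF-8; no re-encoding takes place.
code <- readLines(script, encoding = "UTF-8")
Encoding(code)

# Hand the already-marked lines to knitr (skipped if knitr is not installed).
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::read_chunk(lines = code)
}
```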
Related
[Update] There are some related Q&As, but they are more than 2-3 years old. Also, this page, https://support.rstudio.com/hc/en-us/articles/200532197-Character-Encoding, describes this very issue.
Questions
Is there another, easier way to overcome this UTF-8 issue on Windows from inside R? Any recommendations on how to approach such a problem? I am trying to follow a "one source for all" principle.
PS: An interesting read: https://superuser.com/a/221602/128768