
Following the solution provided here, I'm trying to implement a script that replaces accented characters. Executing my code from the console works fine; however, when I execute this code:

x <- list('ÿ'='y')

from an RStudio script, R returns the following error:

    source('~/R/DrivingDataAnalysis/R/WebScraper/VoltstatsScraper.R', encoding = 'UTF-8')
    Error in source("~/TestScript.R", :
      ~/TestScript.R:1:6: unexpected INCOMPLETE_STRING
    1: list('
            ^

For other accented characters, like those in the link, R parses the script without error. How can I make this work in a script? I'm using R 3.3.1 and RStudio 0.99.896 on Windows 7.

David Go
  • I can't replicate this problem on RStudio 0.99.1266 (preview). Looks like UTF-8 encoding was introduced in 0.99.903 (see https://support.rstudio.com/hc/en-us/articles/200532197-Character-Encoding) so you may just need to update your version of RStudio – Phil Jul 29 '16 at 21:57
  • Thanks for your help. Updating RStudio to 0.99.903 did resolve the issue when setting the encoding to ISO-8859-1. However, for the ordinary UTF-8 encoding the problem remains. Have you tried setting your encoding to "UTF-8" to reproduce the error? Setting the encoding to, e.g., WINDOWS-1252 also generates this error. – David Go Jul 29 '16 at 22:58
  • Non-ASCII characters + R + Windows == you're gonna have a bad time. – Kevin Ushey Jul 30 '16 at 05:12
  • Yep, pretty much looks like it :-). As this is the only failing character in a larger list, it is not the end of the world. Thanks anyway. – David Go Jul 30 '16 at 09:11
  • Although Kevin is right, you generally should set your encoding to UTF-8, and there’s pretty much never a reason to use a legacy encoding such as Windows-foo/ISO-bar. To work around R bugs with Unicode on Windows, use [`eval(parse(…, encoding = 'UTF-8'))` instead of `source(…)`](http://stackoverflow.com/q/24454559/1968). – Konrad Rudolph Aug 02 '16 at 15:19

1 Answer


Converting my comment to an answer because I think it will fix your problem:

  • Ensure that your script is encoded in UTF-8. Everything else is madness.¹
  • Instead of using source, which fundamentally doesn’t work, use

    eval(parse('your/file.r', encoding = 'UTF-8'))
    

    This works as a substitute for source. As the linked answer explains, the reason is that source tries (and fails) to convert the UTF-8-encoded characters, whereas parse doesn’t perform any transformation; instead, it loads the string verbatim from the file and simply marks it as UTF-8 encoded.

Unfortunately, you’ll have to do this manually, assuming that RStudio’s “source” button uses source rather than the above workaround internally.
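To avoid retyping the workaround, it could be wrapped in a small helper. The following is only a sketch: `source_utf8` is a hypothetical name, not a base R or RStudio function, and it does not reproduce the extra features of source (such as `echo` or `chdir`). It relies only on `parse(file, encoding = )` and `eval()` from base R.

```r
# Hypothetical helper (not part of base R): evaluate a UTF-8 encoded script
# without going through source().
source_utf8 <- function(path, envir = globalenv()) {
  # parse() reads the file as-is and marks its strings as UTF-8 encoded
  exprs <- parse(path, encoding = "UTF-8")
  # Evaluate all top-level expressions in order in the chosen environment
  eval(exprs, envir = envir)
  # Return nothing visible, since the script is run for its side effects
  invisible(NULL)
}
```

It would then be called as `source_utf8("~/TestScript.R")` from the console in place of the “source” button.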


¹ Other Unicode transformations might make sense but are, as far as I know, not at all supported by R.

Konrad Rudolph