
RStudio Version 1.0.136

R Version 3.3.2

It's strange that when I run code (which contains Chinese comments) line by line in a .Rmd file with R Markdown, the console prints the following warning:

Warning message:
In strsplit(code, "\n", fixed = TRUE) :
   input string 1 is invalid in this locale

It's annoying because it appears for every line. I have changed the default text encoding in RStudio's settings, but neither UTF-8 nor GB2312 prevents the warning message from appearing. Note that it appears only when I run code line by line; if I select a chunk and press the button to produce an HTML file, the warning doesn't appear. My code is as follows:

```{r}
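da=read.table("m-intcsp7309.txt",header=T)
head(da)
# date intel sp三列 (three columns: date, intel, sp)
length(da$date)
# 444数据 (444 data points)
intc=log(da$intc+1)
# 测试 (test)
plot(cars)
# 测试警告信息 (test the warning message)
plot(cars)
# 为什么会出现警告? (why does the warning appear?)
plot(cars)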
```

I have tested this, and it is not caused by the Chinese comments: I hit the same warning just now with English-only code. Here is more information:

Sys.getlocale()
[1] "LC_COLLATE=Chinese (Simplified)_People's Republic of China.936;
     LC_CTYPE=Chinese (Simplified)_People's Republic of China.936;
     LC_MONETARY=Chinese (Simplified)_People's Republic of China.936;
     LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic of China.936"
Martin Schmelzer
lemmingxuan

4 Answers


I had a similar issue with gsub() and was able to resolve it, without changing the locale, simply by setting useBytes = TRUE. The same should work in strsplit(). From the documentation:

If TRUE the matching is done byte-by-byte rather than character-by-character, and inputs with marked encodings are not converted.
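As a minimal sketch of what this looks like for `strsplit()`: the string below is built from raw bytes (the GBK/CP936 bytes for "测试", a newline, then "ok") purely for illustration, and the names `x` and `pieces` are made up. Note that this only applies to `strsplit()` calls you write yourself; the call in the question's warning appears to be made internally, where the arguments cannot be changed.

```r
# Bytes for "测试" in GBK/CP936, a newline, then ASCII "ok". Built from raw
# bytes so the example does not depend on the encoding of this file.
x <- rawToChar(as.raw(c(0xb2, 0xe2, 0xca, 0xd4, 0x0a, 0x6f, 0x6b)))

# With useBytes = TRUE the split is done byte-by-byte, so the input is not
# checked against (or converted to) the current locale's encoding.
pieces <- strsplit(x, "\n", fixed = TRUE, useBytes = TRUE)[[1]]

length(pieces)        # two pieces: the non-ASCII part and "ok"
charToRaw(pieces[1])  # the four original bytes, unchanged
```

The trade-off is that the result is handled as raw bytes, so anything that later depends on characters rather than bytes (for example `nchar()` with the default `type`) may need the encoding sorted out separately.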

crsh

Embed this directly in the Rmarkdown script that contains the Chinese character comment(s):

Sys.setlocale('LC_ALL','C')

If you just run it in the R console before running the rmarkdown script, that may temporarily change the setting and work, but as you said, it won't stay that way if you restart R. That's why it's better to directly embed that line into the script(s) that need it.
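If changing every locale category for the rest of the session feels too broad, one possible variation (an assumption on my part, not something the answer above prescribes) is to switch only `LC_CTYPE` and restore it afterwards. The variable names `old_ctype` and `during` are illustrative.

```r
# Remember the current character-type locale so it can be restored later.
old_ctype <- Sys.getlocale("LC_CTYPE")

# Switch to the "C" locale for the code that triggers the warning.
# The answer above uses "LC_ALL"; "LC_CTYPE" is a narrower choice.
Sys.setlocale("LC_CTYPE", "C")
during <- Sys.getlocale("LC_CTYPE")

# ... run the affected code here ...

# Put the original setting back.
Sys.setlocale("LC_CTYPE", old_ctype)
```

Keep in mind that while the "C" locale is active, non-ASCII text may be displayed as escapes rather than readable characters.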

www

Setting useBytes = TRUE in gsub() seems to work best. For example: gsub('pattern text', 'replacement text', x, useBytes = TRUE), where x is the character vector being modified.
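A small runnable version of that call, as a sketch: the input is again built from raw bytes (GBK/CP936 bytes for "测试" followed by " old") so it does not depend on file encoding, and `fixed = TRUE` is my own addition to keep the pattern a literal string.

```r
# GBK/CP936 bytes for "测试" followed by the ASCII text " old".
bytes <- c(as.raw(c(0xb2, 0xe2, 0xca, 0xd4)), charToRaw(" old"))
x <- rawToChar(bytes)

# Replace the ASCII part byte-by-byte; the non-ASCII bytes pass through as-is.
res <- gsub("old", "new", x, fixed = TRUE, useBytes = TRUE)

charToRaw(res)  # same four leading bytes, then the bytes of " new"
```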


If you get this warning while, or just after, building your vignettes during check() of your package, it is probably linked to this issue: https://github.com/r-lib/rcmdcheck/issues/140
Updating {processx} and {rcmdcheck} should help.

Sébastien Rochette