Warning: Input string not available in this locale

Question

RStudio version 1.0.136

R version 3.3.2

It's strange: when I run code line by line in an .Rmd file in RStudio (the code contains Chinese comments), the console prints the following warning:

Warning message:
In strsplit(code, "\n", fixed = TRUE) :
   input string 1 is invalid in this locale

This is annoying because the warning appears after every line. I have changed the default text encoding in RStudio's settings, but neither UTF-8 nor GB2312 prevents the warning message. Note that it only appears when I run the code line by line; if I select a chunk and press the button to produce an HTML file, the warning does not appear. My code is as follows:

```{r}
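da=read.table("m-intcsp7309.txt",header=T)
head(da)
# date intel sp三列 (three columns: date, intel, sp)
length(da$date)
# 444数据 (444 observations)
intc=log(da$intc+1)
# 测试 (test)
plot(cars)
# 测试警告信息 (testing the warning message)
plot(cars)
# 为什么会出现警告？ (why does the warning appear?)
plot(cars)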
```

I have tested this, and the warning does not come from the Chinese comments: I just got it with English-only code as well. Here is more information:

Sys.getlocale()
[1] "LC_COLLATE=Chinese (Simplified)_People's Republic of China.936;
     LC_CTYPE=Chinese (Simplified)_People's Republic of China.936;
     LC_MONETARY=Chinese (Simplified)_People's Republic of China.936;
     LC_NUMERIC=C;LC_TIME=Chinese (Simplified)_People's Republic of China.936"

@Martin It seems to work, thank you sincerely for your answer. But why does it happen? I have searched the Internet, and nobody else seems to have this problem. – lemmingxuan, Jan 18 '17 at 12:39
It seems that the text file you are reading contains a character that is not available in your original Chinese locale. I am not sure how the file was encoded. The warning is then repeated for every executed line, even though the problem itself did not occur again. – Martin Schmelzer, Jan 18 '17 at 13:14
@Martin Sadly, every time I restart RStudio the warning may occur again, because the locale setting has been reset and I need to call setlocale again. – lemmingxuan, Jan 19 '17 at 02:37

score 26 · Answer 1 · answered Aug 07 '18 at 10:25


I had a similar issue with `gsub()` and was able to resolve it, without changing the locale, simply by setting `useBytes = TRUE`. The same should work in `strsplit()`. From the documentation:

If TRUE the matching is done byte-by-byte rather than character-by-character, and inputs with marked encodings are not converted.
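
As a minimal sketch of the idea above: the input string `x` below is a hypothetical example (not from the question) containing a byte that is not valid in a UTF-8 locale. With `useBytes = TRUE`, `strsplit()` and `gsub()` match on raw bytes and do not check the input against the current locale. Note that in the question the `strsplit(code, ...)` call does not appear in the user's own code, so this only helps for calls you write yourself.

```r
# Hypothetical input: 0xE9 is "é" in Latin-1, but on its own it is not
# a valid byte sequence in a UTF-8 locale.
x <- "caf\xe9 au lait"

# Byte-wise matching: the input is not validated against the locale.
parts <- strsplit(x, " ", fixed = TRUE, useBytes = TRUE)[[1]]
cleaned <- gsub("au", "with", x, fixed = TRUE, useBytes = TRUE)

length(parts)  # number of space-separated pieces
```

The trade-off is that patterns are compared byte by byte, so this is safest with ASCII patterns such as `" "` or `"\n"`.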

answered Aug 07 '18 at 10:25 by crsh


I think the `useBytes = TRUE` option is the best way to solve the issue – Alexander Leow Oct 16 '18 at 08:16

score 14 · Answer 2 · answered Sep 24 '17 at 16:53


Embed this directly in the Rmarkdown script that contains the Chinese character comment(s):

Sys.setlocale('LC_ALL','C')

If you just run it in the R console before running the rmarkdown script, that may temporarily change the setting and work, but as you said, it won't stay that way if you restart R. That's why it's better to directly embed that line into the script(s) that need it.
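
A hedged sketch of how this could look at the top of a script or in a setup chunk. It uses the narrower `LC_CTYPE` category and restores the previous value afterwards; the variable names `old_ctype` and `ctype_during` are illustrative only. The "C" locale is the minimal default locale, in which R does not interpret strings as multibyte characters, which is presumably why the validity warning goes away. The likely cost is that non-ASCII text may no longer display correctly while it is active.

```r
# Remember the current character-type locale so it can be restored later.
old_ctype <- Sys.getlocale("LC_CTYPE")

# Switch character handling to the minimal "C" locale.
Sys.setlocale("LC_CTYPE", "C")
ctype_during <- Sys.getlocale("LC_CTYPE")

# ... run the code that triggered the warning here ...

# Restore the original setting when done.
Sys.setlocale("LC_CTYPE", old_ctype)
```

Restoring the old value is optional; it only matters if later code in the same session relies on the original locale.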

answered Sep 24 '17 at 16:53 by www

@lemmingxuan - Does this answer your question? – www Oct 05 '17 at 00:50
More specifically, `Sys.setlocale('LC_CTYPE','C')` worked for me. Thanks! – nicolas.f.g Apr 29 '21 at 09:29
It worked for me as well, but I don't really know what this command does – Reabo Jun 22 '22 at 13:54

score 1 · Answer 3 · answered Oct 21 '22 at 14:45


Setting `useBytes = TRUE` in `gsub()` seems to work best. For example, `gsub('pattern text', 'replacement text', x, useBytes = TRUE)`, where `x` is the character vector to be processed.

answered Oct 21 '22 at 14:45 by LyricalStats9

How is this not a duplicate of the answer by crsh? – IRTFM Oct 27 '22 at 23:06

score 0 · Answer 4 · answered Jun 15 '22 at 14:40

If you get this warning just after or during building your vignettes during the check() of your package, then it is probably linked to this problem: https://github.com/r-lib/rcmdcheck/issues/140
If you update {processx} and {rcmdcheck}, this should work better.
