I ran into a problem reading a CSV file that contains Chinese text in RStudio on Linux.

The call and the error are shown below.

dt <- read.csv(file = "/home/..../aa-0912.csv", header = TRUE, sep = ",")

Error in make.names(col.names, unique = TRUE) : 
  invalid multibyte string at '<be><ba><b5><c3><c8><cb>'

The CSV file was written by RStudio on Windows without specifying an encoding, as below:

write.csv(file = "/home/.../aa-0912.csv", data)

I can read the file correctly on Windows, but after copying it to my Linux system read.csv no longer works.

The locale on Linux is:

Sys.getlocale()

[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"

The locale on Windows is:

LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

I tried reading the data with encoding="utf-8", but I got a similar error message.

Any help?

BenMorel
Patric
  • possible duplicate of [How to read excel file in Chinese character \[R\]?](http://stackoverflow.com/questions/19722561/how-to-read-excel-file-in-chinese-character-r) – Gago-Silva Jan 09 '14 at 10:30
  • possible duplicate of [R: invalid multibyte string](http://stackoverflow.com/questions/4993837/r-invalid-multibyte-string) – themel Jan 10 '14 at 09:47

1 Answer

I'm not sure this fully answers your question.

I'll try to be as general as possible so that people having the same trouble in any language might find a solution.

First, in a terminal, run locale -a to list all the locales available on your system.

Once you have found the right locale, set it in RStudio:

Sys.setlocale("LC_ALL","fr_FR.utf8") 
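The two steps above can also be done from inside R. This is only a sketch for a Linux system: the `zh_CN` prefix is an assumption about which locale you need, and the names `available`, `zh`, and `old` are made up for illustration.

```r
# List the installed locales (the same output as `locale -a` in a terminal).
available <- system("locale -a", intern = TRUE)

# Keep only the candidates for Chinese; "zh_CN" is an assumed prefix.
zh <- grep("^zh_CN", available, value = TRUE)
print(zh)

# Switch the character-type locale only when a candidate exists.
if (length(zh) > 0) {
  old <- Sys.getlocale("LC_CTYPE")   # remember the current setting
  Sys.setlocale("LC_CTYPE", zh[1])
  # ... call read.csv() here ...
  Sys.setlocale("LC_CTYPE", old)     # restore the previous setting
}
```

Changing only LC_CTYPE rather than LC_ALL leaves number, date, and message formatting untouched.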

Sorry, I don't seem to have a Chinese locale on my system, so I could not test this with Chinese text. Other people have had the same issue: here and here

Also have a look at ?Sys.setlocale in R.
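An alternative that leaves the locale alone is to tell read.csv how the file is encoded on disk through its fileEncoding argument. The sketch below assumes the file is in GBK. That is only a guess: the bytes in the error message are not valid UTF-8 and look like a Chinese Windows code page. The temporary file and the names `tmp` and `dt` are made up so that the example is self-contained.

```r
# Build a tiny CSV whose header is the two bytes c8 cb (the last two bytes
# from the error message), followed by two data rows.
tmp <- tempfile(fileext = ".csv")
writeBin(c(as.raw(c(0xc8, 0xcb)), charToRaw("\n1\n2\n")), tmp)

# Declare the on-disk encoding (GBK is an assumption); R converts the text
# to the session encoding while reading. check.names = FALSE keeps the
# header exactly as it appears in the file.
dt <- read.csv(tmp, header = TRUE, sep = ",",
               fileEncoding = "GBK", check.names = FALSE)
print(names(dt))
str(dt)
```

Note that fileEncoding differs from encoding: encoding="utf-8" only marks the strings that were read as UTF-8 and does not convert the bytes in the file, which may be why it did not help here. iconvlist() shows the encoding names your system supports.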

Community
DJJ