We are cleaning some marketing data in Traditional Chinese. R reads UTF-8 Traditional Chinese variable names without any problem. However, we cannot get valid UTF-8 output when printing the values. For example,
If we run: unique(rframe$性別)
This is what we get: [1] "\u5973" "\u7537"
Here 性別 means "gender," \u5973 is the escaped form of 女 (female), and \u7537 is the escaped form of 男 (male).
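
A minimal sketch of how the escaped output could be examined, not a confirmed diagnosis. It assumes the values are ordinary R character strings and recreates them directly instead of reading the CSV file; the variable names x, enc, and loc are made up for illustration.

```r
# Recreate the two values from the output above using ASCII-only escapes
x <- c("\u5973", "\u7537")

# Check how R has marked the strings and which character-type locale is active
enc <- Encoding(x)
loc <- Sys.getlocale("LC_CTYPE")
print(enc)
print(loc)

# print() escapes characters it cannot display in the active locale, so the
# result here depends on loc (a "C" locale typically leads to \u escapes)
print(x)

# Assumption: a UTF-8 locale named "en_US.UTF-8" is installed on this machine.
# If loc is not a UTF-8 locale, switching to one is worth trying:
# Sys.setlocale("LC_CTYPE", "en_US.UTF-8")
```

If loc turns out to be "C" or another non-UTF-8 locale on the Mac but a UTF-8 locale on Linux, that difference would be a plausible explanation for the escaped output.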
The most interesting thing is that R on Linux produces valid UTF-8 Chinese output from the same UTF-8 CSV file. Why can the same RStudio produce valid UTF-8 Chinese output on Linux but not on the Mac?
This troublesome issue has been around for a long while. In fact, with an older RStudio version we could get valid UTF-8 output. Can anyone help us?
Much obliged.
Chandler