
R appears not to handle Arabic text very well, though it is possible to type short Arabic strings directly, for example:

Names <- c("سليم", "سعيد", "مجدى")

I now use Word or Excel to write longer lists of names and save the file as text. I can import the file into R (RStudio), and the imported data displays correctly. However, I cannot manipulate the imported list: plotting, for example, produces garbled characters. Why can directly typed lists (which are tedious to enter) be plotted, but not the imported list?

I am using Windows 7, R 3.0.2, and RStudio to read the file.

Any help with using Arabic text in R would be appreciated. Thanks.

Gavin Simpson
Shabana

1 Answer


If you save your text with UTF-8 encoding (for example, in RStudio, create a text file, then choose "Save with Encoding..." from the menu and select UTF-8), you can read it easily:

readLines('d:/temp/arabic.txt',encoding='UTF-8')
[1] "\"سليم\" \"سعيد\" \"مجدى\""

Or using scan:

scan("arabic",encoding='UTF-8',what='character',sep=',')
Read 3 items
[1] "سليم"    " سعيد"   " مجدى  "
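As a rough end-to-end sketch of the same idea, the snippet below writes a small list of names to a UTF-8 file and reads it back. The temporary file and the variable names (`names_ar`, `path`, `imported`) are made up for illustration and stand in for a real exported list. The whitespace stripping is an extra step, not part of the answer above; it is there because `scan` kept the padding around each item in the output shown.

```r
# Hypothetical sample data; a temp file stands in for the exported list.
names_ar <- c("سليم", "سعيد", "مجدى")
path <- tempfile(fileext = ".txt")

# Write the strings as raw UTF-8 bytes, bypassing the native locale encoding.
writeLines(enc2utf8(names_ar), path, useBytes = TRUE)

# Read the file back and mark the strings as UTF-8.
imported <- readLines(path, encoding = "UTF-8")

# Strip leading/trailing whitespace (gsub is used because trimws()
# only exists in R >= 3.2.0).
imported <- gsub("^\\s+|\\s+$", "", imported)

# Character counts should be per letter, not per byte.
print(nchar(imported, type = "chars"))
```

If the counts come back larger than the number of letters, the strings were probably read in the wrong encoding, and the file is worth re-saving as UTF-8.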
agstudy
  • Thanks for the guidance. It works for small data. scan() was more accurate; with readLines I got an end-of-file error... Since my database is in Excel, I copied the Arabic names into Notepad++, where I could format the text correctly before saving the file. Thanks again. – Shabana Jan 20 '14 at 20:58
  • @Shabana عفوا :) You are welcome! For readLines, open the file and press Enter at the end of the file (to add a final newline); this should fix the problem! – agstudy Jan 20 '14 at 22:35
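The end-of-file message from the comments is not quoted, so the sketch below rests on an assumption: that it was R's "incomplete final line" warning, which `readLines` raises when the last line has no newline. It shows two possible ways around it: silencing the warning with `warn = FALSE`, or appending the missing newline as suggested above. The file and variable names are hypothetical.

```r
path <- tempfile(fileext = ".txt")

# Write two names as raw UTF-8 bytes, deliberately without a final newline.
writeBin(charToRaw(enc2utf8("سليم\nسعيد")), path)

# Option 1: read anyway and suppress the incomplete-final-line warning.
lines_quiet <- readLines(path, encoding = "UTF-8", warn = FALSE)

# Option 2: append the missing newline, then read normally.
cat("\n", file = path, append = TRUE)
lines_fixed <- readLines(path, encoding = "UTF-8")

print(length(lines_quiet))
print(length(lines_fixed))
```

Both calls should return the same two lines; the warning concerns only the missing terminator, not the content.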