
I just started working with R in Arabic, as I plan to do text analysis and text mining on a Hadith corpus. I have been reading threads related to my question but still can't manage to get the real basics right (sorry, absolute beginner).

So, I entered: textarabic.v <- scan("data/arabic-text.txt", encoding = "UTF-8", what = "character", sep = "\n")

What comes out in textarabic.v is, of course, symbols (pic). Before this, I saved my text as UTF-8, as I had read in a thread, but still nothing shows in Arabic.

I can type Arabic in R, but scan brings the text in as symbols.


I have also read and tried to use other users' code for making Arabic text work, but I don't even know how or where to apply it. I have installed the tm and NLP packages in R.

What do you suggest I do next? Thanks in advance,

Esc6
  • Welcome to Stack Overflow! Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 Mar 28 '17 at 11:20
  • Can I assume you are on Windows? If that's the case, I've had terrible experiences with encodings there. *nix OSes appear to handle UTF-8 quite well, though. – Roman Luštrik Mar 28 '17 at 11:21
  • I am using OS X at the moment. – Esc6 Mar 28 '17 at 14:48

1 Answer


I had just posted an answer saying that you must be using R on Windows before I saw your comment that you're on OS X. On OS X the situation is not quite so dire. The problem is that you're using too old a version of R. If I remember correctly, anything prior to 3.2 does not handle Unicode correctly. Try installing 3.3.3 from https://cran.r-project.org/bin/macosx/ and, if necessary, re-install the packages you need. Then you should be fine. بالتوفيق! (Good luck!)
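To check whether the upgrade helped, something like the sketch below might be useful. It prints the R version and locale of the session, then writes one Arabic word to a temporary file as UTF-8 and reads it back with the same scan call as in the question. The temporary file and the sample word are stand-ins I made up for illustration; with the real corpus you would point scan at data/arabic-text.txt instead.

```r
# Which R version and locale is this session running with?
# (On OS X a UTF-8 locale such as "en_US.UTF-8" is what you would hope to see.)
print(getRversion())
print(Sys.getlocale("LC_CTYPE"))

# Hypothetical sample: the word "بالتوفيق" written with Unicode escapes,
# so the script itself stays plain ASCII.
sample_text <- "\u0628\u0627\u0644\u062a\u0648\u0641\u064a\u0642"

# Write the raw UTF-8 bytes to a temporary file (stand-in for the real corpus file).
tmp <- tempfile(fileext = ".txt")
writeLines(sample_text, tmp, useBytes = TRUE)

# Read it back as in the question; encoding = "UTF-8" marks the strings as UTF-8.
textarabic.v <- scan(tmp, what = "character", sep = "\n",
                     encoding = "UTF-8", quiet = TRUE)

# Inspect the result: the declared encoding and the number of characters read.
print(Encoding(textarabic.v))
print(nchar(textarabic.v, type = "chars"))
print(textarabic.v)
```

If the last line prints Arabic letters, reading works and the console can display them. If it prints escapes or symbols while Encoding() still reports UTF-8, the data is probably fine and only the display is off.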

Sixtyfive