I am using Anaconda Python 2.7 for Arabic text classification. When I print a word or a list of words, it appears as Unicode escape sequences. I want to print the actual Arabic words. The list contains [Arabic sentence, label].
from nltk.corpus.reader import CategorizedPlaintextCorpusReader
reader = CategorizedPlaintextCorpusReader('mypath\\', r'(\w+)\.txt', cat_pattern=r'(\w+)\.txt', encoding='utf-8')
document = reader.words('fileid')
document[0]
Output:
[[u'\u0631\u0626\u064a\u0633', u'\u0627\u0644\u0628\u0631\u0644\u0645\u0627\u0646', ...], 'Politic']
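One likely explanation for the output above: in Python 2.7, printing a list shows the repr of each element, so unicode strings appear as \uXXXX escapes even though the text itself was decoded correctly. Printing the strings themselves (or a joined string) should show the Arabic characters. The following is only a minimal sketch, assuming the data has the [words, label] shape shown above and that the console can display Arabic; the `sample` value and the `words_to_text` helper are hypothetical names used for illustration, not part of NLTK.

```python
def words_to_text(words):
    """Join a list of unicode words into one unicode string.

    Hypothetical helper for illustration. Printing the joined string
    shows the characters, whereas printing the list shows each
    element's repr (the \\uXXXX escapes in Python 2.7).
    """
    return u' '.join(words)


# Assumed sample, shaped like the output in the question: [words, label].
sample = [[u'\u0631\u0626\u064a\u0633', u'\u0627\u0644\u0628\u0631\u0644\u0645\u0627\u0646'], 'Politic']

if __name__ == '__main__':
    words, label = sample
    # Print the sentence as text rather than as a list repr.
    print(words_to_text(words))
    print(label)
    # Alternatively, print one word per line.
    for word in words:
        print(word)
```

If the console encoding cannot represent Arabic (for example some Windows command prompts), `print` may raise a `UnicodeEncodeError`; in that case writing the text to a UTF-8 encoded file may be an easier way to inspect it.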