I have an Arabic Unicode string that I want to print in Python (using Python(x,y) on Windows 7), but I can't get it to display correctly: I get either an error or a garbled representation instead of the Arabic text.

The string is defined as:

ss = u'\u0647\u0630\u0627 \u0647\u0648 \u0627\u0644\u062d\u0644 \u0627\u0644\u0648\u062d\u064a\u062f \u0644\u0644\u0645\u0634\u0643\u0644\u0629 \u0627\u0644\u062a\u064a \u0646\u0648\u0627\u062c\u0647\u0647\u0627'

and should look like this: "هذا هو الحل الوحيد للمشكلة التي نواجهها"

When I try to print it, I get the following error:

print ss
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\encodings\cp1252.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_table)
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-2: character maps to <undefined>

When I encode the string, say with cp1256 (Windows Arabic), it gives a wrong representation:

print ss.encode('cp1256')
åÐÇ åæ ÇáÍá ÇáæÍíÏ ááãÔßáÉ ÇáÊí äæÇÌååÇ

I have looked at several questions here related to printing unicode from Python on Windows, but nothing seems to work.

Any ideas?

Thanks.

UPDATE: I am using Spyder IDE (bundled with Python(x,y) on Windows 7).

UPDATE2: I already tried all the solutions in the "duplicate" questions, but none worked.

Mohamed Aly

1 Answer

Your console is configured to use codepage 1252 (Windows Western European, often loosely called Latin 1), which contains no Arabic characters and so cannot encode your codepoints. Switch your console to a codepage that can display them.
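As a minimal sketch of what is going wrong, the snippet below uses only the first two words of the string from the question and a hypothetical helper named can_encode (not part of any library). It shows that cp1252 has no mapping for the Arabic letters, and that cp1256 bytes interpreted as cp1252 give the garbled text from the question. It prints nothing, so it should behave the same on Python 2.7 and Python 3.3+:

```python
# First two words of the question's string ("هذا هو"), written as escapes
# so the source file itself stays pure ASCII.
ss = u'\u0647\u0630\u0627 \u0647\u0648'


def can_encode(text, codec):
    """Hypothetical helper: True if every character of text exists in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 has no Arabic letters, hence the UnicodeEncodeError on print.
cp1252_ok = can_encode(ss, 'cp1252')   # False

# cp1256 (Windows Arabic) can represent the string.
cp1256_ok = can_encode(ss, 'cp1256')   # True

# A console still set to cp1252 interprets the cp1256 bytes as Western
# European letters, which reproduces the garbled output in the question.
mojibake = ss.encode('cp1256').decode('cp1252')   # u'\xe5\xd0\xc7 \xe5\xe6'
```

In other words, the cp1256 bytes are most likely correct; it is the console's interpretation of them that is wrong, which is why changing the codepage is suggested.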

You could switch to 1256:

chcp 1256

or switch to 65001 (the UTF-8 codepage), which should be able to handle any Unicode codepoint. You may also have to change the font used by your console, though; Lucida Sans is reported to display most of Unicode.
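After changing the codepage, it may help to check which encoding Python actually picked up for its output stream. The sketch below is only an illustration under stated assumptions: printable_in is a hypothetical helper name, and the fallback to 'ascii' is my own choice for the case where sys.stdout reports no encoding (for example when output is redirected):

```python
import sys

# The encoding Python uses for printing; it may be None or missing in some
# environments, so fall back to 'ascii' (an assumption, not a fixed rule).
stdout_encoding = getattr(sys.stdout, 'encoding', None) or 'ascii'


def printable_in(text, encoding):
    """Hypothetical helper: replace characters the encoding lacks with '?'.

    This avoids the UnicodeEncodeError, at the cost of losing the
    characters that the target encoding cannot represent.
    """
    return text.encode(encoding, 'replace').decode(encoding)


heh = u'\u0647'  # ARABIC LETTER HEH, the first letter of the question's string

lossy = printable_in(heh + u'a', 'cp1252')   # Arabic letter becomes '?'
intact = printable_in(heh, 'cp1256')         # survives unchanged
```

If stdout_encoding still reports cp1252 in a fresh Python session, the codepage change has probably not reached the environment Python is running in.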

Martijn Pieters