
I have a source.txt file that contains this line: ËÇÈÊÉ ÇáÊØåíÑ ÇÓÊåáÇß

But in Arabic it should read: ثابتة التطهير استهلاك

The text is Arabic, and to display it correctly I have to change the encoding manually in Notepad++ from ANSI to Windows-1256. I have a lot of files, so I wrote this Python code:

with open("source.txt", 'r', encoding='ansi') as file_in:
    text = ""
    for line in file_in:
        text = text+line

with open("ARABIC.txt", 'w', encoding='cp1256') as f:
    f.write(text)

But I get this error message:

Traceback (most recent call last):
  File "C:\Users\user\Desktop\arabiccode\SHOW_ARABIC.py", line 7, in <module>
    f.write(text)
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python37_64\lib\encodings\cp1256.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 0-4: character maps to <undefined>

  • You've misunderstood the problem. You can't convert your files to windows-1256 because they already *are* windows-1256. The issue is that your editor doesn't recognize that. The solution is to teach your editor better, or to *read* the files as windows-1256 and *write* them as UTF-8 (which everything, including Notepad++, will probably auto-detect okay). – hobbs Jun 16 '23 at 15:04
  • You face a [mojibake](https://en.wikipedia.org/wiki/Mojibake) case: `'ثابتة التطهير استهلاك'.encode('cp1256') == 'ËÇÈÊÉ ÇáÊØåíÑ ÇÓÊåáÇß'.encode('cp1252')` returns **`True`**… – JosefZ Jun 16 '23 at 18:03
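
The two comments above can be sketched in Python roughly as follows. This is a minimal sketch, assuming the files really are windows-1256 as the comments suggest; the helper name `convert_to_utf8` and its `source_encoding` parameter are hypothetical, chosen only for illustration. It decodes each file as cp1256 and saves it as UTF-8, and it also shows that the garbled line from the question is the same bytes viewed through cp1252.

```python
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="cp1256"):
    """Decode src with source_encoding and save the text to dst as UTF-8.

    Hypothetical helper: the name and default encoding are assumptions
    based on the comments, not something given in the question.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
    return text


# The garbled line from the question is what cp1256 bytes look like when
# they are displayed as cp1252. Re-encoding as cp1252 recovers the original
# bytes, and decoding those as cp1256 gives the Arabic text.
garbled = "ËÇÈÊÉ ÇáÊØåíÑ ÇÓÊåáÇß"
repaired = garbled.encode("cp1252").decode("cp1256")
```

For the files in the question, a call such as `convert_to_utf8("source.txt", "ARABIC.txt")` would be the equivalent of the original script, with the read encoding set to cp1256 and the write encoding set to UTF-8 instead of the other way round.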
