I'm translating my Python application to French. I generated a .po file, but the French characters are displayed incorrectly.

Example:

exécution appears as exÚcution

PS: I'm using gettext for translation.

Even running chcp 1252 doesn't fix it. I'm using PyDev, and printing my data to the PyDev console works, but that's not what I want. This is how I add my handlers to the logger; maybe that's the problem:

if givesFileName:
    if FileName is None:
        print('Please specify an output Text File Name')
        # Exit with a non-zero status to signal the error
        sys.exit(1)
    # Create the file handler
    fh = logging.handlers.RotatingFileHandler(FileName, mode="w", encoding="utf-8")
    fh.setLevel(logging.DEBUG)
    logger.addHandler(fh)
else:
    # Create the console handler
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    logger.addHandler(ch)
user3328690
  • Most likely your file is not encoded as ISO-8859-1 then. Can you show us a hex dump or Python `repr()` output of some of the `.po` file lines? – Martijn Pieters Mar 17 '14 at 12:38
  • msgid "" msgstr "" "Last-Translator: \n" "Language-Team: French\n" "MIME-Version: 1.0\n" "Content-Type: text/plain; charset=iso-8859-1\n" "Content-Transfer-Encoding: utf-8\n" "Plural-Forms: nplurals=2; plural=(n > 1);\n" #: main.py:183 msgid "" "\n" " **Application ** \n" msgstr "\n **Application ** \n" #: main.py:184 msgid " Start: Function Name = Aggreg : \n" msgstr " Début: Exècution de la fonction Aggrég : \n" #: main.py:185 msgid " Retained Parametres : \n" msgstr " Les Paramètres retenus sont : \n" – user3328690 Mar 17 '14 at 12:45
  • Ah, no, your data is being **encoded** wrong again. How are you outputting your translated data? – Martijn Pieters Mar 17 '14 at 12:45
  • logging.info(_("my data")) – user3328690 Mar 17 '14 at 12:55

1 Answer

Your output bytes are Latin 1, but they are being decoded as IBM-850 (code page 850), not as Latin 1:

>>> print u'exécution'.encode('latin1').decode('ibm850')
exÚcution

This means your data is read correctly from the file, but on output it is encoded with a code page that does not match the one used to display it.
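
The same round trip can be reproduced in Python 3 as a small sketch. The helper name `simulate_display` is hypothetical and only models "the program writes bytes in one encoding, the console reads them in another"; it does not touch a real console.

```python
def simulate_display(text, written_as, displayed_as):
    """Encode text the way the program would write it, then decode the
    resulting bytes the way the console would interpret them."""
    raw_bytes = text.encode(written_as)
    return raw_bytes.decode(displayed_as)


# Program writes Latin 1, console interprets the bytes as code page 850:
# the byte 0xE9 means "é" in Latin 1 but "Ú" in cp850.
mismatched = simulate_display("exécution", "latin-1", "cp850")

# Program and console agree on cp850: the text survives unchanged.
matched = simulate_display("exécution", "cp850", "cp850")
```

Here `mismatched` is `'exÚcution'`, the garbled form from the question, and `matched` is `'exécution'`. The bytes themselves are never wrong; only the two sides' assumptions about them differ.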

Martijn Pieters
  • Thanks for the answer. So what should I do to fix that? – user3328690 Mar 17 '14 at 12:51
  • @user3328690: That depends on how your program outputs text. You'll need to provide us information about this. Is your app a web application? Desktop app? Console app? That's not clear here. – Martijn Pieters Mar 17 '14 at 12:54
  • My application is a console application, and it outputs data with `logging.info(_("my data"))` – user3328690 Mar 17 '14 at 12:56
  • @user3328690: Is this on the Windows console? – Martijn Pieters Mar 17 '14 at 12:58
  • So your Windows console is misconfigured. Python is told to use the Latin 1 codec when writing to the console, but the console is then displaying the data written to the console using the IBM 850 codec. – Martijn Pieters Mar 17 '14 at 13:03
  • What happens when you run `chcp` in the console? What codec does Windows report it is using? – Martijn Pieters Mar 17 '14 at 13:04
  • It prints `Active page code : 850`. In fact, I should tell you that my output depends on the user: if he specifies a file, I add a file handler to the logger; if he doesn't, I add a console handler. When I try to write the output data to the file I get this error: `UnicodeDecodeError: 'ascii' codec can't decode byte 0xe9 in position 18: ordinal not in range(128)`, and when I print it to the console it's displayed incorrectly. Thanks for your answers – user3328690 Mar 17 '14 at 13:12
  • @user3328690: Logging to a file assumes that you are encoding Unicode strings yourself, or are using a `io.open()` file object that encodes Unicode for you. – Martijn Pieters Mar 17 '14 at 13:13
  • @user3328690: Your incorrect output on the *console* means your console appears to be misconfigured. Python thinks it needs to encode to Latin 1, but it *should* encode to IBM850 instead. You can use `chcp 1252` in the console to fix this for now, I think. – Martijn Pieters Mar 17 '14 at 13:15
  • even chcp 1252 didn't fix that :-( – user3328690 Mar 17 '14 at 13:23
  • @user3328690: I see another post from someone with the same issue: [Python, windows console and encodings (cp 850 vs cp1252)](http://stackoverflow.com/q/9226516) – Martijn Pieters Mar 17 '14 at 13:24
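
One way to make the console handler from the question encode explicitly, instead of relying on whatever Python guesses for the console, is sketched below for Python 3 (the thread itself uses Python 2). The helper `make_console_handler`, the logger name `encoding_demo`, and the in-memory `io.BytesIO` standing in for the console are all illustrative assumptions, not part of the asker's program.

```python
import io
import logging


def make_console_handler(byte_stream, encoding):
    """Build a StreamHandler that encodes log text with an explicit codec
    before writing it to the given binary stream."""
    text_stream = io.TextIOWrapper(
        byte_stream, encoding=encoding, errors="replace", line_buffering=True
    )
    handler = logging.StreamHandler(text_stream)
    handler.setLevel(logging.DEBUG)
    return handler


# An in-memory buffer stands in for a console that expects code page 850.
fake_console = io.BytesIO()

logger = logging.getLogger("encoding_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo output out of the root logger
logger.addHandler(make_console_handler(fake_console, "cp850"))

logger.info("exécution")

# The bytes that reached the "console"; é is the single byte 0x82 in cp850.
written = fake_console.getvalue()
```

In a real program the binary stream would presumably be `sys.stdout.buffer`, and the encoding would be whatever `chcp` reports (850 in this thread). On Python 3.7 and later, `sys.stdout.reconfigure(encoding=...)` may be a simpler route to the same result.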