I am using Python's logging library, configured through a conf file as described at https://realpython.com/python-logging/, and I wrote this code:

log_conf.conf:

[loggers]
keys=root, sampleLogger

[handlers]
keys= consoleHandler, fileHandler

[formatters]
keys=fileFormatter, consoleFormatter

[logger_root]
level=DEBUG
handlers=fileHandler,consoleHandler

[logger_sampleLogger]
level=DEBUG
handlers=consoleHandler
qualname=sampleLogger
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=INFO
formatter=consoleFormatter
args=(sys.stdout,)

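[handler_fileHandler]
class=handlers.TimedRotatingFileHandler
formatter=fileFormatter
level=DEBUG
# fileConfig ignores extra keys such as interval= and backupCount=,
# so the rotation settings go in args: (filename, when, interval, backupCount)
args=('../logs/log.log', 'midnight', 1, 5)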

[formatter_fileFormatter]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s

[formatter_consoleFormatter]
format=%(message)s 


main.py:

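import logging
import logging.config

# Load the handler/formatter setup from the config file shown above.
logging.config.fileConfig(fname='../configs/log_conf.conf',
                          disable_existing_loggers=False)
logger = logging.getLogger('main')

logger.info('Hello')
logger.info('سلام')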

Logging a string that contains English characters, such as "Hello", works without any problem. But the string "سلام", which contains Persian/Arabic characters, raises an exception:

--- Logging error ---
Traceback (most recent call last):
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\logging\__init__.py", line 1028, in emit
    stream.write(msg + self.terminator)
  File "C:\Users\user\AppData\Local\Programs\Python\Python37\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 40-43: character maps to <undefined>
Call stack:
  File "D:/Alireza/Code/addresstomaplocation/main/main.py", line 11, in <module>
    logger.info('سلام')
Message: 'سلام'
Arguments: ()

So I tried encoding the string as UTF-8. This works, but the log file is clearly not human-readable:

logger.info('سلام'.encode('utf-8'))

Output in the log file:

2020-09-16 18:55:00,949 - main - INFO - b'\xd8\xb3\xd9\x84\xd8\xa7\xd9\x85'
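
The `b'...'` text in that log line is the printable form of a bytes object: logging calls `str()` on the message, and for bytes that yields the escaped representation rather than the characters. A minimal sketch of that behaviour, using only the string from the question (the variable names are illustrative):

```python
text = "سلام"
encoded = text.encode("utf-8")

# logging converts the message with str(); for a bytes object this
# produces the escaped b'...' form, which is what lands in the log file.
logged = str(encoded)
print(logged)

# Decoding with the same codec restores the original characters.
restored = encoded.decode("utf-8")
print(restored == text)
```

So the escaped output is not corruption, just bytes rendered as text; the readable form needs the handler itself to write the `str` with a suitable encoding.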

My question is: is there any way to write Persian characters to the log file without encoding them, so that the log stays human-readable?

Alireza Mazochi
  • `logger.info('%r', my_unicode_string)` - this will log the `repr` of the string rather than attempting to encode the characters. – snakecharmerb Sep 16 '20 at 15:19
  • But that probably won't be much more readable. It might be better to set your terminal to handle UTF-8 and set the `PYTHONIOENCODING` environment variable to `UTF-8` for your program. – snakecharmerb Sep 16 '20 at 15:37
  • @snakecharmerb, thanks for helping! I tried `logger.info('%r', 'سلام'.encode('utf-8'))` and got the same output as before: `b'\xd8\xb3\xd9\x84\xd8\xa7\xd9\x85'`. Then I tried `logger.info('%r', 'سلام')` and got the exception. I can't find the difference between my code and the code you suggested. Please give me an example or more explanation. – Alireza Mazochi Sep 16 '20 at 17:22
  • Yes. So the better plan is to run your code in a UTF-8 environment: see [how to use unicode characters in the windows command line](https://stackoverflow.com/questions/388490/how-to-use-unicode-characters-in-windows-command-line) and [Python, unicode and the Windows console](https://stackoverflow.com/questions/5419/python-unicode-and-the-windows-console) – snakecharmerb Sep 16 '20 at 18:53
  • Thanks again @snakecharmerb. I am working with the PyCharm IDE. These links talk about cmd. Is there no difference between these environments? – Alireza Mazochi Sep 17 '20 at 06:49
  • I use neither PyCharm nor Windows, so I can't provide much detailed advice. However, the `UnicodeEncodeError` happens because the encoding of the output target (the PyCharm terminal emulator) can't encode the Persian characters. You could try setting `PYTHONIOENCODING` in the environment variable setting [here](https://www.jetbrains.com/help/pycharm/interactive-console.html#python-console-settings). – snakecharmerb Sep 17 '20 at 07:57
  • Alireza, are you interested in collaborating? (Tehran) – Iman Nia Sep 30 '20 at 19:02
  • @Iman, hello. Please email me so we can talk that way. My Gmail address is armazochi@gmail.com – Alireza Mazochi Oct 04 '20 at 04:15

1 Answer

I think the logging module is somehow picking up the cp1252 encoding on the console stream. Setting the environment variable PYTHONIOENCODING=utf8 doesn't fix it, but on Python 3.7 or later, setting PYTHONUTF8=1 (which forces UTF-8 defaults everywhere) made it work for me. I logged the following to the console (cmd.exe, with an appropriate font):

Hello
سلام

and the following to the log file:

2020-09-17 13:52:51,169 - main - INFO - Hello
2020-09-17 13:52:51,170 - main - INFO - سلام

I don't have PyCharm, but the environment variable should work as long as you restart PyCharm after setting it.
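
If changing the environment is not an option, another approach that may be worth trying is to give the file handler an explicit encoding. The traceback in the question reports positions 40-43, which matches the 40-character prefix that the file formatter puts in front of the message, so the failing write may well be the log file opened with the locale default (cp1252) rather than the console. The sketch below is an assumption-laden illustration, not the asker's setup: it trims the config to a single file handler, writes to a temporary directory instead of '../logs/log.log', and passes the config as a string. The positional `args` are (filename, when, interval, backupCount, encoding).

```python
import io
import logging
import logging.config
import os
import tempfile

# Hypothetical location for this sketch; the question uses '../logs/log.log'.
log_path = os.path.join(tempfile.mkdtemp(), "log.log")

# Same layout as the question's config, reduced to the file handler.
# The fifth positional argument of TimedRotatingFileHandler is the encoding.
CONFIG = f"""
[loggers]
keys=root

[handlers]
keys=fileHandler

[formatters]
keys=fileFormatter

[logger_root]
level=DEBUG
handlers=fileHandler

[handler_fileHandler]
class=handlers.TimedRotatingFileHandler
level=DEBUG
formatter=fileFormatter
args=({log_path!r}, 'midnight', 1, 5, 'utf-8')

[formatter_fileFormatter]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
"""

# fileConfig also accepts a file-like object, so no config file is needed here.
logging.config.fileConfig(io.StringIO(CONFIG), disable_existing_loggers=False)
logger = logging.getLogger("main")

logger.info("Hello")
logger.info("سلام")

# Flush and close the handlers before reading the file back.
logging.shutdown()

with open(log_path, encoding="utf-8") as log_file:
    contents = log_file.read()
print(contents)
```

In the original conf file the equivalent change would be to extend the `args` line of `[handler_fileHandler]` with the encoding; the console handler is left out here only to keep the sketch short.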

Mark Tolonen
  • Your suggestion works for me! Thanks a lot, my friend! – Alireza Mazochi Sep 18 '20 at 09:25
  • For other readers, this is a good link on setting PYTHONUTF8=1: [link](https://stackoverflow.com/questions/50933194/how-do-i-set-the-pythonutf8-environment-variable-to-enable-utf-8-encoding-by-def) – Alireza Mazochi Sep 18 '20 at 09:28
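
To confirm that PYTHONUTF8=1 actually reached the interpreter (for example after restarting the IDE), a small check along these lines may help. It is only a sketch: the function name `describe_text_defaults` is made up for illustration, and `sys.flags.utf8_mode` requires Python 3.7 or later.

```python
import locale
import sys


def describe_text_defaults():
    """Collect the settings that decide how text is encoded by default."""
    return {
        # True when UTF-8 mode is active (PYTHONUTF8=1 or -X utf8).
        "utf8_mode": bool(sys.flags.utf8_mode),
        # Encoding that open() uses when none is passed explicitly.
        "preferred_encoding": locale.getpreferredencoding(False),
        # Encoding of the console stream; it may be missing if stdout is replaced.
        "stdout_encoding": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_text_defaults().items():
    print(f"{name}: {value}")
```

If `utf8_mode` prints as False, the variable was probably not visible to the process that started Python.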