I was debugging a Python 3 script and every time I tried to print a variable, it gave me the following error:

ipdb> inputs
*** UnicodeEncodeError: 'ascii' codec can't encode character '\u22f1' in
  position 314: ordinal not in range(128)

I tried setting the default encoding to UTF-8 with sys.setdefaultencoding() and adding # -*- coding: utf-8 -*- at the top of the script, but neither worked.

Yamaneko

1 Answer

TL;DR: export LANG=C.UTF-8


As @mike explains, this happens because Python takes its encoding from the environment it was started in. If it can't find a usable encoding there, it falls back to its default, 'ascii'.
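To see what Python actually picked up, a small diagnostic along these lines can help. It is only a sketch: the helper names `describe_encodings` and `can_encode` are made up for illustration, and checking `LC_ALL` and `PYTHONIOENCODING` alongside `LANG` is my own addition, since both can also influence the encoding used for output.

```python
import locale
import os
import sys


def describe_encodings():
    """Collect the settings that decide how text is encoded on output."""
    return {
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
        # Encoding derived from the locale, without changing the locale.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used when printing; may be missing if stdout is replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    for name, value in describe_encodings().items():
        print(f"{name}: {value!r}")
    # '\u22f1' is the character from the traceback in the question.
    print("ascii ok:", can_encode("\u22f1", "ascii"))
    print("utf-8 ok:", can_encode("\u22f1", "utf-8"))
```

If `stdout` or `preferred` reports an ASCII encoding even though `LANG` looks right, the environment variable is probably not what decides the outcome in that session.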

My solution was to change the locale as described in this answer. I first tried export LANG=en_US.UTF-8 and, to my surprise, it didn't work. However, another locale, export LANG=C.UTF-8, as suggested by another answer, did.
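I never confirmed why en_US.UTF-8 failed for me. One possible explanation, and it is only an assumption, is that the locale was not installed on that machine (this is common in minimal Docker images). A rough way to check which locale names the system accepts, using a hypothetical helper `locale_is_available`:

```python
import locale


def locale_is_available(name):
    """Return True if the C library accepts `name` as a locale."""
    # Calling setlocale without a value only queries the current setting.
    saved = locale.setlocale(locale.LC_CTYPE)
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    finally:
        # Restore whatever was active before the probe.
        locale.setlocale(locale.LC_CTYPE, saved)
    return True


if __name__ == "__main__":
    for candidate in ("en_US.UTF-8", "C.UTF-8", "C"):
        print(candidate, locale_is_available(candidate))
```

A name that prints False here cannot take effect through LANG either, so trying the candidates this way is quicker than exporting them one by one.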

Yamaneko
  • I'm having the same problem with `ipdb`, and setting `LANG=C.UTF-8` doesn't make any difference. I run my tests with this pre-command before `tox`: `eval $( ./dockerfiles/local_env.sh ) tox -e ....`. As you can imagine, `local_env.sh` contains some values I need to set to run my tests. It already contains the right ones, which I verified with `eval $( ./dockerfiles/local_env.sh ) env | grep LANG`; that returns `LANG=C.UTF-8`. So it should work when `ipdb` kicks in, but it doesn't. – Andrea Grandi Oct 10 '19 at 06:25
  • One more thing: if I run `os.environ.get('LANG')` while I'm inside the `ipdb` debugger, it prints `'C.UTF-8'`, so the value was set correctly. – Andrea Grandi Oct 10 '19 at 06:38
  • @AndreaGrandi Interesting, I don't know what might be happening. It took me a couple of tries to find a `LANG` that worked, so it may be worth trying other values as well, but I'm not sure it's the same issue. – Yamaneko Oct 12 '19 at 22:08