
I have an executable Python 3.6 CGI script that is running on an Apache2 server, Ubuntu 18.04.

When the script executes the line `print("<p>Something about latitude x°</p>")`, it throws this error:

'ascii' codec can't encode character '\xb0' in position 203: ordinal not in range(128)

This happens even though the encoding is specified as UTF-8 in the HTML head with `<meta charset="utf-8">`.

When I try to force UTF-8 on the string with `.encode('utf-8')`, i.e. `print("<p>Something about latitude x°</p>".encode('utf-8'))`, the error disappears, but the degree sign shows up in the page as escaped UTF-8 bytes:

Something about latitude x\xc2\xb0
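For anyone puzzled by the `\xc2\xb0` output: `print()` on a `bytes` object writes its repr, escapes included, rather than the bytes themselves. Here is a minimal sketch of that behaviour, plus one way to write the raw bytes instead. The helper name `write_utf8` is made up for illustration and is not part of the original script.

```python
import sys

line = "<p>Something about latitude x°</p>"

# An ASCII-only stream cannot represent '°' (U+00B0), hence the original error.
try:
    line.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

encoded = line.encode("utf-8")

# print(encoded) writes the repr of the bytes object, so the page shows
# the b'...' literal with \xc2\xb0 instead of the degree sign.
printed_form = repr(encoded)


def write_utf8(text):
    """Hypothetical helper: send text to stdout as raw UTF-8 bytes,
    bypassing whatever codec the text layer of sys.stdout was given."""
    sys.stdout.flush()
    sys.stdout.buffer.write(text.encode("utf-8") + b"\n")
    sys.stdout.buffer.flush()
```

Calling `write_utf8(line)` in the CGI script should send the two bytes `C2 B0` to the browser, which then decodes them as `°` provided the page is declared as UTF-8.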

I tried setting the environment variable with `export PYTHONIOENCODING=UTF-8` in `/etc/environment`, and also set it globally in `/etc/profile.d/python-encoding.sh`. I then reloaded both files using `source` and restarted the Apache2 server with `systemctl restart apache2`, but to no avail. CGI version: 2.6.

Skyoxic
  • you should use the [HTML entity](https://www.freeformatter.com/html-entities.html) `&deg;` in place of `°` and the browser will display `°` in place of `&deg;` - `"<p>Something about latitude x&deg;</p>"` – furas Sep 11 '20 at 10:35
  • @furas thanks for the tip. This works only if the text was entered directly into the html, but in many cases the content is actually fetched from other servers so I need to solve the encoding issue in order to display the page correctly. – Skyoxic Sep 11 '20 at 11:38
  • how about `text.replace('°', '&deg;')`? Strangely, you can use `html.unescape('&deg;')` to get `°`, but `html.escape('°')` doesn't give `&deg;`. I found only `'°'.encode('ascii', 'xmlcharrefreplace')` – furas Sep 11 '20 at 16:35
  • See also [Python CGI - UTF-8 doesn't work](https://stackoverflow.com/questions/14860034/python-cgi-utf-8-doesnt-work). – furas Sep 11 '20 at 16:42
  • @furas declaring the default charset and setting the environment variable for Python encoding in Apache2 .conf did the trick. Thanks man. – Skyoxic Sep 11 '20 at 17:17
  • you can mark your answer as accepted and later upvote it :) – furas Sep 11 '20 at 17:24
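To illustrate the entity workaround discussed in the comments for text fetched from elsewhere: a rough sketch that replaces every non-ASCII character with a named entity where the standard library knows one, and a numeric reference otherwise. The function name `to_entities` is invented for this example. It only sidesteps the encoding problem rather than fixing it.

```python
from html.entities import codepoint2name


def to_entities(text):
    """Replace non-ASCII characters with HTML entities so the result is pure ASCII."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)  # plain ASCII passes through unchanged
        elif code in codepoint2name:
            parts.append("&" + codepoint2name[code] + ";")  # named entity, e.g. &deg;
        else:
            parts.append("&#%d;" % code)  # numeric character reference
    return "".join(parts)


# The one-liner from the comments yields numeric references as bytes.
numeric = "°".encode("ascii", "xmlcharrefreplace")
```

With this, `print(to_entities("<p>Something about latitude x°</p>"))` emits only ASCII, so it should work even when stdout is stuck on the `ascii` codec.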

1 Answer

Declaring the default charset and setting the environment variable for Python encoding in the Apache2 .conf file did the trick, as described in [Python CGI - UTF-8 doesn't work](https://stackoverflow.com/questions/14860034/python-cgi-utf-8-doesnt-work) and pointed out by @furas.
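If editing the server configuration is not an option, the encoding can probably also be pinned inside the script itself. This is an alternative sketch, not what was done above: it wraps the binary layer of stdout in a text stream that always encodes as UTF-8. The names `utf8_writer` and `emit_page` are hypothetical.

```python
import io


def utf8_writer(binary_stream):
    """Wrap a binary stream in a text layer that always encodes as UTF-8,
    regardless of locale or PYTHONIOENCODING."""
    return io.TextIOWrapper(binary_stream, encoding="utf-8", newline="\n")


def emit_page(out, body):
    """Write a minimal CGI response to the text stream `out`."""
    # Header block: declare the charset, then a blank line before the body.
    out.write("Content-Type: text/html; charset=utf-8\n\n")
    out.write('<html><head><meta charset="utf-8"></head><body>\n')
    out.write(body + "\n")
    out.write("</body></html>\n")
    out.flush()


# In the CGI script itself (assumed usage; keep a reference to the wrapper):
#   import sys
#   out = utf8_writer(sys.stdout.buffer)
#   emit_page(out, "<p>Something about latitude x°</p>")
```

Keep the wrapper in a variable for as long as output is needed, because a discarded `TextIOWrapper` closes the stream underneath it. On Python 3.7 and later, `sys.stdout.reconfigure(encoding="utf-8")` does the same job, but it is not available on 3.6.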

Skyoxic