If you create this app.py:

#!/usr/bin/python3
print("Content-Type: text/plain;charset=utf-8\n")
print("Hello World!")

and you enable CGI in .htaccess with:

Options +ExecCGI
AddHandler cgi-script .py

it works: http://example.com/app.py displays "Hello World!".

However if you add an accented character:

print("Quel été!")

this does not work anymore: the output page is empty in the browser.

Question: how do you output UTF-8 content with Python 3 + mod_cgi?

NB:

  • the .py file is saved with UTF8 encoding.

  • Might be related: https://redmine.lighttpd.net/issues/2471

  • Fun fact: running

    #!/usr/bin/python3
    import sys
    print("Content-Type: text/plain;charset=utf-8\n")
    print(str(sys.stdout.encoding))
    

    from the command line gives UTF-8, but running it through mod_cgi outputs ANSI_X3.4-1968 at http://example.com/app.py.

    Running export PYTHONIOENCODING=utf8 in the shell did not change anything, probably because it is Apache + mod_cgi that launches the Python code, not that shell.
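
To see more of what the interpreter is given under mod_cgi, a slightly fuller diagnostic can help. This is only a sketch: `encoding_report` and `main` are hypothetical names, and the list of environment variables is my guess at the ones worth checking. The response is written as bytes so the report itself cannot fail on encoding.

```python
import locale
import os
import sys


def encoding_report(environ=os.environ):
    """Collect the settings that influence the encoding of sys.stdout."""
    lines = [
        "stdout encoding: %s" % getattr(sys.stdout, "encoding", None),
        "preferred encoding: %s" % locale.getpreferredencoding(False),
    ]
    # Variables that can change the encoding Python picks at startup.
    for name in ("PYTHONIOENCODING", "PYTHONUTF8", "LC_ALL", "LC_CTYPE", "LANG"):
        lines.append("%s=%s" % (name, environ.get(name, "<unset>")))
    return "\n".join(lines)


def main():
    body = "Content-Type: text/plain;charset=utf-8\n\n" + encoding_report() + "\n"
    # Bypass the text layer: bytes go out unchanged whatever sys.stdout.encoding is.
    sys.stdout.buffer.write(body.encode("utf-8"))
    sys.stdout.buffer.flush()
```

Call `main()` at the end of the script when deploying it as app.py, and compare the page with the output from the command line.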

Basj
  • There's probably an (encoding-related) exception which doesn't show in the browser (you'd expect an error 500, but I haven't ever used Python CGI, so I don't know). There are some environment-based heuristics involved in setting the encoding of the standard streams when the Python interpreter starts; see [here](https://docs.python.org/3/library/sys.html#sys.stdout). On a *nix platform, you might be able to fix this via the `PYTHONIOENCODING` env variable. – lenz May 19 '20 at 06:07
  • @lenz I just did a few tests, I edited the question (see last point); it seems that doing `export PYTHONIOENCODING=utf8` in bash did not change anything. Any idea? – Basj May 19 '20 at 06:38

1 Answer


Easy solution

Just add:

SetEnv PYTHONIOENCODING utf8

in the .htaccess, along with:

Options +ExecCGI
AddHandler cgi-script .py

See also "overwrite python3 default encoder when using apache server" and "Problème python apache et utf8". Also related, although no answer there solved it directly: "Set encoding in Python 3 CGI scripts".
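
To check the effect of PYTHONIOENCODING without touching Apache, you can spawn a child interpreter with a controlled environment. This is a minimal sketch, not what mod_cgi does internally: `run_child` and `CHILD_CODE` are hypothetical names, and forcing `ascii` is only my way of mimicking the ANSI_X3.4-1968 encoding reported in the question.

```python
import os
import subprocess
import sys

# The child program is pure ASCII (escapes instead of literal accents), so
# passing it on the command line does not depend on the locale.
CHILD_CODE = "print('Quel \\u00e9t\\u00e9!')"


def run_child(io_encoding):
    """Run CHILD_CODE in a fresh interpreter with PYTHONIOENCODING=io_encoding.

    Returns (exit code, raw stdout bytes).
    """
    env = dict(os.environ)
    env["PYTHONIOENCODING"] = io_encoding
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return proc.returncode, proc.stdout
```

With `run_child("utf8")` the child should exit normally and emit UTF-8 bytes; with `run_child("ascii")` the `print` call raises UnicodeEncodeError and nothing reaches stdout, which matches the empty page seen in the browser.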

Alternate solution

For Python 3 up to 3.6 (see "How to set sys.stdout encoding in Python 3?"):

import sys, codecs
sys.stdout = codecs.getwriter("utf-8")(sys.stdout.detach())

For Python 3.7+:

import sys
sys.stdout.reconfigure(encoding='utf-8')
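
If the same script has to run on both older and newer Python 3 versions, the two variants can be combined. This is a sketch only: `force_utf8` and `main` are hypothetical names, and I have not tested every Python 3 release.

```python
import codecs
import sys


def force_utf8(stream):
    """Return a text stream that encodes to UTF-8, whatever encoding it had."""
    if hasattr(stream, "reconfigure"):
        # Python 3.7+: change the encoding of the existing stream in place.
        stream.reconfigure(encoding="utf-8")
        return stream
    # Older Python 3: detach the binary buffer and wrap it in a UTF-8 writer.
    return codecs.getwriter("utf-8")(stream.detach())


def main():
    sys.stdout = force_utf8(sys.stdout)
    print("Content-Type: text/plain;charset=utf-8\n")
    print("Quel été!")
```

Call `main()` at the end of the CGI script. The replacement has to happen before the first `print`, otherwise the header would still go through the old encoding.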

Note: although it is cited in some answers, the following script does not work with Apache 2.4 + mod_cgi + Python 3:

import locale                                  # Ensures that subsequent open()s 
locale.getpreferredencoding = lambda: 'UTF-8'  # are UTF-8 encoded.
import sys
sys.stdin = open('/dev/stdin', 'r')            # Re-open standard files in UTF-8 
sys.stdout = open('/dev/stdout', 'w')          # mode.
sys.stderr = open('/dev/stderr', 'w') 
print("Content-Type: text/plain;charset=utf-8\n")
print(str(sys.stdout.encoding))  # I still get: ANSI_X3.4-1968
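
My guess (not verified) is that the script above fails because `open()` is called without an `encoding` argument and the `locale` monkeypatch is not what it consults. A variant that passes the encoding explicitly avoids the question entirely. This is a sketch: `utf8_writer` is a hypothetical helper, and I have only reasoned about it, not run it under Apache.

```python
import os
import sys


def utf8_writer(fd):
    """Wrap an already-open file descriptor in a UTF-8 text stream."""
    # closefd=False leaves the descriptor open when the wrapper is closed.
    return open(fd, "w", encoding="utf-8", closefd=False)
```

In a CGI script this would be used as `sys.stdout = utf8_writer(sys.stdout.fileno())` before the first `print`.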

Basj