I'm trying to get a Python 3.4 CGI script running under Apache to output an 'ü' character in the browser (the same problem occurs for any other non-ASCII character). The Python 3.4 CGI script raises a UnicodeEncodeError under Apache, while equivalent Python 2.7 code works fine on the same server. Both the 3.4 and 2.7 scripts work fine from the command line.

This is the error I get when the Python 3.4 script runs under Apache:

UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 23: ordinal not in range(128)

Here's the code that causes that error:

#!/usr/local/bin/python3
# -*- coding: utf-8 -*-

print ("Content-Type: text/html; charset=utf-8\n\n")
print ("""\
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
""")

print ("U umlaut (Python 3.4): ü<br>")

print ("""\
</body>
</html>
""")

The Python 2.7 script below, running on the same server, displays ü and any other non-ASCII characters correctly (so it's not an Apache problem?):

#!/usr/bin/python
# -*- coding: utf-8 -*-

print "Content-Type: text/html; charset=utf-8\n\n"
print """\
<html>
<head>
<meta charset="UTF-8">
</head>
<body>
"""

print "U umlaut (Python 2.7): ü<br>"

print """\
</body>
</html>
"""

Both scripts work correctly from the command line. I already have

AddDefaultCharset UTF-8

in my httpd.conf.

Also, my locale variables are set as follows:

LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL="en_US.UTF-8"
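
These values come from a login shell, and they may not be what the CGI process sees, since Apache does not necessarily pass the shell environment on to the scripts it launches. A minimal diagnostic sketch, assuming it is saved as a separate CGI script and requested through Apache, reports what Python actually picked up. The `encoding_report` helper is a hypothetical name; `PYTHONIOENCODING` is the standard Python environment variable that overrides the stdio encoding.

```python
#!/usr/local/bin/python3
import locale
import os
import sys


def encoding_report():
    """Collect the values that decide how print() encodes text."""
    return {
        "stdout_encoding": sys.stdout.encoding,
        "preferred_encoding": locale.getpreferredencoding(False),
        "LANG": os.environ.get("LANG"),
        "LC_ALL": os.environ.get("LC_ALL"),
        "PYTHONIOENCODING": os.environ.get("PYTHONIOENCODING"),
    }


if __name__ == "__main__":
    # Everything printed here is ASCII-only (note the !a conversion),
    # so the report itself cannot trigger the UnicodeEncodeError.
    print("Content-Type: text/plain; charset=utf-8\n")
    for name, value in sorted(encoding_report().items()):
        print("{}: {!a}".format(name, value))
```

If the page shows an ASCII-family stdout encoding and `None` for `LANG` and `LC_ALL`, the shell settings above are not reaching the script.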

I already included the UTF-8 setting everywhere I could think of (sometimes excessively). Does anyone know what else I can do to get the Python 3.4 script to display Unicode characters correctly in the browser? Thanks.

nclas_

1 Answer

I know it has been a while since you asked, but I stumbled over this question while facing the same problem, and I found a solution. Maybe it will not help you anymore, but it may help others who come looking.

Jack O'Connor's solution fixed the problem for me. Just try this:

import sys
sys.stdout = open(sys.stdout.fileno(), mode='w', encoding='utf8', buffering=1)
print("日本語")
# Also works with other methods of writing to stdout:
sys.stdout.write("日本語\n")
sys.stdout.buffer.write("日本語\n".encode())
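
To see why rewrapping stdout helps, the following sketch reproduces the failure without Apache. It is an illustration under assumptions, not a trace of what the server does: the `render` helper and the in-memory `io.BytesIO` target are hypothetical stand-ins for the CGI process's stdout. A text stream configured as ASCII rejects 'ü', while the same stream configured as UTF-8 encodes it into two bytes.

```python
import io


def render(encoding):
    """Write the page line through a text stream that uses `encoding`."""
    raw = io.BytesIO()  # stands in for the byte-level stdout
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    stream.write("U umlaut (Python 3.4): ü<br>\n")
    stream.flush()
    return raw.getvalue()


# An ASCII stream fails the same way the CGI script does under Apache.
failure = None
try:
    render("ascii")
except UnicodeEncodeError as exc:
    failure = exc
    print("ascii stream failed:", exc.reason)

# A UTF-8 stream accepts the same text and produces encoded bytes.
utf8_bytes = render("utf-8")
print(utf8_bytes)
```

Reopening `sys.stdout` with `encoding='utf8'`, as in the snippet above, puts the script in the second case regardless of which locale Python detected at startup.
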
Joe Platano