This prints "Test: £17" when run from the local console, but only prints "Test: " when run from the web browser. How can I rectify the issue when loaded through the browser? Thanks!

#!/usr/bin/python3.2
print ("Content-Type: text/html")
print ("")

y = "£17"
print ("Test:", y)
Martijn Pieters
stackity

1 Answer

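When printing to a console, Python encodes str values to bytes using the encoding it detects from the terminal's locale. A web server usually runs a CGI script without that locale, so Python falls back to ASCII, which cannot represent "£".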
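To make the difference concrete, here is a minimal sketch that encodes the same string with two codecs. It assumes the CGI process ends up with an ASCII stdout, which is typical when no locale is set but may differ on your server; the variable names are only illustrative.

```python
text = "£17"

# A UTF-8 console can represent the pound sign (as two bytes).
utf8_bytes = text.encode("utf-8")

# With an ASCII stdout the same character cannot be encoded at all.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

Here utf8_bytes is b'\xc2\xa317' and ascii_failed is True, which matches the symptom: the output stops at the first character the stream cannot encode.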

Encode explicitly when sending to a browser, by writing bytes directly to sys.stdout.buffer (in Python 3, sys.stdout itself is a text stream and only accepts str):

#!/usr/bin/python3.2
import sys
out = sys.stdout.buffer  # binary layer underneath the text stream
out.write(b"Content-Type: text/html; charset=utf8\r\n")
out.write(b"\r\n")

y = "£17"
out.write("Test: {0}\r\n".format(y).encode(encoding='utf8'))

Note that HTTP header lines should end with \r\n (carriage return, line feed). I've also added the encoding to the Content-Type header so the browser knows how to decode the bytes again.
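As a rough illustration of why the charset in the header matters, the sketch below decodes the same bytes twice: once with the declared encoding and once with a wrong guess. Latin-1 is only an assumed example of a wrong guess; what a real browser falls back to depends on its own heuristics.

```python
body = "Test: £17".encode("utf-8")

# Decoded with the charset named in the header, the text round-trips.
as_declared = body.decode("utf-8")

# Decoded with a wrong guess (Latin-1 here), the two UTF-8 bytes of the
# pound sign turn into two separate characters.
wrong_guess = body.decode("latin-1")
```

as_declared is "Test: £17" again, while wrong_guess comes out as "Test: Â£17".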

For HTML, you really want to use character entity references instead of Unicode code points:

y = "&pound;17"
out.write("Test: {0}\r\n".format(y).encode(encoding='utf8'))

at which point you could also just use ASCII as your encoding.
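If you would rather not type entities by hand, here is a small sketch of two ways to produce them. to_named_entities is a hypothetical helper written for this example, not a standard library function, and it does not escape &, < or >; the second line uses the built-in xmlcharrefreplace error handler, which emits numeric references instead of named ones.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII characters with HTML entity references."""
    parts = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            parts.append(ch)  # plain ASCII passes through unchanged
        elif code in codepoint2name:
            parts.append("&{0};".format(codepoint2name[code]))
        else:
            parts.append("&#{0};".format(code))  # no named entity known
    return "".join(parts)


named = to_named_entities("£17")
numeric = "£17".encode("ascii", "xmlcharrefreplace")
```

named is the str "&pound;17" and numeric is the bytes b"&#163;17"; both are pure ASCII, so either can be sent with any ASCII-compatible charset.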

If you really, really, really want to use print(), then re-open stdout with the correct encoding:

utf8stdout = open(1, 'w', encoding='utf-8', closefd=False) # fd 1 is stdout

print("Content-Type: text/html; charset=utf8", end='\r\n', file=utf8stdout)
print("", end='\r\n', file=utf8stdout)

y = "£17"
print("Test:", y, end='\r\n', file=utf8stdout)

You could simplify that somewhat with functools.partial():

from functools import partial
utf8print = partial(print, end='\r\n', file=utf8stdout)

then use utf8print() without the extra keywords:

utf8print("Content-Type: text/html; charset=utf8")
utf8print("")
# etc.
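To check what actually goes over the wire without involving a web server, here is a sketch that points the same kind of partial at an in-memory buffer. The BytesIO object is only a stand-in for the real stdout, and the names raw and utf8out are assumptions made for this example.

```python
import io
from functools import partial

# In-memory stand-in for the byte stream the web server would read.
raw = io.BytesIO()

# newline="" stops the wrapper from translating the "\n" inside "\r\n".
utf8out = io.TextIOWrapper(raw, encoding="utf-8", newline="")
utf8print = partial(print, end="\r\n", file=utf8out)

utf8print("Content-Type: text/html; charset=utf8")
utf8print("")
utf8print("Test:", "£17")
utf8out.flush()  # push buffered text through to the byte stream

response = raw.getvalue()
```

response then holds the header line, the blank line and the body, each ending in \r\n, with the pound sign encoded as the two bytes \xc2\xa3.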

Also see the Python Unicode HOWTO for details on how Python sets output encoding, as well as this question here on Stack Overflow about printing and encoding.

Martijn Pieters
  • Many thanks! The last option worked for the test case and for what I am working on. I wanted to use the first suggestion but it didn't seem to be working right =/ This has been a huge help, and I learned a bunch too, and is going to impact the way I do several things. Thanks again. – stackity Dec 23 '12 at 00:47
  • it seems as if there are output maximums with this method, I'm troubleshooting now. If I use utf8print too much I get a server error. – stackity Dec 23 '12 at 06:48
  • @stackity: I doubt that anything maxes out; more likely you made a different error. :-) – Martijn Pieters Dec 23 '12 at 10:04
  • I came back to delete the comment (hoping that no one had responded), you're right, my shared hosting was limiting the amount of threads I could make in comparison to my localhost. My diagnosing was incorrect. Many thanks again! – stackity Dec 23 '12 at 12:40