I want to send an HTML page to the web browser encoded as UTF-8. However the following example fails:

from wsgiref.simple_server import make_server

def app(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html'),
        ('Content-Length', str(len(output))),
    ])
    return output

port = 8000
httpd = make_server('', port, app)
print("Serving on", port)
httpd.serve_forever()

Here's the traceback:

Serving on 8000
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 75, in run
    self.finish_response()
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 116, in finish_response
    self.write(data)
  File "/Library/Frameworks/Python.framework/Versions/3.1/lib/python3.1/wsgiref/handlers.py", line 202, in write
    "write() argument must be a string or bytes"

If I remove the encoding and simply return the Python 3 Unicode string, the wsgiref server seems to encode it in whatever charset the browser specifies in the request header. However, I'd like to control this myself, as I doubt I can expect all WSGI servers to do the same. What should I do to return a UTF-8 encoded HTML page?

Thanks!

pthulin
  • Your output should be a unicode literal since (1) you're using non-ASCII characters in it and (2) encoding it. Probably not the cause of your current problem but it will bite you in the ass someday. – Max Shawabkeh Feb 01 '10 at 01:23
  • Are you referring to writing a string like u'Räksmörgås'? I don't need to do that as I'm in Python 3 – pthulin Feb 01 '10 at 09:00

3 Answers

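You need to return the page as a list. WSGI expects the application to return an iterable of byte strings; in Python 3, iterating over a bare bytes object yields integers, which is why write() rejects the data: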

def app(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Content-Length', str(len(output)))
    ])

    return [output]

WSGI is designed this way so that you can also just yield the HTML, either complete or in parts.
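As a minimal sketch of the yielding variant, the app below sends the page in parts. Splitting the page into three chunks is only an assumption for illustration, and Content-Length is left out because the total size is not computed up front here.

```python
def app(environ, start_response):
    # Hypothetical split of the page into parts, for illustration only.
    chunks = ["<html><body>", "<p>Räksmörgås</p>", "</body></html>"]
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
    ])
    for chunk in chunks:
        # Encode each part explicitly so the server never has to pick a charset.
        yield chunk.encode('utf-8')
```

Because the function contains yield, calling it returns a generator, which is itself an iterable of bytes objects, so no list is needed.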

AndiDog

Edit /usr/lib/python2.7/site.py (for example with vim) and change

encoding = "ascii" # Default value set by _PyUnicode_Init()

to

encoding = "utf-8"

then reboot the system.

This forces Python 2.7 to use UTF-8 as its default encoding. mod_wsgi picks up Python's default encoding, which was previously ASCII and therefore limited to 128 characters.

  • 1
    Please consider translating the last paragraph into English, unless it is not too important/relevant. – Drew Dec 19 '13 at 16:58
AndiDog's answer is correct, but in some environments you have to rename app to application:

def application(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Content-Length', str(len(output)))
    ])
    return [output]
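
If you would rather keep the original name, a small sketch of an alternative is to expose the same callable under both names. This assumes the server looks up a module-level callable named application, as mod_wsgi does by default.

```python
def app(environ, start_response):
    output = "<html><body><p>Räksmörgås</p></body></html>".encode('utf-8')
    start_response('200 OK', [
        ('Content-Type', 'text/html; charset=utf-8'),
        ('Content-Length', str(len(output)))
    ])
    return [output]

# Assumption: the hosting server imports this module and looks for "application".
application = app
```
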
Raf
  • This does not provide an answer to the question. Once you have sufficient [reputation](https://stackoverflow.com/help/whats-reputation) you will be able to [comment on any post](https://stackoverflow.com/help/privileges/comment); instead, [provide answers that don't require clarification from the asker](https://meta.stackexchange.com/questions/214173/why-do-i-need-50-reputation-to-comment-what-can-i-do-instead). – Yaron Jan 11 '18 at 08:29