2

I'm using Pylons, and some of my URLs contain non-English characters, such as:

http://localhost:5000/article/111/文章标题

In most cases this is not a problem, but in my login module, after a user has logged out, I try to get the referer from request.headers and redirect to that URL.

if user_logout:
    referer = request.headers.get('referer', '/')
    redirect(referer)

Unfortunately, if the URL contains non-English characters and the browser is IE, it reports the following error (Firefox is fine):

  WebError Traceback:
  UnicodeDecodeError: 'ascii' codec can't decode byte 0xd5 in position 140: ordinal not in range(128) 
URL: http://localhost:5000/users/logout

Module weberror.evalexception:431 in respond

There is a way to fix it (though not a good one): use urllib.quote() to convert the URL before redirecting.

referer = quote_path(referer)  # quote only the path of the URL
redirect(referer)

This is not a good solution, because it only works when the browser is IE, and it is tedious. Is there a better solution?
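The quote_path helper is not shown in the question, so the following is only a guess at what it might look like. It is a minimal sketch written for Python 3, where the functions live in urllib.parse (in the Python 2 used by Pylons they are urllib.quote, urlparse.urlsplit and urlparse.urlunsplit). It assumes the URL is already available as text and that its non-ASCII characters should be percent-encoded as UTF-8.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def quote_path(url):
    """Percent-encode only the path component of a URL.

    Hypothetical helper: the name comes from the question, the body is an assumption.
    """
    parts = urlsplit(url)
    # '/' keeps the path segments intact; '%' is left alone so that a path
    # which is already percent-encoded is not encoded a second time.
    quoted_path = quote(parts.path, safe="/%")
    return urlunsplit(
        (parts.scheme, parts.netloc, quoted_path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(quote_path("http://localhost:5000/article/111/文章标题"))
```

Because '%' is treated as safe, calling the helper on a URL that Firefox has already percent-encoded should leave it unchanged, so the same code path could serve both browsers.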

Freewind
  • 193,756
  • 157
  • 432
  • 708

3 Answers

1

Try checking the RFC for non-ASCII URLs. If I remember correctly, they are converted to an ASCII equivalent. You could then redirect to that.

Edit: According to @ssokolov (see comments below):

The specific terms to look up are IDN (Internationalized Domain Names) and Punycode
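To show what those terms refer to, here is a small sketch using Python's built-in "idna" codec. The host name bücher.example is an illustrative assumption, not something from the question. Note that IDN and Punycode apply to the host part of a URL; the path in the question (/article/111/文章标题) is not covered by them and is normally percent-encoded instead.

```python
def host_to_ascii(host):
    """Convert an internationalized host name to its ASCII (Punycode) form.

    Illustrative helper; relies on Python's built-in 'idna' codec.
    """
    return host.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(host_to_ascii("bücher.example"))
    print(host_to_ascii("localhost"))
```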

Daren Thomas
  • 67,947
  • 40
  • 154
  • 200
1

In the end I still have not found a good solution, so I use this code:

referer = urllib.quote(referer, '.:/?=;-%#')

It seems to work well now, but I don't feel it is safe.
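One way to see what that call does is to run it against a few sample URLs. The sketch below is for Python 3 (urllib.parse.quote instead of Python 2's urllib.quote), and the sample URLs and the quote_referer name are assumptions made for illustration. It suggests one thing to watch for: '&' is not in the safe set, so a referer with several query parameters would have its separators encoded as %26.

```python
from urllib.parse import quote

# The safe characters used in the answer above.
SAFE_FROM_ANSWER = ".:/?=;-%#"


def quote_referer(referer, safe=SAFE_FROM_ANSWER):
    """Percent-encode everything in referer except unreserved and safe characters."""
    return quote(referer, safe=safe)


if __name__ == "__main__":
    # Non-ASCII path characters become UTF-8 percent escapes.
    print(quote_referer("http://localhost:5000/article/111/文章标题"))
    # With the original safe set, '&' is encoded too.
    print(quote_referer("http://localhost:5000/a?x=1&y=2"))
    # Adding '&' to the safe set keeps the query string intact.
    print(quote_referer("http://localhost:5000/a?x=1&y=2", SAFE_FROM_ANSWER + "&"))
```

Keeping '%' in the safe set means an already-encoded referer is passed through unchanged, which matches the behaviour wanted for browsers that encode the URL themselves.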

Freewind
  • 193,756
  • 157
  • 432
  • 708
  • This is the right way of doing it: you should encode every non-ASCII character in a URL. It's in the HTTP specification. – giolekva Sep 07 '10 at 07:11
0

Redirect works by raising an exception, which is caught and converted into an HTTP response. How about specifying the charset for your response?

response.charset = 'utf8'

pyfunc
  • 65,343
  • 15
  • 148
  • 136
  • The bad news is that I have already set `response.charset='utf8'`, and even `request.charset='utf8'` – Freewind Sep 02 '10 at 06:06
  • That means the URL itself is not getting encoded correctly. The fix you have provided can be extended: `referer = quote_plus(url.encode('utf8'))`. This should encode the non-ASCII characters, and then quoting the URL should work. – pyfunc Sep 02 '10 at 07:18