urllib.unquote not properly decoding url

Question

The following works as expected in the Python shell:

>>> import urllib
>>> s='https://www.microsoft.com/de-at/store/movies/american-pie-pr%C3%A4sentiert-nackte-tatsachen/8d6kgwzl63ql'
>>> print urllib.unquote(s)
https://www.microsoft.com/de-at/store/movies/american-pie-präsentiert-nackte-tatsachen/8d6kgwzl63ql

However, when I do the same thing inside a Python program, the URL is decoded incorrectly:

url = res.history[0].url if res.history else res.url
print '1111', url
print '2222', urllib.unquote(url)

1111 https://www.microsoft.com/de-at/store/movies/american-pie-pr%C3%A4sentiert-nackte-tatsachen/8d6kgwzl63ql
2222 https://www.microsoft.com/de-at/store/movies/american-pie-prÃ¤sentiert-nackte-tatsachen/8d6kgwzl63ql

Why is the URL decoded correctly in the Python shell but not in the program?

Try adding the line `# -*- coding: utf-8 -*-` at the top of the file to see if it helps. — Hai Vu, Dec 27 '15 at 05:27
[Why did you post this question twice?](http://stackoverflow.com/questions/34477648/urldecoding-requests) I can't see any difference. — Remi Guan, Dec 27 '15 at 06:14

score 1 · Accepted Answer · answered Dec 27 '15 at 05:33

The following worked to fix the issue:

url = urllib.unquote(str(res.url)).decode('utf-8', 'ignore')

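`res.url` was a unicode string, and `urllib.unquote` does not handle that well: given unicode input in Python 2, it turns each `%XX` escape into a separate character instead of treating the escapes as UTF-8 bytes, which is where `Ã¤` comes from. So the solution was to first convert it to a byte string (as it was in the Python interpreter), unquote that, and then decode the result into Unicode as UTF-8.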
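As a rough sketch of the same idea, the conversion can be wrapped in a small helper. The name `unquote_utf8` is made up for illustration and is not part of `urllib` or `requests`. It assumes the percent-escapes in the URL encode UTF-8 bytes, and it uses `.encode('utf-8')` rather than `str(...)` because `str()` raises on a unicode string that already contains non-ASCII characters. The Python 3 branch is only there so the snippet also runs on newer interpreters, where `urllib.parse.unquote` decodes as UTF-8 by default.

```python
import sys

if sys.version_info[0] == 2:
    from urllib import unquote as _unquote

    def unquote_utf8(url):
        """Hypothetical helper: percent-decode a URL and return unicode."""
        # Turn unicode input into a byte string first, so that each %XX
        # escape becomes a raw byte rather than a separate character.
        if isinstance(url, unicode):
            url = url.encode('utf-8')
        # The unquoted bytes are assumed to be UTF-8.
        return _unquote(url).decode('utf-8')
else:
    from urllib.parse import unquote as _unquote

    def unquote_utf8(url):
        """Hypothetical helper: percent-decode a URL and return str."""
        # Python 3 interprets the escaped bytes as UTF-8 (made explicit here).
        return _unquote(url, encoding='utf-8')


example = u'https://www.microsoft.com/de-at/store/movies/american-pie-pr%C3%A4sentiert-nackte-tatsachen/8d6kgwzl63ql'
decoded = unquote_utf8(example)
# The escaped pair %C3%A4 should now be the single character u'\xe4'.
assert u'pr\xe4sentiert' in decoded
```

In the program from the question, this would be called as `unquote_utf8(res.url)`.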
