I have the following string for example (which was built as I realized from incorrectly encoded string)
This url could be properly decoded by browser showing the following:
Is there a way to unquote/decode the string so it's not presented like this:
https://ja-jp.facebook.com/åå¤å±‹ï½Šï½’ゲートタワーホテル-219123305237478
Browser shows url with same rubbish initially for a short time, but then without redirect it adjustst the string so it looks fine.
I'm trying to fix encoding with this simple code:
def fix_encoding(s):
for a in aliases:
for b in aliases:
try:
fixed = s.encode(a).decode(b)
except:
pass
else:
print (a, b)
print(fixed)
fix_encoding(u'åå¤å±‹ï½Šï½’ゲートタワーホテル-219123305237478')
The best results I've got are pretty close to what it should look like, but 2 first symbols are wrong for all same results. For ex.:
��屋jrゲートタワーホテル-219123305237478
('1252', 'l8')