
I have this string in var1:

var1 = '$a=1%7Cscroll%20on%20%22Page%3A%20Generator-Sets-Construction%3Fid%3Dci%26s%3DY2l8Tj00Mjk0NzQ4MDY5KzQyOTQ5NjM4OTY%3D%22%7C-%7Cscroll%7C1443616500011%7C1443616500586%7C3774$fId=16440287_806$rId=RID_-62268720$rpId=1762047089$domR=1443616443684$time=1443616500588'

How can I change the contents of the string into 'readable' text, i.e. non-URL-encoded?

Based on my research, I tried the code below, but the output still contains the URL-encoded sequences, e.g. %20.

import html
print(html.unescape('$a=1%7Cscroll%20on%20%22Page%3A%20Generator-Sets-Construction%3Fid%3Dci%26s%3DY2l8Tj00Mjk0NzQ4MDY5KzQyOTQ5NjM4OTY%3D%22%7C-%7Cscroll%7C1443616500011%7C1443616500586%7C3774$fId=16440287_806$rId=RID_-62268720$rpId=1762047089$domR=1443616443684$time=1443616500588'))

Any help is appreciated, including pointers to an existing module that does this.

R. Yora
  • Duplicate of this [question](http://stackoverflow.com/questions/8136788/decode-escaped-characters-in-url)? – strubbly Oct 01 '15 at 09:36

1 Answer


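What you are trying to do is unquote (percent-decode) a URL parameter string, not unescape HTML. html.unescape only converts HTML entities such as &amp;amp;, so the %XX sequences are left untouched. The following should work: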

import urllib.parse
print(urllib.parse.unquote('$a=1%7Cscroll%20on%20%22Page%3A%20Generator-Sets-Construction%3Fid%3Dci%26s%3DY2l8Tj00Mjk0NzQ4MDY5KzQyOTQ5NjM4OTY%3D%22%7C-%7Cscroll%7C1443616500011%7C1443616500586%7C3774$fId=16440287_806$rId=RID_-62268720$rpId=1762047089$domR=1443616443684$time=1443616500588'))
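
To make the difference between the two functions concrete, here is a minimal sketch that runs both on a shortened excerpt of the string from the question. The variable names (encoded, html_result, url_result) and the shortened excerpt are illustrative assumptions, not part of the original post.

```python
import html
from urllib.parse import unquote

# Shortened excerpt of the question's string (chosen here for illustration).
encoded = '$a=1%7Cscroll%20on%20%22Page%3A%20Generator-Sets-Construction%3Fid%3Dci%22'

# html.unescape converts HTML entities such as &amp; or &quot;.
# The string contains none, so it comes back unchanged.
html_result = html.unescape(encoded)

# unquote decodes percent-escapes: %7C -> |, %20 -> space, %22 -> ", and so on.
url_result = unquote(encoded)

print(html_result)
print(url_result)
```

The first print shows the input as-is, while the second shows the percent-escapes replaced by the characters they stand for.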
Vikas Ojha