
I am trying to decode a string that mixes plain characters with ASCII character codes.

Example:

g&#108bo&#115w&#111&#114t&#104

But I am not getting the expected output:

'g&#108bo&#115w&#111&#114t&#104'.decode("ascii")

Output:

u'g&#108bo&#115w&#111&#114t&#104'

If I remove the &# characters and try only the integers, I get this:

>>> chr(108)
'l'
>>> chr(115)
's'
>>> chr(111)
'o'
>>> chr(114)
'r'
>>> chr(104)
'h'

Expected output:

glbosworth

How can I decode "g&#108bo&#115w&#111&#114t&#104" to the expected output?

Mounarajan
    Looks kind of like a string with weird, randomly escaped HTML entities. `html.unescape('g&#108bo&#115w&#111&#114t&#104')` returns `'glbosworth'` – Jon Clements Sep 02 '17 at 12:50

2 Answers


You are trying to decode an HTML-escaped string. You can use the html.unescape(s) function to do so (on Python 3):

import html
print(html.unescape('g&#108bo&#115w&#111&#114t&#104'))

Output:

glbosworth

Take a look at this SO answer for more info.
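To connect this with the chr() calls in the question, here is a minimal sketch of what the decoding amounts to for decimal references only. The helper name decode_numeric_refs is made up for illustration, and it does not handle named or hexadecimal entities, so html.unescape is still the better choice in practice:

```python
import re


def decode_numeric_refs(text):
    """Replace decimal references such as &#108 (trailing ';' optional) with chr(108)."""
    # Hypothetical helper for illustration; covers decimal references only.
    return re.sub(r"&#(\d+);?", lambda m: chr(int(m.group(1))), text)


print(decode_numeric_refs("g&#108bo&#115w&#111&#114t&#104"))  # glbosworth
```

Each match captures the digits after &#, converts them with int(), and substitutes the result of chr(), which is the same thing the question did by hand.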

ShmulikA
  • On Python 3.4 and later you can use html.unescape:

    import html
    print(html.unescape('g&#108bo&#115w&#111&#114t&#104'))
    
  • On Python 2.x you can use HTMLParser:

    from HTMLParser import HTMLParser
    h = HTMLParser()
    print(h.unescape('g&#108bo&#115w&#111&#114t&#104'))
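If the same script has to run on both versions, the two options can be combined with an import fallback. This is only a sketch: the module-level name unescape is my own choice, and I have only checked the output on Python 3, so the Python 2 branch may treat references without a trailing semicolon differently.

```python
try:
    # Python 3.4+: html.unescape is available in the standard library.
    from html import unescape
except ImportError:
    # Python 2.x fallback: use the unescape method of an HTMLParser instance.
    # Assumption: behaviour for references without ';' is not verified here.
    from HTMLParser import HTMLParser
    unescape = HTMLParser().unescape

print(unescape('g&#108bo&#115w&#111&#114t&#104'))
```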
    
BPL