Convert HTML Entity to Python Emoji

Question

Say I have the following HTML emoji entity: '&#x1f604 ;'

Note that there isn't actually a space between the 4 and the ; in the real entity. It's only there so that the entity doesn't render as a smiley.

The emoji's Python form is: u"\U0001f604"

How do I convert all HTML emoji entities to their Python form?

Things I have tried so far:

Encode to utf-8
Unescape the text using HTML Parser and then convert
Use a regex (I couldn't find one that worked for all of the HTML emoji entities; it's not as simple as swapping &#x with \U000, as that only works for some entities)

Possible duplicate: http://stackoverflow.com/questions/2087370/decode-html-entities-in-python-string — Robᵩ, Mar 04 '16 at 20:16
I agree that it is a duplicate. It turns out the solutions on that question did not work for me (I had looked at it before posting this) because Python 2.7.10's HTMLParser seems to be buggy — GangstaGraham, Mar 04 '16 at 20:32

score 5 · Accepted Answer · answered Mar 04 '16 at 20:12

In Python 2, HTMLParser.HTMLParser().unescape does just that (after import HTMLParser):

In [3]: HTMLParser.HTMLParser().unescape( '&#x1f604;' )
Out[3]: u'\U0001f604'
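
For Python 3, a minimal sketch of the same idea, assuming Python 3.4 or later, where html.unescape is available in the standard library. The function names decode_entities and decode_numeric_entities are hypothetical, chosen only for illustration. The second function is a regex fallback that parses the code point as a number rather than rewriting the prefix; a plain prefix swap fails because a \U escape needs exactly eight hex digits, while entities carry a variable number of digits.

```python
import html
import re

# Matches numeric character references such as &#x1f604; (hex) or &#128516; (decimal).
_NUMERIC_ENTITY = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_entities(text):
    """Decode named and numeric HTML entities with the standard library."""
    return html.unescape(text)


def decode_numeric_entities(text):
    """Hypothetical regex fallback: convert only numeric references to characters."""
    def _replace(match):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(code_point)
        except (ValueError, OverflowError):
            # Out-of-range code point: leave the reference untouched.
            return match.group(0)

    return _NUMERIC_ENTITY.sub(_replace, text)


sample = "smile: &#x1f604; heart: &#x2764;"
# ascii() keeps the output printable on terminals that cannot show emoji.
print(ascii(decode_entities(sample)))
print(ascii(decode_numeric_entities(sample)))
```

Unless there is a specific reason to avoid it, html.unescape is probably the better choice, since it also handles named entities such as &amp; that the regex sketch deliberately leaves alone.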

Robᵩ

I don't get that output for some reason – GangstaGraham Mar 04 '16 at 20:13
>>> import HTMLParser >>> HTMLParser.HTMLParser().unescape('&#x1f604;') returns '&#x1f604;' – GangstaGraham Mar 04 '16 at 20:13
What version of Python are you using? – Robᵩ Mar 04 '16 at 20:14
version Python 2.7.10 – GangstaGraham Mar 04 '16 at 20:14
Then I don't know. The above test works with Python 2.7.6 on Ubuntu 14.04.3 LTS. – Robᵩ Mar 04 '16 at 20:18
Yeah, I'm not sure either. I just checked it on CoderPad, and it worked as in your example. Not sure why my version of Python is acting up – GangstaGraham Mar 04 '16 at 20:19
Maybe it's a bug with 2.7.10. Will try to update my Python version – GangstaGraham Mar 04 '16 at 20:21
I just ran it on my computer's Python 3 with html.unescape('&#x1f604;') and it worked, so I suspect the problem is just my version of Python 2 (which I believe is the standard version of Python on a Mac) – GangstaGraham Mar 04 '16 at 20:30
Thanks for your help! – GangstaGraham Mar 04 '16 at 20:30
