I would like to iterate over a string and output all the emojis in it.
I'm trying to iterate over the characters and check each one against an emoji list.
However, Python seems to split some Unicode characters into smaller pieces, which breaks my code. Example:
>>> list(u'Test \U0001f60d')
[u'T', u'e', u's', u't', u' ', u'\ud83d', u'\ude0d']
Any ideas why u'\U0001f60d' gets split?
Or what's a better way to extract all emojis? This was my original extraction code:
def get_emojis(text):
    emojis = []
    for character in text:
        if character in EMOJI_SET:
            emojis.append(character)
    return emojis
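For context, the two pieces in the output above (u'\ud83d' and u'\ude0d') look like a UTF-16 surrogate pair, which is what a "narrow" Python 2 build would produce for any code point above U+FFFF. Assuming that is the cause, here is a minimal sketch of one possible workaround: treat a surrogate pair as a single unit when iterating. The name get_emojis_by_code_point and the contents of EMOJI_SET below are hypothetical placeholders, not part of my real code.

```python
# -*- coding: utf-8 -*-
import re

# Hypothetical stand-in for the real emoji list.
EMOJI_SET = {u'\U0001f60d', u'\U0001f602'}

# Match a high surrogate followed by a low surrogate as one unit;
# otherwise match any single character (DOTALL so newlines count too).
CODE_POINT_RE = re.compile(u'[\ud800-\udbff][\udc00-\udfff]|.', re.DOTALL)


def get_emojis_by_code_point(text):
    """Return every emoji in text, keeping surrogate pairs together."""
    return [unit for unit in CODE_POINT_RE.findall(text) if unit in EMOJI_SET]
```

On a build where strings are not split this way, the first alternative should simply never match and the pattern falls back to one character at a time, so the same function ought to behave the same on both kinds of build. Emojis made of several code points (flags, skin-tone modifiers, ZWJ sequences) are not handled by this sketch.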