17

I want to check that a string contains only one emoji, using Python 3. For example, an is_emoji function would check that the string consists of exactly one emoji.

def is_emoji(s):
    pass

is_emoji("") #True
is_emoji("◼️") #False

I tried to use regular expressions, but emojis don't have a fixed length. For example:

print(len("◼️".encode("utf-8"))) # 6 
print(len("".encode("utf-8"))) # 4
Mike Müller
Siyanew

2 Answers

35

You could try using the emoji package. It's primarily used to convert escape sequences into Unicode emoji, but as a result it contains an up-to-date list of emojis.

from emoji import UNICODE_EMOJI

def is_emoji(s):
    return s in UNICODE_EMOJI

There are complications, though, as sometimes two or more Unicode code points map to one printable glyph. For instance, a human emoji followed by an "emoji modifier fitzpatrick type" code point should take on that skin colour, and certain emoji separated by a "zero width joiner" should be treated like a single character.
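To make the code point issue concrete, here is a small sketch using only the standard library. The helper name `describe` and the three sample strings are chosen for illustration; they are not part of the emoji package.

```python
import unicodedata

def describe(s):
    """Return the Unicode name of every code point in s."""
    return [unicodedata.name(ch, "<unnamed>") for ch in s]

# Each sample renders as a single glyph but consists of several code points.
square = "\u25FC\uFE0F"                                 # square + variation selector
thumbs_up = "\U0001F44D\U0001F3FD"                      # thumbs up + skin tone modifier
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"   # man, woman, girl joined by ZWJ

for text in (square, thumbs_up, family):
    # number of code points, number of UTF-8 bytes, names of the code points
    print(len(text), len(text.encode("utf-8")), describe(text))
```

This is why neither `len(s) == 1` nor a fixed byte length is a reliable test for "one emoji".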

Dunes
  • 1
    This will check if the character is an emoji or not. Is there any way to check if a whole string contains an emoji? –  Nov 30 '20 at 15:26
  • 5
    Note that UNICODE_EMOJI has 4 keys representing supported languages. This means that instead of the return statement above, use `return s in UNICODE_EMOJI['en']` – jack1536 Apr 12 '21 at 07:51
  • 1
    To check if a whole string contains an emoji make a reg exp like 1. r = '|'.join(list(UNICODE_EMOJI['en'].keys())) 2. r = r.replace('|*', '|\*') 3. r = re.compile(r) – Nico Jun 16 '22 at 18:45
  • 2
    The `UNICODE_EMOJI` dictionary has been removed in 2.0. I would suggest using `bool(emoji.emoji_count(s))` – Matt Jan 27 '23 at 16:25
6

This works in Python 3:

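def is_emoji(s):
    emojis = "◼️" # add more emojis here
    count = 0
    # Iterating over a string yields one code point at a time
    for emoji in emojis:
        count += s.count(emoji)
        # More than one hit means more than one emoji
        if count > 1:
            return False
    return bool(count)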

Test:

>>> is_emoji("")
True
>>> is_emoji('◼')
True
>>> is_emoji("◼️")
False

Combine with Dunes' answer to avoid typing all emojis:

from emoji import UNICODE_EMOJI

def is_emoji(s):
    count = 0
    for emoji in UNICODE_EMOJI:
        count += s.count(emoji)
        if count > 1:
            return False
    return bool(count)

This is not terribly fast, because UNICODE_EMOJI contains nearly 1330 items and each one is searched for separately, but it works.
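If speed matters, one possible alternative is to compile all known emojis into a single regular expression once and reuse it. The sketch below uses a small hypothetical list, `KNOWN_EMOJIS`, as a stand-in for the package's full table, and the helper `count_emojis` is likewise made up for illustration.

```python
import re

# Hypothetical sample; in practice this would be the full emoji table.
KNOWN_EMOJIS = [
    "\u25FC",                  # black medium square
    "\u25FC\uFE0F",            # same square with variation selector
    "\U0001F44D",              # thumbs up
    "\U0001F44D\U0001F3FD",    # thumbs up with skin tone modifier
]

# Longest sequences first, so a multi-code-point emoji is matched as a whole
# rather than as its shorter prefix. re.escape protects characters such as "*".
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(KNOWN_EMOJIS, key=len, reverse=True))
)

def is_emoji(s):
    """True if the whole string is exactly one known emoji."""
    return _PATTERN.fullmatch(s) is not None

def count_emojis(s):
    """Number of known emojis found anywhere in the string."""
    return len(_PATTERN.findall(s))
```

The pattern is built once at import time, so each call is a single scan of the input instead of one `str.count` per emoji.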

Mike Müller