{
"emoji_pattern": "([\\u1F1E0-\\u1F1FF\\u1F300-\\u1F5FF\\u1F600-\\u1F64F\\u1F680-\\u1F6FF\\u1F700-\\u1F77F\\u1F780-\\u1F7FF\\u1F800-\\u1F8FF\\u1F900-\\u1F9FF\\u1FA00-\\u1FA6F\\u1FA70-\\u1FAFF\\u2702-\\u27B0])"
}
emoji_pattern = re.compile(self.config['emoji_pattern'])

I want to detect emoji in text, and I need to keep the pattern in a JSON config file. I don't know how JSON reads Unicode escapes for emoji.

  • Does this answer your question? [How to extract all the emojis from text?](https://stackoverflow.com/questions/43146528/how-to-extract-all-the-emojis-from-text) – Tranbi Jul 12 '23 at 06:57
  • By detecting emoji, do you mean just emoji that are intended to be rendered as emoji, or are you including emoji rendered as text? Do you want to capture complex emoji clusters as single units, or are you just interested in the individual codepoints? Do you want non-emoji characters like variation selectors, ZWJ, etc. included? – Andj Jul 15 '23 at 01:33

2 Answers

import re

def detect_emojis(text):
    # Character class covering common emoji blocks (not exhaustive).
    # The narrower flag ranges U+1F1F2-U+1F1F4 and U+1F1E6-U+1F1FF were
    # dropped because U+1F1E0-U+1F1FF already contains them.
    emoji_pattern = re.compile("["
        "\U0001F600-\U0001F64F"  # emoticons
        "\U0001F300-\U0001F5FF"  # symbols & pictographs
        "\U0001F680-\U0001F6FF"  # transport & map symbols
        "\U0001F1E0-\U0001F1FF"  # flags (iOS)
        "\U00002600-\U000027BF"  # miscellaneous symbols
        "\U0001F900-\U0001F9FF"  # supplemental symbols and pictographs
        "]+")  # re.UNICODE is the default for str patterns in Python 3
    # findall returns each run of consecutive matching characters
    return emoji_pattern.findall(text)

# Example usage
text = "I love  and "
emojis = detect_emojis(text)
print(emojis)  # Output: ['', '']
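
To load a pattern like this from JSON, as the question asks, one option is to store the escapes for Python's `re` module rather than for JSON. JSON's own `\uXXXX` escape takes exactly four hex digits, and `re` reads `\u` the same way, so `\\u1F1E0` in the config ends up parsed as U+1F1E followed by a literal `0`. Code points above U+FFFF need the eight-digit `\UXXXXXXXX` form, written with a doubled backslash in JSON so that `re` receives it intact. The sketch below assumes the config is inline text (`CONFIG_TEXT` is a hypothetical stand-in for the real file) and uses an illustrative, incomplete set of ranges:

```python
import json
import re

# Hypothetical config text; in practice this would be read from a .json file.
# The backslashes are doubled so that json.loads yields a literal "\U0001F600"
# in the string, which the re module then interprets as a single code point.
CONFIG_TEXT = r'''
{
  "emoji_pattern": "[\\U0001F1E0-\\U0001F1FF\\U0001F300-\\U0001F5FF\\U0001F600-\\U0001F64F\\U0001F680-\\U0001F6FF\\U0001F900-\\U0001F9FF\\u2600-\\u27BF]+"
}
'''

config = json.loads(CONFIG_TEXT)
emoji_pattern = re.compile(config["emoji_pattern"])


def detect_emojis(text):
    """Return every run of characters matched by the configured pattern."""
    return emoji_pattern.findall(text)


# Sample text built with escapes so the source stays ASCII-only.
sample = "I love \U0001F600 and \U0001F30D"
found = detect_emojis(sample)

# Print code points instead of glyphs to avoid terminal encoding issues.
print([f"U+{ord(ch):X}" for run in found for ch in run])  # ['U+1F600', 'U+1F30D']
```

With a real file, `json.load(open_file)` would replace `json.loads(CONFIG_TEXT)`; the escaping inside the file stays the same.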
Pythoner
  • It would be much better if you provide more supporting information to your answer. – user67275 Jul 12 '23 at 08:43
  • This approach would break complex emoji into individual components while losing other characters that form a complex emoji. It also excludes characters that could be emoji or text. – Andj Jul 15 '23 at 01:51
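
To illustrate the point raised in the comment above, here is a rough sketch of how a regex could keep multi-codepoint emoji together: flags as pairs of regional indicators, and sequences joined by ZWJ (U+200D) with optional skin-tone modifiers or the variation selector U+FE0F. The ranges and the names `BASE`, `ELEMENT`, and `CLUSTER` are assumptions made for this example and do not cover every emoji.

```python
import re

# Illustrative, incomplete set of base emoji ranges.
BASE = (
    "[\U0001F300-\U0001F5FF\U0001F600-\U0001F64F"
    "\U0001F680-\U0001F6FF\U0001F900-\U0001F9FF\u2600-\u27BF]"
)
MODIFIER = "[\U0001F3FB-\U0001F3FF]"  # skin tone modifiers
VS16 = "\uFE0F"                       # emoji presentation selector
ZWJ = "\u200D"                        # zero width joiner
FLAG = "[\U0001F1E6-\U0001F1FF]{2}"   # pair of regional indicators

# One emoji element: a base character plus an optional modifier or selector.
ELEMENT = f"{BASE}(?:{MODIFIER}|{VS16})?"
# A cluster: a flag, or elements chained together with ZWJ.
CLUSTER = re.compile(f"{FLAG}|{ELEMENT}(?:{ZWJ}{ELEMENT})*")

family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # man ZWJ woman ZWJ girl
flag = "\U0001F1EB\U0001F1F7"                          # regional indicators F + R
thumbs = "\U0001F44D\U0001F3FD"                        # thumbs up + skin tone

clusters = CLUSTER.findall(f"a {family} b {flag} c {thumbs}")

# Each cluster is returned whole; print the code point count of each one.
print([len(c) for c in clusters])  # [5, 2, 2]
```

Because the ranges are hand-picked, this only approximates real emoji segmentation; a maintained data source is a safer choice when full coverage matters.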

There's a module available on PyPI called emoji which you could consider installing. You can use it to identify any emojis in your string and then remove or replace them using the replace_emoji() function.

e.g.,

import emoji

s = 'Smile  and earth '

print(emoji.replace_emoji(s, ''))

Output:

Smile  and earth
DarkKnight