How to extract all the emojis from text?

Question

Consider the following list:

a_list = ['  me así, bla es se  ds ']

How can I extract all the emojis inside a_list into a new list?

new_lis = ['     ']

I tried to use a regex, but I do not have all the possible emoji encodings.

Linking in http://stackoverflow.com/q/26568722/674039 and http://stackoverflow.com/q/35404144/674039 — wim, Mar 31 '17 at 18:08

score 93 · Accepted Answer · edited Jun 08 '21 at 13:25

You can use the emoji library. You can check if a single codepoint is an emoji codepoint by checking if it is contained in emoji.UNICODE_EMOJI.

import emoji

def extract_emojis(s):
  return ''.join(c for c in s if c in emoji.UNICODE_EMOJI['en'])
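
Later comments on this answer report that `emoji.UNICODE_EMOJI` was nested by language from v1.2.0 and replaced by `emoji.EMOJI_DATA` in v2.0.0. A minimal sketch of a lookup that tolerates all three layouts follows. The helper `emoji_table` and the `SimpleNamespace` stand-ins are hypothetical names used only so the example runs without the library installed.

```python
from types import SimpleNamespace


def emoji_table(emoji_module):
    """Pick the emoji lookup table exposed by the given module object."""
    if hasattr(emoji_module, "EMOJI_DATA"):      # layout reported for emoji >= 2.0.0
        return emoji_module.EMOJI_DATA
    table = emoji_module.UNICODE_EMOJI           # layout of older releases
    return table.get("en", table)                # >= 1.2.0 nests the table under a language key


def extract_emojis(s, table):
    """Keep only the code points that appear as keys in the table."""
    return "".join(c for c in s if c in table)


# Stand-in objects (assumptions, not the real library) mimicking two layouts.
new_style = SimpleNamespace(EMOJI_DATA={"\U0001F914": {}, "\U0001F648": {}})
old_style = SimpleNamespace(UNICODE_EMOJI={"en": {"\U0001F914": ":thinking_face:"}})

print(extract_emojis("\U0001F914 me as\u00ed \U0001F648", emoji_table(new_style)))
print(extract_emojis("\U0001F914 me as\u00ed \U0001F648", emoji_table(old_style)))
```

With the real library installed, the intended call would be `extract_emojis(text, emoji_table(emoji))` after `import emoji`. Like the original, this still checks one code point at a time.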

edited Jun 08 '21 at 13:25 by Matteo · answered Mar 31 '17 at 17:39 by Pedro Castilho

You can download the list of emoji in string/int format present in **#EmojiCodeSheet** [here](https://github.com/shanraisshan/EmojiCodeSheet), for custom comparator. – shanraisshan Apr 06 '17 at 06:33
your code cannot detect flags in the text : extract_emojis(" ") – Nomiluks Mar 14 '18 at 11:30
@NomanDilawar that is because my code iterates over every character. Unicode flags are a combination of two "regional indicator" characters which are not, individually, emoji. If you want to detect Unicode flags you'll need to check pairs of characters. – Pedro Castilho Mar 16 '18 at 21:21
@Nomiluks I had to filter it either per language or do a recursive dictionary search. '' in emoji.UNICODE_EMOJI['en'] – msarafzadeh Feb 25 '21 at 10:59
Doesn't work in Python 3.6? I get an empty string. – Jesse Aldridge Mar 19 '21 at 00:23
The answer has been updated to include ['en']. It should work again now. – Matteo Jun 08 '21 at 15:02
`emoji.UNICODE_EMOJI` has changed; I got an error while doing the same task. One could use `emoji.distinct_emoji_list(test)`, where `test` is the string. – Aminur Rahman Ashik Jul 31 '22 at 08:48
`AttributeError: module 'emoji' has no attribute 'UNICODE_EMOJI' ` – Umair Ayub Sep 25 '22 at 09:39
`EMOJIS = emj.UNICODE_EMOJI["en"]` no longer works in version 2.0.0. It needs to be updated to `EMOJIS = emj.EMOJI_DATA`. See [reference](https://carpedm20.github.io/emoji/docs/api.html). Even when I update the function accordingly, it returns an empty list. Not sure why this solution has the most votes. – Simone Jul 25 '23 at 09:08
If the text containing emojis is stored in a dataframe and one wants to add another column to the DF with only the emojis I wrote the following code based on [this solution](https://stackoverflow.com/questions/63762570/extract-emoji-from-series-of-text) ```def extract_emojis(text): return ''.join(c for c in text if c in EMOJIS)``` and then ```test['emoji'] = test['text'].apply(extract_emojis)``` --> deliberately not adding another solution – Simone Jul 25 '23 at 09:48

score 47 · Answer 2 · edited Jun 30 '21 at 15:55

I think it's important to point out that the previous answers won't work with emojis like ‍‍‍ , because it consists of 4 emojis, and using ... in emoji.UNICODE_EMOJI will return 4 different emojis. Same for emojis with skin color like .

My solution

Include the emoji and regex modules. The regex module supports recognizing grapheme clusters (sequences of Unicode codepoints rendered as a single character), so we can count emojis like ‍‍‍

import emoji
import regex

def split_count(text):

    emoji_list = []
    data = regex.findall(r'\X', text)
    for word in data:
        if any(char in emoji.UNICODE_EMOJI['en'] for char in word):
            emoji_list.append(word)
    
    return emoji_list
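
If installing `regex` is not an option, the idea of keeping joined sequences together can be approximated with the standard `re` module. The sketch below is not a grapheme-cluster implementation: it assumes emoji live in a few hand-picked ranges (the ranges and the name `EMOJI_SEQUENCE` are illustrative assumptions) and glues together a base character, an optional skin-tone modifier or variation selector, and any further parts chained with a zero-width joiner (U+200D).

```python
import re

BASE = "[\U0001F300-\U0001FAFF\u2600-\u27BF]"      # assumed, incomplete emoji ranges
MODIFIER = "[\U0001F3FB-\U0001F3FF\uFE0F]?"        # skin tone or variation selector
EMOJI_SEQUENCE = re.compile(
    f"{BASE}{MODIFIER}(?:\u200D{BASE}{MODIFIER})*"  # parts chained with ZWJ
)

# A four-person family (ZWJ sequence) and a thumbs up with a skin tone
family = "\U0001F468\u200D\U0001F469\u200D\U0001F466\u200D\U0001F466"
thumbs_up_medium = "\U0001F44D\U0001F3FD"
text = f"hello {family} emoji {thumbs_up_medium} today"

found = EMOJI_SEQUENCE.findall(text)
print(len(found))  # 2
```

Each multi-code-point emoji comes back as one list entry instead of being split into its parts. Anything outside the assumed ranges is missed, which is the trade-off against `\X` from `regex`.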

Testing

with more emojis with skin color:

line = ["  me así, se  ds  hello ‍ emoji hello ‍‍‍ how are  you today"]

counter = split_count(line[0])
print(' '.join(emoji for emoji in counter))

output:

      ‍ ‍‍‍

Include flags

If you want to include flags, the Unicode range of the regional indicator symbols is U+1F1E6 to U+1F1FF, so add:

flags = regex.findall(u'[\U0001F1E6-\U0001F1FF]', text)

to the function above, and return emoji_list + flags.

See this answer to "A python regex that matches the regional indicator character class" for more information about the flags.
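
The single-character class above returns each regional indicator separately, so one flag shows up as two list entries. A small variation that pairs them up, using only the standard `re` module, could look like this (the helper name `extract_flags` is an assumption for illustration):

```python
import re

# A flag is a pair of regional indicator symbols (U+1F1E6 to U+1F1FF).
FLAG = re.compile("[\U0001F1E6-\U0001F1FF]{2}")


def extract_flags(text):
    """Return each flag as one two-code-point string."""
    return FLAG.findall(text)


s = "Hello \U0001F1F7\U0001F1FA hello \U0001F1E9\U0001F1EA"
flags = extract_flags(s)
print(len(flags), [len(f) for f in flags])  # 2 [2, 2]
```

Note that this accepts any pair of regional indicators, whether or not the pair corresponds to an actual flag.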

For newer `emoji` versions

To work with emoji >= v1.2.0, you have to add a language specifier (e.g. en, as in the code above):

emoji.UNICODE_EMOJI['en']

edited Jun 30 '21 at 15:55 by hc_dev · answered Mar 12 '18 at 19:05 by sheldonzy

Your code is working good, but how can we handle flags? " " – Nomiluks Mar 14 '18 at 11:31
@NomanDilawar Hi, sorry for the delay. I edited my answer. I ran some tests and it seems to work fine now. – sheldonzy Mar 23 '18 at 12:46
`UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal if any(char in emoji.UNICODE_EMOJI for char in word):` is what I am getting. – kingmakerking May 31 '18 at 13:40
This is the only solution that I found to work comprehensively for all emojis I've encountered so far. – Paulo Malvar Apr 11 '19 at 20:58
You can replace `print(' '.join(emoji for emoji in counter))` with `print(' '.join(counter))`. Does the same thing. – Amir Shabani Jan 06 '20 at 17:04
Also, I think it's better to write `for grapheme in data:` instead of `for word in data:` as it reflects the purpose of `\X` better. – Amir Shabani Jan 06 '20 at 17:16
As of emoji v.1.2.0, the check must include a language specifier, e.g. `any(char in emoji.UNICODE_EMOJI["en"] for char in grapheme)` – Alex Feb 18 '21 at 10:28
`emoji.UNICODE_EMOJI['en']` has been deprecated for emoji >= 2.0.0. Instead compare using `word in emoji.EMOJI_DATA`. – Destaq Jan 28 '23 at 22:55

score 13 · Answer 3 · edited Feb 02 '21 at 17:15

import emojis
new_list = emojis.get('  me así, bla es se  ds ')
print(new_list)

output>>>{'', '', '', '', '', ''}

edited Feb 02 '21 at 17:15 by yudhiesh · answered Sep 15 '20 at 20:52 by Ganesh

ModuleNotFoundError: No module named 'emojis' – aswzen Sep 28 '22 at 19:45
@aswzen it worked for me though. – Skapis9999 Jan 24 '23 at 17:24
mine need to call pip install emojis first – aswzen Jan 25 '23 at 07:36
neat solution. With the data I have it ```emojis.get``` doesn't recognise all emojis, but ```emoji.demojize``` probably does. Did a cross-validation for the emoji recognition with ```advertools.extract_emoji```. [Advertools](https://advertools.readthedocs.io/en/master/) recognises less emojis than [emoji](https://carpedm20.github.io/emoji/docs/index.html) – Simone Jul 25 '23 at 08:52
Checking the [reference](https://carpedm20.github.io/emoji/docs/api.html) it seems that ```emoji.get()``` is deprecated. – Simone Jul 25 '23 at 09:09

Mazdak · Answer 4 · 2017-04-01T05:47:47.083

11

If you don't want to use an external library, you can use re.findall() with a suitable regular expression to find the emojis:

In [74]: import re
In [75]: re.findall(r'[^\w\s,]', a_list[0])
Out[75]: ['', '', '', '', '', '']

The regular expression r'[^\w\s,]' is a negated character class that matches any character that is not a word character, whitespace or comma.
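
To see what the negated class actually captures, here is a short check on a made-up sentence (the sample text is an assumption, not the asker's data):

```python
import re

# Sample text with ordinary punctuation next to two emoji
text = "Wow! \U0001F914 me as\u00ed, bla (es) \U0001F648"

# Everything that is not a word character, whitespace, or a comma
matches = re.findall(r"[^\w\s,]", text)
print(matches)
```

The result contains the exclamation mark and both parentheses together with the two emoji, so the pattern only isolates emoji when the text has no punctuation other than commas.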

As I mentioned in a comment, text generally contains word characters and punctuation, which this approach handles easily; for other cases, you can add the characters to the character class manually. Since a character class can contain ranges of characters, the pattern can be made shorter and more flexible.

Another solution is to use, instead of a negated character class that excludes the non-emoji characters, a character class that accepts emojis ([] without ^). Since there are many emojis with different Unicode values, you need to add their ranges to the character class. If you want to match more emojis, this reference lists the standard emojis with their respective ranges: http://apps.timwhitlock.info/emoji/tables/unicode
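
A minimal sketch of such a positive character class, assuming a handful of common Unicode blocks. The list is deliberately incomplete, and `EMOJI_RANGES` is a name chosen here for illustration:

```python
import re

EMOJI_RANGES = re.compile(
    "["
    "\U0001F300-\U0001F5FF"   # Miscellaneous Symbols and Pictographs
    "\U0001F600-\U0001F64F"   # Emoticons
    "\U0001F680-\U0001F6FF"   # Transport and Map Symbols
    "\u2600-\u26FF"           # Miscellaneous Symbols
    "\u2700-\u27BF"           # Dingbats
    "]"
)

text = "Wow! \U0001F648 me as\u00ed, bla (es) \U0001F459 \U0001F914"
hits = EMOJI_RANGES.findall(text)
print(hits)
```

Punctuation is no longer returned, but U+1F914 at the end of the sample is missed because it lies in a block that is not listed. Every additional block has to be added by hand.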

edited Apr 01 '17 at 05:47 · answered Mar 31 '17 at 18:20 by Mazdak

That works for this particular input, but there are plenty of other non-emoji characters that don't fall under the categories of `\w`, `\s`, or comma. – user2357112 Apr 01 '17 at 04:18
@user2357112 A text is generally contain word characters and punctuation which will be easily dealt with by this approach, for other cases you can just add them to the character class manually.. Note that since you can specify a range of characters in character class you can even make it shorter and more flexible. – Mazdak Apr 01 '17 at 05:04
Your regex fails on all non-comma punctuation, among other things. – user2357112 Apr 01 '17 at 05:11
@user2357112 Well that's what I said. You can add them to the character class if you want. You don't have to include all the cases always, its relative and based on the text that you're dealing with. – Mazdak Apr 01 '17 at 05:13
Manually adding every non-emoji character from your text to your regex is a terrible, bloaty, error-prone solution. – user2357112 Apr 01 '17 at 05:36
@user2357112 Maybe, just in case that your text contains all of those characters. Nevertheless, just for the sake of completeness I updated the answer with another way which is using the range of emojies and character class instead of excluding non-emojies. – Mazdak Apr 01 '17 at 05:49
I just needed to do a quick search of a code base and the following got what I needed: [^\w\s,;"{}='!*:\./[\[\]\-\$#<>&@|\^`\+\?\\~%‘’£₵,€] Not scalable, but just in case anyone else finds it useful. – Henry Munro Feb 10 '22 at 14:15

score 7 · Answer 5 · answered Nov 01 '17 at 21:43

The top rated answer does not always work. For example flag emojis will not be found. Consider the string:

s = u'Hello \U0001f1f7\U0001f1fa hello'

What would work better is

import re
import emoji

# Strip embedded whitespace from each key, then build one alternation of all emoji
emojis_list = map(lambda x: ''.join(x.split()), emoji.UNICODE_EMOJI.keys())
r = re.compile('|'.join(re.escape(p) for p in emojis_list))
print(' '.join(r.findall(s)))
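
One detail worth checking with an alternation like this: `re` tries the alternatives from left to right, so a short emoji listed before a longer sequence that starts with it will win. A sketch with a tiny hand-written table (the three entries are stand-ins, not the library's data) shows the effect of sorting the alternatives longest first:

```python
import re

known = [
    "\U0001F44D",                 # thumbs up
    "\U0001F44D\U0001F3FD",       # thumbs up with a skin-tone modifier
    "\U0001F1F7\U0001F1FA",       # a flag (two regional indicators)
]


def build_pattern(emojis):
    """Join the escaped emoji into one alternation, in the order given."""
    return re.compile("|".join(re.escape(e) for e in emojis))


s = "Hello \U0001F1F7\U0001F1FA hello \U0001F44D\U0001F3FD"
unsorted_hits = build_pattern(known).findall(s)
sorted_hits = build_pattern(sorted(known, key=len, reverse=True)).findall(s)

print(unsorted_hits == ["\U0001F1F7\U0001F1FA", "\U0001F44D"])            # True
print(sorted_hits == ["\U0001F1F7\U0001F1FA", "\U0001F44D\U0001F3FD"])    # True
```

In the unsorted case the skin-tone modifier is dropped because the bare thumbs up matches first. Whether the library's keys happen to be in a safe order is not something this sketch assumes.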

score 5 · Answer 6 · answered May 25 '18 at 13:17

Step 1: Make sure that your text is decoded from UTF-8 (Python 2): text.decode('utf-8')

Step 2: To locate all the emoji in your text, separate the text character by character: [str for str in decode]

Step 3: Save all the emoji in a list: [c for c in allchars if c in emoji.UNICODE_EMOJI]. Full example below:

>>> import emoji
>>> text     = "  me así, bla es se  ds "
>>> decode   = text.decode('utf-8')
>>> allchars = [str for str in decode]
>>> list     = [c for c in allchars if c in emoji.UNICODE_EMOJI]
>>> print list
[u'\U0001f914', u'\U0001f648', u'\U0001f60c', u'\U0001f495', u'\U0001f46d', u'\U0001f459']

If you want to remove them from the text:

>>> filtred  = [str for str in decode.split() if not any(i in str for i in list)]
>>> clean_text = ' '.join(filtred)
>>> print clean_text
me así, bla es se ds
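
The session above is Python 2. In Python 3, `str` is already Unicode, so decoding is only needed when the input arrives as `bytes`. A rough Python 3 equivalent of the three steps plus the removal step follows, with the emoji table passed in as a parameter (the two-entry `table` is a stand-in for the library's table, and `extract_and_strip` is a hypothetical name):

```python
def extract_and_strip(raw, table):
    """Return (emoji found, text without the words that contain them)."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw   # step 1
    found = [c for c in text if c in table]                         # steps 2 and 3
    kept = [w for w in text.split() if not any(e in w for e in found)]
    return found, " ".join(kept)


table = {"\U0001F914", "\U0001F648"}          # stand-in emoji table
raw = "\U0001F914 \U0001F648 me as\u00ed, bla es se ds".encode("utf-8")
found, clean_text = extract_and_strip(raw, table)
print(found == ["\U0001F914", "\U0001F648"])      # True
print(clean_text == "me as\u00ed, bla es se ds")  # True
```

As in the original, the removal step drops every whitespace-separated word that contains an emoji, so text glued to an emoji is removed with it.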

Phani Rithvij · Answer 7 · 2019-10-31T11:37:23.573

Another way to do it using emoji is to use emoji.demojize and convert them into text representations of emojis.

For example, the grinning face emoji is converted to :grinning_face:.

Then find all :.*: patterns, and use emoji.emojize on those.

# -*- coding: utf-8 -*-
import emoji
import re

text = """
Of course, too many emoji characters \
 like , #@^!*&#@^#  helps  people read aaaaaa #douchebag
"""

text = emoji.demojize(text)
text = re.findall(r'(:[^:]*:)', text)
list_emoji = [emoji.emojize(x) for x in text]
print(list_emoji)

This might be a redundant way but it's an example of how emoji.emojize and emoji.demojize can be used.
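
One caveat with `:[^:]*:` is that any two colons in ordinary text also match. The sketch below works on an already demojized string, so it needs only `re`. The sample string and the narrower pattern (word characters and hyphens between the colons) are assumptions chosen for illustration:

```python
import re

# Pretend this is the output of demojize on text that also contains plain colons
text_de = "note: see :thinking_face: at 10:30 :see-no-evil_monkey:"

loose = re.findall(r":[^:]*:", text_de)     # anything between two colons
strict = re.findall(r":[\w\-]+:", text_de)  # only name-like content between colons

print(loose)   # [': see :', ': at 10:', ':see-no-evil_monkey:']
print(strict)  # [':thinking_face:', ':see-no-evil_monkey:']
```

The loose pattern pairs up unrelated colons and loses `:thinking_face:` entirely, while the narrower one recovers both names in this sample. Text such as `a:b:c` would still fool the narrower pattern.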

score 3 · Answer 8 · edited May 24 '18 at 21:13

The solution that gets exactly what tumbleweed asks for is a mix of the top-rated answer and user594836's answer. This is the code that works for me in Python 3.6.

import emoji
import re

test_list=['  me así,bla es,se  ds ']

## Create the function to extract the emojis
def extract_emojis(a_list):
    emojis_list = map(lambda x: ''.join(x.split()), emoji.UNICODE_EMOJI.keys())
    r = re.compile('|'.join(re.escape(p) for p in emojis_list))
    aux=[' '.join(r.findall(s)) for s in a_list]
    return(aux)

## Execute the function
extract_emojis(test_list)

## the output
['     ']

mohammad karami sheykhlan · Answer 9 · 2020-05-16T16:27:47.243

First of all you need to install this:

conda install -c conda-forge emoji

Now we can write the following code:

import emoji
import re
text= '  me así, bla es se  ds '
text_de= emoji.demojize(text)

If we print text_de, the output is:

':thinking_face: :see-no-evil_monkey: me así, bla es se :relieved_face: ds 
 :two_hearts::two_women_holding_hands::bikini:'

Now we can use regex to find emojis.

emojis_list_de= re.findall(r'(:[!_\-\w]+:)', text_de)
list_emoji= [emoji.emojize(x) for x in emojis_list_de]

If we print list_emoji, the output is:

['', '', '', '', '', '']

So, we can use the join method:

[''.join(list_emoji)]
Output: ['']

If you want to remove emojis, you can use the following code:

def remove_emoji(text):
   '''
   remove all of emojis from text
   -------------------------
   '''
   text=  emoji.demojize(text)
   text= re.sub(r'(:[!_\-\w]+:)', '', text)

   return text

score 2 · Answer 10 · edited Sep 17 '21 at 04:22

OK, I had this same problem and worked out a solution that doesn't require importing any libraries (like emoji or re) and is a single line of code. It returns the whitespace-separated words of the string that start with an emoji:

def extract_emojis(sentence):
    return [word for word in sentence.split() if str(word.encode('unicode-escape'))[2] == '\\' ]

This gave me a lightweight solution, and I hope it helps you all. What I actually needed was a function that filters out the emojis in a string, which is the same code with one minor change:

def filter_emojis(sentence):
    return [word for word in sentence.split() if str(word.encode('unicode-escape'))[2] != '\\' ]

Here is an example of it in action:

 >>> a = '  me así, bla es se  ds '
 >>> b = extract_emojis(a)
 >>> b
 ['', '', '', '']

Thank you! Out of all the responses on the page, this worked the best — Samuelf80, Feb 25 '21 at 14:08
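
For anyone wondering why the `[2]` index works: `str()` of the escaped bytes looks like `b'\\U0001f914'`, so position 2 is the first character after `b'`, and it is a backslash whenever the word starts with something that `unicode-escape` has to escape. That includes accented letters, not only emoji, as this small check suggests (the helper name and sample words are made up for the demonstration):

```python
def starts_with_escape(word):
    """True if the first character of word is escaped by unicode-escape."""
    return str(word.encode("unicode-escape"))[2] == "\\"


# An emoji, a plain word, an accented word, and an emoji glued to a letter
samples = ["\U0001F914", "hello", "\u00e9t\u00e9", "x\U0001F914"]
results = [starts_with_escape(w) for w in samples]
print(results)  # [True, False, True, False]
```

So the one-liner treats a word beginning with an accented letter as an emoji and overlooks an emoji that follows other characters in the same word.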

score 2 · Answer 11 · answered Jan 31 '19 at 01:21

from emoji import *

EMOJI_SET = set()

# populate EMOJI_SET (call once before using is_emoji)
def pop_emoji_dict():
    for emoji in UNICODE_EMOJI:
        EMOJI_SET.add(emoji)

# check if emoji
def is_emoji(s):
    for letter in s:
        if letter in EMOJI_SET:
            return True
    return False

This is a better solution when working with large datasets, since you don't have to loop through all emojis each time. I found that this gave me better results :)
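
A compact variant of the same idea, assuming the table is any iterable of single-character emoji (the two-entry `table` below is a stand-in so the sketch runs without the library, and `make_is_emoji` is a hypothetical name). Building a `frozenset` once gives constant-time membership tests on every later call:

```python
def make_is_emoji(table):
    """Build the lookup set once and return a checker that reuses it."""
    emoji_set = frozenset(table)

    def is_emoji(s):
        # True as soon as any character of s is a known emoji
        return any(ch in emoji_set for ch in s)

    return is_emoji


table = ["\U0001F914", "\U0001F648"]      # stand-in for the library's table
is_emoji = make_is_emoji(table)
print(is_emoji("bla \U0001F648"), is_emoji("bla"))  # True False
```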

carlsky · Answer 12 · 2023-07-27T01:13:37.650

2

Here's another option that uses emoji.get_emoji_regexp() and re:

import re
import emoji

# This works for `emoji` version <2.0
def extract_emojis(text):
    return re.findall(emoji.get_emoji_regexp(), text)

test_str = ' some  various  emojis ‍ and  flags ‍‍‍'
emojis = extract_emojis(test_str)

This yields:

['', '', '', '\u200d', '', '\u200d\u200d\u200d']

Or, to view the grapheme clusters:

print(' '.join(emoji for emoji in emojis))

Yields

   ‍  ‍‍‍

Newer `emoji` versions

For versions of emoji>=2.0.0, there's no need for re:

def extract_emojis(text):
    return [x.chars for x in emoji.analyze(text)]

edited Jul 27 '23 at 01:13 · answered May 30 '21 at 21:59 by carlsky

```emoji.get_emoji_regexp()``` no longer works with V2.0.0. See [manual](https://carpedm20.github.io/emoji/docs/index.html). One can try to downgrade to an older version, but then the emojis list from the package is probably going to be an older one too. – Simone Jul 25 '23 at 09:40

score 0 · Answer 13 · answered Mar 19 '19 at 09:29

This function expects a string, so the input list is converted to a string:

a_list = '  me así, bla es se  ds '

# Import the necessary modules
from nltk.tokenize import regexp_tokenize

# Tokenize and print only emoji
emoji = "[\U0001F300-\U0001F5FF\U0001F600-\U0001F64F\U0001F680-\U0001F6FF\u2600-\u26FF\u2700-\u27BF]"

print(regexp_tokenize(a_list, emoji)) 

output :['', '', '', '', '']

Scott Weaver · Answer 14 · 2021-10-27T05:33:32.880

If a library seems like overkill, try this regular expression. It works by matching the longest emojis first in one large alternation, and it parses all emojis, all skin tones, and all flags (v14.0).

# coding=utf8
import re
a_list = ['  me así, bla es se  ds ']
ret = re.findall(r'(?:‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍||||‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍|‍❤️‍‍|‍❤️‍‍|‍❤️‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍‍|‍‍|‍❤️‍|‍❤️‍|‍❤️‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|‍‍|️‍️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️
|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍⚕️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍⚖️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍✈️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️
|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍♂️|‍♂️|‍♂️|‍♂️|‍♂️|‍♀️|‍♀️|‍♀️|‍♀️|‍♀️|‍️|️‍♂️|️‍♀️|️‍♂️|️‍♀️|️‍♂️|️‍♀️|️‍|️‍⚧️|⛹‍♂️|⛹‍♂️|⛹‍♂️|⛹‍♂️|⛹‍♂️|⛹‍♀️|⛹‍♀️|⛹‍♀️|⛹‍♀️|⛹‍♀️|‍|‍|❤️‍|❤️‍|‍♂️|‍♀️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍♀️|‍♂️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍⚕️|‍⚕️|‍⚕️|‍|‍|‍|‍|‍|‍|‍⚖️|‍⚖️|‍⚖️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍✈️|‍✈️|‍✈️|‍|‍|‍|‍|‍|‍|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍|‍|‍|‍|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍|‍|‍|‍|‍|‍|‍|‍|‍|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|⛹️‍♂️|⛹️‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍♂️|‍♀️|‍|‍|‍|‍|‍|‍❄️|‍☠️|‍⬛|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||#️⃣|0️⃣|1️⃣|2️⃣|3️⃣|4️⃣|5️⃣|6️⃣|7️⃣|8️⃣|9️⃣|✋|✋|✋|✋|✋|✌|✌|✌|✌|✌|☝|☝|☝|☝|☝|✊|✊|✊|✊|✊|✍|✍|✍|✍|✍|⛹|⛹|⛹|⛹|⛹||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||☺|☹|☠|❣|❤|✋|✌|☝|✊|✍|⛷|⛹|☘|☕|⛰|⛪|⛩|⛲|⛺|♨|⛽|⚓|⛵|⛴|✈|⌛|⏳|⌚|⏰|⏱|⏲|☀|⭐|☁|⛅|⛈|☂|☔|⛱|⚡|❄|☃|⛄|☄|✨|⚽|⚾|⛳|⛸|♠|♥|♦|♣|♟|⛑|☎|⌨|✉|✏|✒|✂|⛏|⚒|⚔|⚙|⚖|⛓|⚗|⚰|⚱|♿|⚠|⛔|☢|☣|⬆|↗|➡|↘|⬇|↙|⬅|↖|↕|↔|↩|↪|⤴|⤵|⚛|✡|☸|☯|✝|☦|☪|☮|♈|♉|♊|♋|♌|♍|♎|♏|♐|♑|♒|♓|⛎|▶|⏩|⏭|⏯|◀|⏪|⏮|⏫|⏬|⏸|⏹|⏺|⏏|♀|♂|⚧|✖|➕|➖|➗|♾|‼|⁉|❓|❔|❕|❗|〰|⚕|♻|⚜|⭕|✅|☑|✔|❌|❎|➰|➿|〽|✳|✴|❇|©|®|™|ℹ|Ⓜ|㊗|㊙|⚫|⚪|⬛|⬜|◼|◻|◾|◽|▪|▫)', a_list[0])
print(ret)
#['', '', '', '', '', '']

score 0 · Answer 15 · answered May 13 '22 at 13:49

This builds on Mohammed Terry Jack's answer, which only works when each emoji is separated by a space. The modified version below removes that requirement:

def extract_emojis(sentence):
    return [ch for ch in sentence if str(ch.encode('unicode-escape'))[2] == '\\' ]

Expected result:

 >>> a = '  me así, bla es se  ds '
 >>> b = extract_emojis(a)
 >>> b
 ['', '', '', '', '', '']

score -5 · Answer 16 · answered Mar 31 '17 at 17:37

All the Unicode emojis with their respective code points are here. They are 1F600 to 1F64F, so you can just build all of them with a range-like iterator.
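
Taken literally, the suggestion amounts to the sketch below. It covers only the Emoticons block (U+1F600 to U+1F64F), so it misses most other emoji. The names `EMOTICONS` and `extract_emoticons` are illustrative.

```python
# Code points U+1F600 through U+1F64F inclusive (the Emoticons block)
EMOTICONS = {chr(cp) for cp in range(0x1F600, 0x1F64F + 1)}


def extract_emoticons(text):
    """Return the characters of text that fall in the Emoticons block."""
    return [c for c in text if c in EMOTICONS]


print(len(EMOTICONS))  # 80
print(extract_emoticons("\U0001F648 me \U0001F914") == ["\U0001F648"])  # True
```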

answered Mar 31 '17 at 17:37 by patrick

That's only one particular range of emoji. There are a lot more. – user2357112 Apr 01 '17 at 04:17
