How can I extract all the emojis from a text column, put them in a new column, and remove them from the original text?

For example, consider this data:

ID  Text
1   This is good
2   Loving you so much ❤️
3   You make me sad!

This is my anticipated output:

ID  Text                 Emoji
1   This is good
2   Loving you so much   ❤️
3   You make me sad!

So far, I have tried this solution, but it has not worked for me, as it does not remove the emoji from the original text.

Any help on how to do this would be great.

Thanks!

Henry Ecker

1 Answer

Something along the following lines should work for your purposes:

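import pandas as pd
import emoji as emj

# emoji >= 2.0 removed UNICODE_EMOJI; EMOJI_DATA is keyed by the emoji string.
# Sort longest first so the heart with its variation selector (U+2764 U+FE0F)
# is matched before the bare U+2764, which would leave a stray selector behind.
EMOJIS = sorted(emj.EMOJI_DATA, key=len, reverse=True)

df = pd.DataFrame(
    data={
        "text": [
            "This is good ",
            "Loving you so much  ❤️",
            "You make me sad! ",
        ]
    }
)

def extract_emoji(df):
    df["emoji"] = ""
    for index, text in df["text"].items():
        found = ""
        for emoji in EMOJIS:
            if emoji in text:
                text = text.replace(emoji, "")
                found += emoji
        # Write back with .at: rows yielded by iterrows() can be copies,
        # so assigning to them is not guaranteed to update df.
        df.at[index, "text"] = text
        df.at[index, "emoji"] = found

extract_emoji(df)
print(df.to_string())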
           text                  emoji
0      This is good               
1      Loving you so much  ️       ❤️
2      You make me sad!           

Note that extract_emoji modifies the DataFrame in place.
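
If you would rather not depend on the emoji package's data tables, a rough alternative is a regular expression over common emoji code-point ranges. The sketch below uses only the standard library. The ranges, the `split_emoji` helper, and the sample rows (including the emoji in row 1) are illustrative assumptions, and the ranges do not cover the full Unicode emoji set (keycaps and a few symbols such as © are missed).

```python
import re

# Approximate emoji ranges (an assumption, not the full Unicode emoji set):
# U+1F000-U+1FAFF covers most pictographs, flags and skin-tone modifiers;
# U+2600-U+27BF covers miscellaneous symbols and dingbats (e.g. the heart).
_BASE = "\U0001F000-\U0001FAFF\u2600-\u27BF"

# A run starts with a base character and may continue with more base
# characters, variation selectors (U+FE0F) and zero-width joiners (U+200D).
EMOJI_RUN = re.compile(f"[{_BASE}][{_BASE}\uFE0F\u200D]*")


def split_emoji(text):
    """Return (text_without_emoji, emoji_found) for a single string."""
    found = "".join(EMOJI_RUN.findall(text))
    cleaned = EMOJI_RUN.sub("", text).strip()
    return cleaned, found


# Hypothetical rows mirroring the question's layout.
rows = [
    {"ID": 1, "Text": "This is good \U0001F600"},
    {"ID": 2, "Text": "Loving you so much \u2764\uFE0F"},
    {"ID": 3, "Text": "You make me sad!"},
]
for row in rows:
    row["Text"], row["Emoji"] = split_emoji(row["Text"])
```

With a DataFrame, the same helper could be applied per row, for example with `df["text"].map(split_emoji)`, and the resulting tuples unpacked into the two columns. Note that `strip()` only trims the ends, so an emoji removed from the middle of a sentence leaves a double space.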

felipe
  • Wow, thank you so, so much for that! It works! I really appreciate it! –  Jun 10 '21 at 09:30
  • `EMOJIS = emj.UNICODE_EMOJI["en"]` no longer works as of [version 2.0.0](https://github.com/carpedm20/emoji/issues/221). It needs to be updated to `EMOJIS = emj.EMOJI_DATA`. The function deletes text from the original variable and does not extract the emojis. Will post a solution once found. [Possible solution here](https://stackoverflow.com/questions/43146528/how-to-extract-all-the-emojis-from-text) – Simone Jul 24 '23 at 16:11