I am trying to extract the emojis from each sentence and put them into a new column, but when I do, the new column is empty and the emojis are still in the sentence.
For reference, my dataset looks like this, but contains over 70,000 similar sentences:
Sentence |
---|
You look good |
Love you ❤️ |
I am so happy today |
So far, I have tried this method:
import pandas as pd
import emoji
df['emojis'] = df['Sentence'].apply(lambda row: ''.join(c for c in row if c in emoji.UNICODE_EMOJI))
df
And this method:
def extract_emojis(text):
    return ''.join(c for c in text if c in emoji.UNICODE_EMOJI)
df['emojis'] = df['Sentence'].apply(extract_emojis)
df
However, with either of them, the output looks like this:
Sentence | Emojis |
---|---|
You look good | |
Love you ❤️ | |
I am so happy today | |
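One thing that may matter for the per-character check: an emoji is not always a single character. The sketch below assumes the heart in my example is encoded as U+2764 followed by the variation selector U+FE0F (a common encoding, though I have not verified it for every row of my data), and simply lists the code points that a loop like `for c in row` would see.

```python
import unicodedata

# Assumption: the red heart is stored as U+2764 plus variation selector U+FE0F.
heart = "\u2764\ufe0f"

# A per-character loop sees each code point separately, not the emoji as a whole.
code_points = [(f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in heart]

print(len(heart))  # 2
for code, name in code_points:
    print(code, name)
```

If that assumption holds, any check that tests one character at a time can at best match part of the emoji.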
Instead, I want my output to look like this:
Sentence | Emojis |
---|---|
You look good | |
Love you | ❤️ |
I am so happy today | |
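To make the target concrete, here is a minimal sketch of the per-sentence behaviour I am after, using only the standard library. The names `EMOJI_PATTERN` and `split_emojis` are hypothetical, and the code-point ranges are my own rough approximation of what counts as an emoji, not a complete or authoritative list.

```python
import re

# Assumption: these ranges only approximate "emoji"; they are not the full
# Unicode emoji specification and may match a few plain symbols too.
EMOJI_PATTERN = re.compile(
    "(?:"
    "[\U0001F1E6-\U0001F1FF]"   # regional indicators (used for flags)
    "|[\U0001F300-\U0001FAFF]"  # pictographs, emoticons, transport, etc.
    "|[\u2600-\u27BF]"          # miscellaneous symbols and dingbats
    ")"
    "[\uFE0F\u200D\U0001F3FB-\U0001F3FF]*"  # trailing selector, joiner, skin tone
)

def split_emojis(text):
    """Return (sentence without emojis, emojis found) for one sentence."""
    emojis = "".join(EMOJI_PATTERN.findall(text))
    cleaned = EMOJI_PATTERN.sub("", text).strip()
    return cleaned, emojis

for sentence in ["You look good", "Love you \u2764\ufe0f", "I am so happy today"]:
    print(split_emojis(sentence))
```

For the three example rows this gives an empty emoji string for the first and last sentence, and `Love you` plus the heart for the second, which is the table above. On the real data I would expect something like `df['Sentence'].apply(split_emojis)` to produce one such pair per row, but I have not confirmed that it covers every emoji in my 70,000 sentences.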
I have also tried the method below, which is closer to what I want, because it is meant to remove the emojis from the sentence as well as collect them:
import pandas as pd
import emoji as emj
def extract_emoji(df):
    df["emoji"] = ""
    for index, row in df.iterrows():
        for emoji in EMOJIS:
            if emoji in row["Sentence"]:
                row["Sentence"] = row["Sentence"].replace(emoji, "")
                row["emoji"] += emoji
extract_emoji(df)
print(df.to_string())
However, with the method above, the code does not seem to finish executing. I suspect it cannot cope with the size of the dataset, since I have over 70,000 sentences that need their emojis extracted.
As you can see, I am nearly there, but not fully.
None of these three methods has fully worked for me, so I would appreciate some help.
In summary, I just want to extract the emojis from each sentence and add them to a new column, if this is possible.
Thank you very much.