
I'm trying to extract emojis from customer reviews into a separate column of a pandas DataFrame in Python. The code I'm using is:

def emojis(review):
    Content = review
    for i in Content:
        if i in emoji.UNICODE_EMOJI:
            return i
        else:
            return 'NaN'

The problem is that it doesn't pick up all emojis from Content. Any ideas how to solve this issue?

  • Please refer to this link https://stackoverflow.com/questions/43146528/how-to-extract-all-the-emojis-from-text – k.avinash Feb 07 '20 at 11:26

1 Answer


You need to use a list comprehension.

In this line:

for i in Content:
    if i in emoji.UNICODE_EMOJI:

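If Content is a collection of reviews (for example a DataFrame column), this check is always False, because each i is then an entire review rather than a single character. If Content is a single string, the function returns on the very first character, so it never looks at the rest of the review.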
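If Content is one review string rather than a collection, the loop in the question also exits on its first iteration, because both branches return. Here is a minimal sketch of that behaviour. The name first_char_only and the small EMOJI set are hypothetical stand-ins for the question's function and for emoji.UNICODE_EMOJI, so the example runs without the emoji package:

```python
# Hypothetical stand-in for emoji.UNICODE_EMOJI, so the sketch runs without the package.
EMOJI = {"\U0001F600", "\U0001F44D"}  # grinning face, thumbs up


def first_char_only(content):
    # Same control flow as the function in the question.
    for ch in content:
        if ch in EMOJI:
            return ch
        else:
            # Reached as soon as the first character is not an emoji,
            # so later characters are never inspected.
            return 'NaN'


leading = first_char_only("\U0001F600 great")   # emoji is the first character
trailing = first_char_only("great \U0001F600")  # emoji comes later in the text
```

Only the review that starts with an emoji gets it back. The other one yields 'NaN' even though it contains an emoji.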

So what you need is something like:

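import emoji
import numpy as np

# Content is assumed to be an iterable of review strings (e.g. a DataFrame column).
# emoji.UNICODE_EMOJI is what the question uses; it may not exist in newer emoji releases.
emoji_list = []
for review in Content:
    # Collect every character of this review that is an emoji.
    found = [c for c in review if c in emoji.UNICODE_EMOJI]
    # One entry per review: NaN when no emoji was found, otherwise the list.
    emoji_list.append(found if found else np.nan)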

Your question doesn't include sample data, so treat the code above as a sketch of the logic rather than a tested solution.

A list is used because you want one entry per row of your DataFrame.

Then assign it as a new column: df['Emoji entry on that content'] = emoji_list
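If the emoji package isn't available, or its lookup table differs between versions, the same per-review logic can be sketched with only the standard library. EMOJI_PATTERN and extract_emojis below are hypothetical names, and the code-point ranges are a rough approximation chosen for illustration, not a complete definition of what counts as an emoji:

```python
import re

# Rough approximation of "emoji": a couple of common code-point blocks.
# This is an assumption for illustration; it misses flags, keycaps and
# multi-code-point sequences, and matches some non-emoji symbols.
EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\u2600-\u27BF"          # miscellaneous symbols and dingbats
    "]"
)


def extract_emojis(review):
    """Return every emoji-like character in one review, or None if there is none."""
    found = EMOJI_PATTERN.findall(review)
    return found if found else None


reviews = [
    "Great product \U0001F600\U0001F44D",  # grinning face, thumbs up
    "Not good",                            # no emoji
    "Love it \u2764",                      # heavy black heart
]

# One entry per review, ready to be assigned as a DataFrame column.
emoji_column = [extract_emojis(r) for r in reviews]
```

With a DataFrame, the same function could be applied per row, for example df['emojis'] = df['review'].apply(extract_emojis), where both column names are placeholders.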

Joe