1

I would like to replace every row that contains only emojis, such as df['Comments'][2], with NaN.

df['Comments'][:6]
0                                                          nice
1                                                       Insane3
2                                                          ❤️
3                                                @bertelsen1986
4                       20 or 30 mm rise on the Renthal Fatbar?
5                                     Luckily I have one to 

The following code doesn't return the output I expect:

df['Comments'].replace(';', ':', '!', '*', np.NaN)

Expected Output:

df['Comments'][:6]
0                                                          nice
1                                                       Insane3
2                                                          nan
3                                                @bertelsen1986
4                       20 or 30 mm rise on the Renthal Fatbar?
5                                     Luckily I have one to 
Soumendra Mishra
Luc

2 Answers

0

You can detect comments containing only emojis by iterating over the Unicode characters in each one, using the emoji package and the standard-library unicodedata module:

df = {}
df['Comments'] = ["Test", "Hello ", ""]

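import unicodedata
import numpy as np
from emoji import EMOJI_DATA  # emoji >= 2.0; older releases exposed UNICODE_EMOJI instead

for i in range(len(df['Comments'])):
    # Assume the comment is pure emoji until a non-emoji character turns up.
    pure_emoji = True
    for unicode_char in unicodedata.normalize('NFC', df['Comments'][i]):
        if unicode_char not in EMOJI_DATA:
            pure_emoji = False
            break
    if pure_emoji:
        # Write the result back in place; the np.NaN alias was removed in NumPy 2.0.
        df['Comments'][i] = np.nan
print(df['Comments'])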
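
One caveat with the loop above: it checks one code point at a time, so an emoji made of several code points (the heart in the question is U+2764 followed by the variation selector U+FE0F) may not be recognised, and an empty string is treated as "pure emoji". The following is only a heuristic sketch using the standard library, not the method from this answer: it treats any character in Unicode category So (Symbol, other) as emoji-like and skips joiners, variation selectors and skin-tone modifiers. The names GLUE, is_glue and is_emoji_only are made up for illustration, and the heuristic will also flag non-emoji symbols such as ©.

```python
import unicodedata

# Characters that glue emoji sequences together but are not emoji themselves:
# zero width joiner, variation selector 16, variation selector 15.
GLUE = {"\u200d", "\ufe0f", "\ufe0e"}


def is_glue(ch):
    """True for joiners, variation selectors, skin-tone modifiers and whitespace."""
    return ch in GLUE or "\U0001F3FB" <= ch <= "\U0001F3FF" or ch.isspace()


def is_emoji_only(text):
    """Heuristic: True if text holds at least one 'So' symbol and nothing but symbols and glue."""
    seen_symbol = False
    for ch in unicodedata.normalize("NFC", text):
        if is_glue(ch):
            continue
        if unicodedata.category(ch) == "So":
            seen_symbol = True
        else:
            return False  # a letter, digit or punctuation mark: not emoji-only
    return seen_symbol


comments = [
    "nice",
    "Insane3",
    "\u2764\ufe0f",  # red heart with variation selector, as in the question
    "@bertelsen1986",
    "Luckily I have one \U0001F600",  # text followed by a grinning face
    "",
]

# Replace emoji-only entries with NaN and keep everything else, including "".
cleaned = [float("nan") if is_emoji_only(c) else c for c in comments]
print(cleaned)
```

With a pandas column, the same helper could be passed to df['Comments'].apply and the result assigned back to the column.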
shredEngineer
  • Sorry, but it doesn't have any effect on my code. Thanks for your effort! – Luc Aug 30 '20 at 11:18
  • You are right, it didn't write the result back correctly. I modified the example so that everything is written back into the df array, and I verified that it works. Please try it again now and consider marking my answer as the solution, as I posted it first. – shredEngineer Aug 30 '20 at 11:21
0

This uses the remove_emoji approach from https://stackoverflow.com/a/61839832/6075699.

First install the emoji library with pip install emoji, then try:

import re
import emoji
import numpy as np

df.Comments.apply(lambda x: x if (re.sub(r'(:[!_\-\w]+:)', '', emoji.demojize(x)) != "") else np.nan)
0                         nice
1                      Insane3
2                          NaN
3               @bertelsen1986
4    Luckily I have one to 
Name: a, dtype: object
Dishin H Goyani