How can I remove or separate the emoji (encoded as UTF-8) from a text like the one shown below?

Specifically, how can I remove "\xf0\x9f\x91\x8d\xf0\x9f\x8f\xbd" from my text?

text="b'That new one I\xe2\x80\x99m Ikorodu is a masterpiece.Thanks for beautifying the landscape. \xf0\x9f\x91\x8d\xf0\x9f\x8f\xbdUnlike @jpoy that build banks like Prisons where human organs are harvested.'"

1 Answer

One way to do that is to list the character sequences you want to remove, loop through them, and call the string's "replace" method on each one.

text="b'That new one I\xe2\x80\x99m Ikorodu is a masterpiece.Thanks for beautifying the landscape. \xf0\x9f\x91\x8d\xf0\x9f\x8f\xbdUnlike @jpoy that build banks like Prisons where human organs are harvested.'"
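# List each emoji's full byte sequence (thumbs-up and its skin-tone modifier).
# Matching whole sequences leaves the apostrophe (\xe2\x80\x99) and the
# letters around it untouched.
bad_chars = ['\xf0\x9f\x91\x8d', '\xf0\x9f\x8f\xbd']
for seq in bad_chars:
    text = text.replace(seq, '')
print(text)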

Reference: https://www.geeksforgeeks.org/python-removing-unwanted-characters-from-string/
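
Listing every sequence by hand does not scale to a whole column of text, so here is a minimal sketch of a more general approach. It assumes each cell holds the str() of a bytes object with literal backslash escapes (which is how "b'...\xf0\x9f...'" would look inside a CSV file), and it treats "emoji" as any character above U+FFFF. That range covers the thumbs-up and skin-tone modifier in the question but not every emoji. The names strip_emoji and NON_BMP are made up for this example.

```python
import ast
import re

# Assumption: "emoji" means any code point outside the Basic Multilingual
# Plane (U+10000 and above). Some emoji below that range are not covered.
NON_BMP = re.compile(r'[\U00010000-\U0010FFFF]')

def strip_emoji(cell: str) -> str:
    """Decode a cell that looks like "b'...'" and drop non-BMP characters."""
    raw = ast.literal_eval(cell)    # turn the "b'...'" text back into real bytes
    decoded = raw.decode('utf-8')   # bytes -> str; each emoji becomes one character
    return NON_BMP.sub('', decoded)

# Raw string so the backslashes stay literal, as they would in a CSV cell.
sample = r"b'Thanks for beautifying the landscape. \xf0\x9f\x91\x8d\xf0\x9f\x8f\xbdUnlike @jpoy'"
print(strip_emoji(sample))  # Thanks for beautifying the landscape. Unlike @jpoy
```

The same function could be applied to each row of the text column. Decoding first also turns \xe2\x80\x99 back into a proper apostrophe instead of leaving stray bytes behind. If the string was typed into Python exactly as in the question (a non-raw literal), the escapes are already characters rather than backslashes, and text.encode('latin-1').decode('utf-8') would be the decoding step instead of ast.literal_eval.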

J.K
  • Thanks, but I have a CSV file with a text column, and there are a lot of these bad characters in each row. They are the result of encoding my emojis as UTF-8. Can I remove all the byte-encoded emojis from the CSV file? – Oluwatobi Shoyinka Aug 30 '19 at 17:24
  • If you can upload part of the original CSV here (as a CSV file), I can help you better. – J.K Aug 30 '19 at 17:28
  • Thanks, brother. Kindly drop your email address so I can send it to you. – Oluwatobi Shoyinka Aug 31 '19 at 16:31