
I am trying to remove emojis from customers' reviews data in R. Emojis appear in this format <U+0001F603>.

For example, this is how a review appears in the dataset: "It's mind-blowing! <U+0001F603>" And I want to remove the <U+0001F603>.

I have tried gsub and iconv, but neither worked.

I really appreciate any help you can provide.

SirMe
  • Does this answer your question? [remove emoji from string in R](https://stackoverflow.com/questions/38215590/remove-emoji-from-string-in-r) – Juan Bosco Jun 02 '22 at 18:27

1 Answer


It depends a bit on what exactly your strings look like.

In your case, a plain regex may work. Replacing the emoji with a space may be preferable to just removing it; otherwise you risk ending up with two words merged into one.

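# Match only "<U+...>" escapes; a greedy '<U.*>' would also swallow any text between two emojis
stringr::str_replace_all(string = "It's mind-blowing! <U+0001F603>",
                         pattern = '<U\\+[0-9A-Fa-f]+>',
                         replacement = " ")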

You may want to add stringr::str_squish() to drop redundant spaces.
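
Putting the two steps together, here is a minimal sketch on a small vector of reviews. It assumes the emojis are stored as the literal characters `<U+0001F603>` (as shown in the question) rather than as real Unicode emoji; the `reviews` vector and the `emoji_pattern` name are made up for illustration.

```r
library(stringr)

# Hypothetical sample data; the second review has two emoji escapes
reviews <- c("It's mind-blowing! <U+0001F603>",
             "Great <U+0001F44D> value <U+0001F603> overall")

# Literal "<U+", then 4 to 8 hex digits, then ">"
emoji_pattern <- "<U\\+[0-9A-Fa-f]{4,8}>"

# Replace each escape with a space, then collapse and trim the extra spaces
cleaned <- str_squish(str_replace_all(reviews, emoji_pattern, " "))
print(cleaned)

# Base R equivalent, in case stringr is not available
cleaned_base <- trimws(gsub("\\s+", " ", gsub(emoji_pattern, " ", reviews)))
```

If the column contains actual emoji characters instead of these escapes, this pattern will not match them and a different approach would be needed.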

giocomai