
I have compiled a list of emoticons that I want to look for in a text. For example, the list of emoticons could be:

:)
:-(
):
:S
o_O
=D

And the text can be quite "difficult", that is, not all emoticons are separated by spaces:

text:S text=D. text :-(. text o_O text :)

How do I go about replacing these smilies with another string? I have tried some rather simple uses of gsub():

emoticons <- c(":)",":-(","):",":S","o_O","=D")
texts <- "text:S text=D. text :-(. text o_O text :)"

for(x in 1:length(emoticons)) 
  texts2 <- gsub(emoticons[x], " XXX ", texts, fixed = TRUE)

But this doesn't go all the way: it only replaces some of the emoticons.

Joshua
    Possible duplicate of [Regex matching emoticons](http://stackoverflow.com/questions/28077049/regex-matching-emoticons) – Balakrishnan Nov 19 '15 at 19:17

1 Answer


Try escaping the regex metacharacters in your emoticon patterns with backslashes so that they are matched literally. Then paste the patterns together with | into a single alternation for the regex search:

emoticons <- c(":\\)",":-\\(","\\):",":S","o_O","=D")
gsub(paste0(emoticons, collapse="|"), " XXX ", texts)
#[1] "text XXX  text XXX . text  XXX . text  XXX  text  XXX "
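
As a side note on the loop in the question: every iteration reads from the original texts and overwrites texts2, so only the replacement from the last iteration survives. A minimal sketch of a loop that keeps fixed = TRUE and carries the result forward might look like this (the variable names are taken from the question; the assumption is that no emoticon is a substring of another, otherwise the order of the vector matters):

```r
emoticons <- c(":)", ":-(", "):", ":S", "o_O", "=D")
texts <- "text:S text=D. text :-(. text o_O text :)"

# Start from the original text and feed each result into the next pass,
# instead of re-reading the untouched `texts` on every iteration.
texts2 <- texts
for (e in emoticons) {
  # fixed = TRUE treats the emoticon as a literal string, so no escaping is needed
  texts2 <- gsub(e, " XXX ", texts2, fixed = TRUE)
}
texts2
```

This makes one pass over the text per emoticon, whereas the pasted pattern above makes a single pass, which is likely preferable for a large emoticon list.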
Pierre L
    Thanks! This, in addition to the question below (which let me escape all special characters in my large list of emoticons) did the trick. http://stackoverflow.com/questions/14836754/is-there-an-r-function-to-escape-a-string-for-regex-characters – Joshua Nov 19 '15 at 20:47
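
The escaping step mentioned in the comment above can be sketched roughly as follows. Note that escape_regex is a hypothetical helper name used here for illustration, not a base R function, and the set of characters it escapes is an assumption about what a typical emoticon list needs:

```r
# Hypothetical helper: put a backslash in front of every regex metacharacter.
# perl = TRUE is used so the character class is parsed with PCRE rules.
escape_regex <- function(x) {
  gsub("([.|()\\\\^{}+$*?\\[\\]])", "\\\\\\1", x, perl = TRUE)
}

emoticons <- c(":)", ":-(", "):", ":S", "o_O", "=D")
texts <- "text:S text=D. text :-(. text o_O text :)"

# Escape each emoticon, then join them into one alternation pattern.
pattern <- paste(escape_regex(emoticons), collapse = "|")
result <- gsub(pattern, " XXX ", texts)
result
```

For this list the helper produces the same escaped patterns that were written by hand in the answer, so the replacement result should match the output shown there.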