0

I would like to remove the word "amp" from the sentence below.

original:

x <- 'come on amp this just encourages the already rampant mispronunciation of phuket'

What I want:

x <- 'come on this just encourages the already rampant mispronunciation of phuket'

However, if I use gsub, the "amp" inside the word "rampant" is removed as well, which is NOT what I want. Which function should I use in this case?

> gsub("amp","", x)
[1] "come on  this just encourages the already rant mispronunciation of phuket"
Artem Klevtsov
user3456230

3 Answers

1

You can use this regex:

gsub("\\bamp\\b","", x)
# [1] "come on  this just encourages the already rampant mispronunciation of phuket"

The \\b matches a word boundary, so only the standalone word "amp" is matched and "rampant" is left untouched.
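
Note that the output keeps the spaces on both sides of the removed word, so a double space is left behind. A minimal sketch, assuming you also want to swallow the whitespace that follows the word (the `\\s*` suffix, `perl = TRUE`, and the `trimws()` call are additions for illustration, not part of the answer above):

```r
x <- 'come on amp this just encourages the already rampant mispronunciation of phuket'

# \\b anchors the match to word boundaries, so "rampant" is left alone;
# \\s* also consumes any whitespace directly after the removed word.
# perl = TRUE selects the PCRE engine for the pattern.
cleaned <- gsub("\\bamp\\b\\s*", "", x, perl = TRUE)

# trimws() drops the trailing space that remains if "amp" was the last word.
cleaned <- trimws(cleaned)
print(cleaned)
```

This should print the sentence with single spaces throughout.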

Sven Hohenstein
  • +1, however matching with `\b` like this might introduce problems in some cases. See the comments to this post: http://stackoverflow.com/questions/5752829/regular-expression-for-exact-match-of-a-word/13065780#13065780. – Paul Hiemstra Apr 22 '14 at 06:36
1

You could also split the string into words, and then compare:

x <- 'come on amp this just encourages the already rampant mispronunciation of phuket'
split_into_words <- strsplit(x, ' ')[[1]]
filtered_words <- split_into_words[split_into_words != 'amp']
paste(filtered_words, collapse = ' ')
[1] "come on this just encourages the already rampant mispronunciation of phuket"
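
The split-and-compare idea extends naturally to several unwanted words at once. A rough sketch, assuming whitespace-separated tokens; `remove_words()` is a hypothetical helper name, not a base R function:

```r
# remove_words() is a hypothetical helper, not part of base R.
remove_words <- function(text, words) {
  # Split on runs of whitespace so repeated spaces do not produce empty tokens.
  tokens <- strsplit(text, "\\s+")[[1]]
  # Keep only the tokens that are not an exact match for any unwanted word.
  kept <- tokens[!tokens %in% words]
  paste(kept, collapse = " ")
}

x <- 'come on amp this just encourages the already rampant mispronunciation of phuket'
result <- remove_words(x, c("amp", "just"))
print(result)
```

Because the comparison is on whole tokens, a word with attached punctuation such as "amp," would not be removed by this sketch.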
Paul Hiemstra
0

You could just find the occurrence of "amp" that has a space in front.

> gsub("\\samp", "", x)
## [1] "come on this just encourages the already rampant mispronunciation of phuket"

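where \\s matches any whitespace character. Be aware that this pattern also matches the start of longer words such as " amplifier", and misses "amp" at the very start of the string. With a literal space, it is more readable as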

> gsub(" amp", "", x)
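
To see where a leading space alone falls short, here is a small comparison on a made-up sentence (the string `y` and the `\\b` variant are assumptions added for illustration, not part of the answer above):

```r
# y is an invented test sentence containing both "amp" and "amplifier".
y <- 'turn the amp up on the amplifier'

# Matching only on the leading space also cuts into " amplifier".
space_only <- gsub(" amp", "", y)

# Adding a word boundary after "amp" restricts the match to the whole word.
# perl = TRUE selects the PCRE engine for the pattern.
bounded <- gsub(" amp\\b", "", y, perl = TRUE)

print(space_only)
print(bounded)
```

The first call mangles "amplifier", while the second leaves it intact.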
Rich Scriven