While in most cases I would go with the stringr package, as already suggested in CPak's answer, there is also a grep solution to this:
# create the sample string
c <- ("She sold seashells by the seashore, and she had a great time while doing so.")
# match text that contains both "sold" and "great", in either order
# ignore case so that Sold and Great are also matched
grep("(sold.*great|great.*sold)", c, value = TRUE, ignore.case = TRUE)
Hmm, not bad, right? But what if there were a word merely containing the string sold or great?
# set up alternative string
d <- ("She saw soldier eating seashells by the seashore, and she had a great time while doing so.")
# even soldier is matched here:
grep("(sold.*great|great.*sold)", d, value = TRUE, ignore.case = TRUE)
So you might want to use word boundaries, i.e. match the entire word:
# \\b is a metacharacter that matches a word boundary (the start or end of a word)
grep("(\\bsold\\b.*\\bgreat\\b|\\bgreat\\b.*\\bsold\\b)", d, value = TRUE, ignore.case = TRUE)
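\\b (the regex \b, with the backslash escaped inside an R string) matches at three kinds of positions: before the first character of the string if it is a word character, after the last character of the string if it is a word character, and between two characters where one is a word character and the other is not.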
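A quick way to see this is to test a few strings against \\bsold\\b. This is only a minimal sketch; the vector words is made up for illustration and is not part of the question.

```r
# hypothetical test strings, made up for illustration
words <- c("sold", "Sold!", "soldier", "unsold", "she sold it")

# TRUE only where "sold" stands alone as a whole word
whole_word <- grepl("\\bsold\\b", words, ignore.case = TRUE)

# name the result so each string is shown next to its outcome
setNames(whole_word, words)
```

Only "sold", "Sold!" and "she sold it" match. In "soldier" and "unsold" the characters next to sold are word characters too, so there is no boundary at that position.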
More on the \b metacharacter here:
http://www.regular-expressions.info/wordboundaries.html
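If you need more than two words, writing out every ordering in a single pattern gets unwieldy. One possible alternative, sketched here on the assumption that you only care whether all words occur (not where), is to run one grepl per word and combine the results. The helper name contains_all_words is hypothetical, and the words are assumed to contain no regex metacharacters.

```r
# hypothetical helper: TRUE where `text` contains every entry of `words`
# as a whole word, in any order
# assumes the words contain no regex metacharacters
contains_all_words <- function(text, words) {
  # one logical vector per word, each the same length as `text`
  hits <- lapply(words, function(w) {
    grepl(paste0("\\b", w, "\\b"), text, ignore.case = TRUE)
  })
  # element-wise AND across all words
  Reduce(`&`, hits)
}

texts <- c(
  "She sold seashells by the seashore, and she had a great time while doing so.",
  "She saw soldier eating seashells by the seashore, and she had a great time while doing so."
)

# keep only the strings that contain both words
texts[contains_all_words(texts, c("sold", "great"))]
```

With the two sample strings, only the first one is kept, because the second contains soldier but not the whole word sold.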