I am trying to extract strings between words. Consider this example -
x <- "There are 2.3 million species in the world"
The sentence may also take another form:
x <- "There are 2.3 billion species in the world"
I need the text between 'There' and either 'million' or 'billion', including those words. Whether 'million' or 'billion' is present is only known at run time; it is not known beforehand. So the output I need from this sentence is
[1] There are 2.3 million
OR
[2] There are 2.3 billion
I am using the rm_between function from the qdapRegex package for this. With this command I can extract only one of them at a time.
library(qdapRegex)
rm_between(x, 'There', 'million', extract=TRUE, include.markers = TRUE)
OR I have to use
rm_between(x, 'There', 'billion', extract=TRUE, include.markers = TRUE)
How can I write a command that checks for the presence of either 'million' or 'billion' in the same sentence? Something like this -
rm_between(x, 'There', 'billion' || 'million', extract=TRUE, include.markers = TRUE)
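For reference, here is a minimal base R sketch of the behaviour I am after. It swaps in regmatches and regexpr with a regex alternation instead of rm_between, so it is not the qdapRegex approach, and the pattern is only my assumption about what should match (lazily from 'There' up to the first 'million' or 'billion').

```r
# Base R sketch (not qdapRegex): one pattern that accepts either end marker.
x <- c("There are 2.3 million species in the world",
       "There are 2.3 billion species in the world")

# 'There', then as few characters as possible, then 'million' or 'billion'.
pattern <- "There.*?(million|billion)"

# regexpr() locates the first match in each string; regmatches() extracts it.
# perl = TRUE enables the lazy quantifier '.*?'.
# Note: strings with no match are dropped from the result.
result <- regmatches(x, regexpr(pattern, x, perl = TRUE))
print(result)
```

This gives the output I want for both forms of the sentence, but I would still prefer to do it with rm_between if that is possible.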
I hope this is clear. Any help would be appreciated.