
I am trying to extract the next set of numbers after the word "QUESTIONS". Example text:

x = c("GEOGRAPHY QUESTIONS TO 505 (708)", "MATHS QUESTIONS 606 (891)")

In both cases I would like to pull out the first set of numbers (the 505 and the 606) and ignore the numbers in brackets.

Link to regex101: https://regex101.com/r/0W53uM/1

Attempt to Solve

I can pull out the number using a combination of the following, but it is not very elegant.

str_extract(x, "QUESTIONS TO (\\d+)")

or

str_extract(x, "QUESTIONS (\\d+)")
Laurence_jj
  • this doesn't solve it as it only works in the first case and not the second – Laurence_jj Feb 16 '21 at 09:57
  • Sure, use the same technique with an alternation, `"(?:(?<=QUESTIONS TO )|(?<=QUESTIONS ))\\d+"` - [demo](https://regex101.com/r/mqwnJZ/1). `"(?<=QUESTIONS TO |QUESTIONS )\\d+"` might work, too. – Wiktor Stribiżew Feb 16 '21 at 09:58
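
For reference, here is a minimal sketch of the lookbehind idea from the comment above, applied to the sample vector from the question. It assumes the stringr package and uses `QUESTIONS ` (with the trailing S) in both lookbehind branches, since the second string contains "QUESTIONS 606". The variable names `res_lookbehind` and `res_group` are made up for illustration. The second call is an alternative that is not from the thread: an optional `TO ` followed by a capture group, read back with `str_match()`.

```r
library(stringr)

x <- c("GEOGRAPHY QUESTIONS TO 505 (708)", "MATHS QUESTIONS 606 (891)")

# Alternation of two lookbehinds: match digits that directly follow
# either "QUESTIONS TO " or "QUESTIONS ".
res_lookbehind <- str_extract(x, "(?:(?<=QUESTIONS TO )|(?<=QUESTIONS ))\\d+")

# Alternative (an assumption, not from the thread): make "TO " optional and
# capture the digits; column 2 of the str_match() result holds group 1.
res_group <- str_match(x, "QUESTIONS (?:TO )?(\\d+)")[, 2]

print(res_lookbehind)
print(res_group)
```

Both calls should return only the first number after "QUESTIONS" and leave the bracketed number alone, because `str_extract()` and `str_match()` stop at the first match in each string.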
