
I am trying to construct a regular expression from an input such as c("dont", "bias*") that matches sentences containing both words in that order, with no more than 4 words between them. For example, it should match "I dont think hes biased", but it should not match "I dont know if he has any bias", because the latter has 5 words between the two keywords.

I thought this pattern would work: \\bdont\\b.*(?:\\s+\\w+\\s+){0,4}?\\bbias\\w*\\b, but it returns TRUE for both sentences. Could anyone help me figure out what went wrong?
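A minimal reproduction of the behaviour described above. It assumes the pattern is passed to base R's `grepl()` with `perl = TRUE` (the post does not say which function was used), and the variable names are only illustrative:

```r
# The two example sentences from the question
sentences <- c("I dont think hes biased",
               "I dont know if he has any bias")

# The pattern from the question, written as an R string (doubled backslashes)
pattern <- "\\bdont\\b.*(?:\\s+\\w+\\s+){0,4}?\\bbias\\w*\\b"

# perl = TRUE is an assumption; it selects the PCRE engine
result <- grepl(pattern, sentences, perl = TRUE)
print(result)
# [1] TRUE TRUE
```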

user6606453
  • Remove `.*` and place the second `\s+` outside of the non-capturing group: [`\bdont\b(?:\s+\w+){0,4}\s+\bbias\w*\b`](https://regex101.com/r/6KEdGC/1). – InSync Aug 30 '23 at 22:49
  • Possible duplicate: [Regex match two strings with given number of words in between strings](https://stackoverflow.com/q/67126157). The accepted answer has some Python-specific syntax though. – InSync Aug 30 '23 at 22:50
  • In R string literals each backslash has to be doubled, so `\b`, `\w` and `\s` are written as `'\\b'`, `'\\w'` and `'\\s'`. – Chris Aug 30 '23 at 23:06
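
The suggestions in the comments can be combined into a rough sketch that builds the pattern from an input like c("dont", "bias*"). The helper name `build_gap_pattern`, its `max_gap` argument, and the reading of a trailing `*` as "any further word characters" are assumptions made for illustration, and `grepl()` with `perl = TRUE` is again assumed:

```r
# Hypothetical helper: join keywords so that at most `max_gap` words
# may appear between consecutive keywords.
build_gap_pattern <- function(keywords, max_gap = 4) {
  # Assumption: a trailing "*" means "any further word characters"
  parts <- sub("\\*$", "\\\\w*", keywords)
  # Anchor each keyword on word boundaries
  parts <- paste0("\\b", parts, "\\b")
  # Up to `max_gap` intervening words, then the whitespace before the next keyword
  gap <- sprintf("(?:\\s+\\w+){0,%d}\\s+", max_gap)
  paste(parts, collapse = gap)
}

pattern <- build_gap_pattern(c("dont", "bias*"))
# pattern is now the regex \bdont\b(?:\s+\w+){0,4}\s+\bbias\w*\b

sentences <- c("I dont think hes biased",
               "I dont know if he has any bias")
result <- grepl(pattern, sentences, perl = TRUE)
print(result)
# [1]  TRUE FALSE
```

Because `\w+` only matches runs of word characters, tokens containing apostrophes or other punctuation (for example "he's") would not count as a single word in this sketch.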

0 Answers