
I have the following regular expression:

first.*(?<!.*([;,\.]|and).*)second

I would like it to match the following:

first some word second

But not match the following:

first . some word second

first ; some word second

It is working but it is also excluding the following:

blah ; first some word second

I only want it to exclude matches if the negative look ahead falls in between the two words. It should not look behind the first word.

blhsing
2 Answers


You can simply use a negated character class between the two words:

first\b[^.;]*\bsecond
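
To see how this pattern behaves on the strings from the question, here is a minimal sketch in Python (the question does not name a regex flavor, so Python's re module is an assumption; the samples list and results dict are illustrative names). Note that, as written, the character class excludes only the period and the semicolon; it does not exclude the comma or the word and from the question's pattern.

```python
import re

# Pattern from this answer: "first", then any run of characters that are
# neither "." nor ";", then "second" starting at a word boundary.
pattern = re.compile(r"first\b[^.;]*\bsecond")

# The four strings from the question.
samples = [
    "first some word second",
    "first . some word second",
    "first ; some word second",
    "blah ; first some word second",
]

# Map each sample to whether the pattern is found anywhere in it.
results = {text: bool(pattern.search(text)) for text in samples}

for text, matched in results.items():
    print(f"{matched!s:5}  {text}")
```

Only the text after first is constrained, so the semicolon before first in the last sample does not prevent a match.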
blhsing

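First, your pattern does not use a lookahead; it uses a lookbehind with an unknown-width pattern inside. Because that lookbehind begins with .*, it can reach back past first to the start of the line, which is why a ; before first also blocks the match. You want to match a string between a and b excluding c, and this is a common scenario for a tempered greedy token: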

first(?:(?!and)[^;,.])*second

See the regex demo. Use word boundaries to match whole words: \bfirst\b(?:(?!\band\b)[^;,.])*\bsecond\b.
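
The effect of the word boundaries can be sketched in Python (an assumed flavor; the names loose and strict and the sample strings are made up for illustration). Without boundaries, (?!and) also rejects words that merely contain and, and first or second can match inside longer words.

```python
import re

# Pattern without word boundaries, as first given in this answer.
loose = re.compile(r"first(?:(?!and)[^;,.])*second")
# Whole-word variant from this answer.
strict = re.compile(r"\bfirst\b(?:(?!\band\b)[^;,.])*\bsecond\b")

# "sandwich" contains "and" but is not the word "and".
contains_and = "first sandwich second"
# The standalone word "and" sits between the two words.
standalone_and = "first one and two second"
# "first" and "second" appear only as parts of longer words.
embedded = "thefirst x seconds"

loose_hits = [bool(loose.search(t)) for t in (contains_and, standalone_and, embedded)]
strict_hits = [bool(strict.search(t)) for t in (contains_and, standalone_and, embedded)]

print("loose: ", loose_hits)
print("strict:", strict_hits)
```

The loose pattern refuses the sandwich sample and accepts the embedded one; the strict pattern does the opposite. Both refuse the sample with the standalone word and.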

Details

  • first - a literal substring
  • (?:(?!and)[^;,.])* - zero or more characters, none of which is a semicolon, comma, or period, and none of which starts the character sequence "and"
  • second - a literal substring.
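
The breakdown above can be written out as a commented pattern. This is a sketch that assumes Python's re module with re.VERBOSE (the question does not state its flavor); the test strings are the ones from the question plus one hypothetical string containing and.

```python
import re

# Same pattern as above, spread over several lines with re.VERBOSE so that
# each piece carries its own comment.
pattern = re.compile(
    r"""
    first            # literal substring
    (?:              # start of the tempered greedy token
        (?!and)      #   the text at this position must not begin with "and"
        [^;,.]       #   then consume one character that is not ; , or .
    )*               # repeat zero or more times
    second           # literal substring
    """,
    re.VERBOSE,
)

samples = [
    "first some word second",
    "first . some word second",
    "first ; some word second",
    "blah ; first some word second",
    "first this and that second",   # hypothetical extra case for "and"
]

# Store the matched text, or None when there is no match.
found = {}
for text in samples:
    m = pattern.search(text)
    found[text] = m.group() if m else None
    print(f"{text!r} -> {found[text]!r}")
```

Because the lookahead is checked only at positions after first, nothing that precedes first can block the match.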
Wiktor Stribiżew