The point is that any optional subpattern placed after a lazy dot pattern only matches if its text appears right after the minimum the lazy pattern has to consume: one char for .+?, zero chars for .*?.
That is, <i>(foo \d+\s*.+?)(\(bar\))? will grab (bar) only if it follows 0 or more whitespaces and 1 char, like in <i>foo 42 <(bar)</i> or <i>foo 42<(bar)</i> (see demo).
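A minimal Python sketch of this behaviour, assuming Python's re module behaves like the flavor in the demo for this pattern. The helper name lazy_groups and the second sample string are illustrative and not taken from the demo:

```python
import re

# The lazy pattern from above: .+? is satisfied with a single char,
# so the optional Group 2 is only tried right after that one char.
LAZY = re.compile(r"<i>(foo \d+\s*.+?)(\(bar\))?")

def lazy_groups(text):
    """Return (Group 1, Group 2) for the first match, or None."""
    m = LAZY.search(text)
    return m.groups() if m else None

# (bar) sits right after one char ("<"), so Group 2 captures it.
print(lazy_groups("<i>foo 42 <(bar)</i>"))           # ('foo 42 <', '(bar)')
print(lazy_groups("<i>foo 42<(bar)</i>"))            # ('foo 42<', '(bar)')

# (bar) is further away: .+? stops after "s" and Group 2 stays empty.
print(lazy_groups("<i>foo 42 some text (bar)</i>"))  # ('foo 42 s', None)
```

In the last call the overall match still succeeds, because the optional group is allowed to match nothing; the regex engine has no reason to expand .+? any further.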
Since you want to match up to any optional (bar), you need to turn the .+? into a tempered greedy token: a dot that can be used with a greedy quantifier but is tempered, i.e. restricted, with a negative lookahead:
<i>(foo \d+\s*(?:(?!\(bar\)).)*)(\(bar\))?
Or, if you need to match the closest foo <digits> to the (bar):
<i>(foo \d+\s*(?:(?!\(bar\)|foo \d).)*)(\(bar\))?
See Regex 1 and Regex 2 demos.
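As a rough illustration, the two patterns can be compared in Python, assuming the re module treats them the same way as the demos. The names TEMPERED, TEMPERED_CLOSEST, split_bar and the sample strings are made up for this sketch:

```python
import re

# Regex 1: the tempered dot stops only right before "(bar)".
TEMPERED = re.compile(r"<i>(foo \d+\s*(?:(?!\(bar\)).)*)(\(bar\))?")
# Regex 2: the tempered dot also stops before the next "foo <digit>".
TEMPERED_CLOSEST = re.compile(
    r"<i>(foo \d+\s*(?:(?!\(bar\)|foo \d).)*)(\(bar\))?"
)

def split_bar(pattern, text):
    """Return (Group 1, Group 2) for the first match, or None."""
    m = pattern.search(text)
    return m.groups() if m else None

# Group 1 now runs up to (bar), and Group 2 captures it.
print(split_bar(TEMPERED, "<i>foo 42 some text (bar)</i>"))
# ('foo 42 some text ', '(bar)')

# Without (bar), Group 1 runs to the end of the line and Group 2 is None.
print(split_bar(TEMPERED, "<i>foo 42 some text</i>"))
# ('foo 42 some text</i>', None)

# Regex 2 refuses to run across a second "foo <digit>".
sample = "<i>foo 1 x foo 2 y (bar)"
print(split_bar(TEMPERED, sample))          # ('foo 1 x foo 2 y ', '(bar)')
print(split_bar(TEMPERED_CLOSEST, sample))  # ('foo 1 x ', None)
```

The lookahead is checked before every single char, so the token is slower than a plain .* on long inputs; it is the price for being able to stop at a multi-character delimiter.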
Details
<i> - literal string
(foo \d+\s*(?:(?!\(bar\)|foo \d).)*) - Group 1:
    foo \d+ - foo, space and 1+ digits
    \s* - 0+ whitespaces
    (?:(?!\(bar\)|foo \d).)* - any char, 0 or more occurrences, as many as possible, that does not start a (bar) or a foo, space, digit char sequence
(\(bar\))? - an optional Group 2: a (bar) substring.