This is very far from "obvious"; quite the contrary. There is no direct way to say "don't match" for a complex pattern (there is good support at the character level, with [^a], \S, etc.). Regex is first of all about matching things, not about not matching them.
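To illustrate the character-level support mentioned above, here is a minimal sketch; the sample string and variable names are made up for the example. It shows that negation works for single characters or character classes, which is exactly what does not extend to whole patterns.

```perl
use strict;
use warnings;

my $s = 'banana split';   # hypothetical sample input

# Negated character class: runs of characters that are not 'a'
my @not_a = $s =~ /[^a]+/g;    # ('b', 'n', 'n', ' split')

# \S is the negation of \s: runs of non-whitespace characters
my @words = $s =~ /\S+/g;      # ('banana', 'split')

print join('|', @not_a), "\n";
print join('|', @words), "\n";
```

Both forms negate one character at a time; there is no equivalent shorthand for "anything that is not this nested, parenthesized group".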
One approach is to match those (possibly nested) delimiters and get everything other than that.
A good tool for finding nested delimiters is the core module Text::Balanced. As it matches it can also give us the substring before the match and the rest of the string after the match.
use warnings;
use strict;
use feature 'say';
use Text::Balanced qw(extract_bracketed);
my $text = <<'END';
(bar foo bar)
bar foo (
bar foo
(bar) (foo)
)
END
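my ($match, $before);
my $remainder = $text;
while (1) {
    # In list context: (extracted match, remainder, skipped prefix)
    ($match, $remainder, $before) = extract_bracketed($remainder, '(', '[^(]*');
    # On failure $before is undef and $remainder holds the unmatched rest
    print $before // $remainder;
    last if not defined $match;
}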
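In list context, extract_bracketed returns the match, the remainder of the string after it ($remainder), and the prefix skipped before the match ($before), so we keep matching in the remainder. The third argument, '[^(]*', is the prefix pattern to skip before the opening delimiter; without it only leading whitespace is skipped. When nothing more matches, $match and $before are undef and the whole input comes back as the remainder, which is why the loop prints $before // $remainder.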
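If the text outside the brackets is needed as a value rather than printed, the same loop can be wrapped in a sub. This is only a sketch: the name text_outside_parens and the sample string are made up for the example, and it assumes that unbalanced input should simply be kept as "outside" text.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: returns everything in $text that lies outside
# (possibly nested) parentheses, concatenated into one string.
sub text_outside_parens {
    my ($text) = @_;
    my $remainder = $text;
    my $outside   = '';
    while (1) {
        my ($match, $rest, $before) =
            extract_bracketed($remainder, '(', '[^(]*');
        if (not defined $match) {
            # Failure: the second element is the whole unmatched input
            $outside .= $rest;
            last;
        }
        $outside  .= $before;   # text skipped before this bracketed group
        $remainder = $rest;     # keep scanning after the group
    }
    return $outside;
}

my $outside = text_outside_parens('a (b (c) d) e (f) g');
print "[$outside]\n";   # prints "[a  e  g]"
```

Since a failed extraction returns its input unchanged, a trailing unbalanced "(" and everything after it ends up in the result; whether that is the right behavior depends on the data.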
Taken from this post, where there are more details and another way, using Regexp::Common.