First off – @Tranbi is correct. You should not use regular expressions for this. It is much easier to understand and maintain using the method that they have provided.
With that disclaimer out of the way – it is possible to do this using the pattern matching provided by modern extensions to PCRE and company, available in the regex module (which is not in the standard library, so you'd need to install it).
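To see why the standard library's re module falls short here, the following is a rough sketch of the closest thing it can do. It has no recursion, so a bracket sub-pattern can only describe one level of nesting. The names `FLAT_PATTERN` and `ampersands_flat` are made up for illustration.

```python
import re

# Match a bracket group with NO nested brackets and throw it away,
# or capture an ampersand in group 1. Only group-1 hits are reported.
FLAT_PATTERN = re.compile(r'\[[^\[\]]*\]|(&)')

def ampersands_flat(text):
    """Return positions of '&' not inside a single-level [...] group."""
    return [m.start(1) for m in FLAT_PATTERN.finditer(text)
            if m.group(1) is not None]

flat = 'a & [b & c] d & e'
nested = ('gotcha & symbol [but not that & one [even that & one] '
          'and not this & one] but this is & ok')

print(ampersands_flat(flat))    # [2, 14]
print(ampersands_flat(nested))  # [7, 30, 67, 86]
```

On the nested input only the innermost bracket group is recognised, so the ampersands at positions 30 and 67 are wrongly reported even though they sit inside the outer brackets.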
The technique in the linked post for matching balanced brackets gets you part of the way, but doesn't cover the fact that you're actually trying to match the parts of the string outside the brackets. This requires some trickery with backtracking control verbs:

    import regex as re

    input_str = 'gotcha & symbol [but not that & one [even that & one] and not this & one] but this is & ok'
    for match in re.finditer(r'&|(\[(?:[^\[\]]+|(?1))*\])(*SKIP)(*FAIL)', input_str):
        print(f"Found symbol {match.group(0)} at position {match.span()}")

Output:

    Found symbol & at position (7, 8)
    Found symbol & at position (86, 87)

We can unpack the pattern a bit:

    r'''(?x)
        &               # The pattern that we're looking for - just the ampersand
        |               # ... or ...
        (               # Capturing group #1, which matches a balanced bracket group
          \[            # which consists of a square bracket ...
          (?:
            [^\[\]]+    # ... followed by any number of non-bracket characters ...
            | (?1)      # ... or a balanced bracket group (i.e. recurse, to match group #1) ...
          )*
          \]            # ... and then the matching end bracket.
        )               # End of capturing group #1.
                        # BUT we don't want to match anything between brackets, so ...
        (*SKIP)         # ... instruct the regex engine to discard what we matched ...
        (*FAIL)         # ... and tell the engine that it wasn't really a match anyway.
    '''

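The verbose form can be used directly, since `(?x)` makes the engine ignore the whitespace and comments. Here is a minimal sketch, assuming the third-party regex module is installed. The guarded import and the helper name `find_unbracketed_ampersands` are additions for illustration only.

```python
try:
    import regex  # third-party module: pip install regex
except ImportError:  # keep the snippet importable without it
    regex = None

VERBOSE_PATTERN = r'''(?x)
    &                          # the ampersand we are looking for
    |
    (                          # group 1: a balanced bracket group
        \[
        (?: [^\[\]]+ | (?1) )* # non-bracket runs or nested groups
        \]
    )
    (*SKIP)(*FAIL)             # reject the group and resume after it
'''

def find_unbracketed_ampersands(text):
    """Return start positions of '&' outside balanced [...] groups."""
    if regex is None:
        raise RuntimeError("the 'regex' module is required for this pattern")
    return [m.start() for m in regex.finditer(VERBOSE_PATTERN, text)]

input_str = ('gotcha & symbol [but not that & one [even that & one] '
             'and not this & one] but this is & ok')

if regex is not None:
    print(find_unbracketed_ampersands(input_str))
```

Keeping the pattern in a named constant also leaves room for the comments, which is most of what makes a pattern like this maintainable at all.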
So there you go! Fancy patterns for pattern matching abuse. Once again – a little hand-rolled parser is by far the better way to solve this problem, but it is fun to check in every few years on what you can do with the regex module.
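For comparison, a hand-rolled scanner of the kind recommended above might look roughly like this. It is a sketch of the general idea (track bracket depth, report ampersands at depth zero), not necessarily the code @Tranbi posted, and the function name is made up. How it treats unbalanced brackets is an assumption: a stray `]` is ignored, and an unclosed `[` hides everything after it, which differs from the regex version.

```python
def ampersands_outside_brackets(text):
    """Return positions of '&' that are not inside any [...] group."""
    depth = 0          # current bracket nesting level
    positions = []
    for index, char in enumerate(text):
        if char == '[':
            depth += 1
        elif char == ']':
            # Assumption: a stray closing bracket is simply ignored.
            depth = max(depth - 1, 0)
        elif char == '&' and depth == 0:
            positions.append(index)
    return positions

input_str = ('gotcha & symbol [but not that & one [even that & one] '
             'and not this & one] but this is & ok')
print(ampersands_outside_brackets(input_str))  # [7, 86]
```

It needs only the standard library, runs in a single pass, and makes the nesting rule explicit in a way the regex cannot.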