Note: I know none of this is supported in the existing re module; I am using the newer regex module, which is intended to replace re in the future.
I need to build some complex regular expressions, but I would also like those expressions to be maintainable. I don't want anyone to come back to this code months later and have to spend days unravelling or re-writing the expression, myself included. :P
There is some PCRE syntax that I've previously used to accomplish this, e.g.:
/
(?(DEFINE)
(?<userpart> thomas | richard | harold )
(?<domainpart> gmail | yahoo | hotmail )
(?<tld> com | net | co\.uk )
(?<email> (?&userpart)@(?&domainpart)\.(?&tld) )
)
^ To: \s+ .* \s+ < (?&email) > $
/ix
This will match the line: To: Tom Selleck <thomas@gmail.com>
Note²: I'm not trying to match email addresses; it's just an example.
I see that the regex module has implemented recursive patterns and named recursive patterns, but it does not seem to like the (?(DEFINE) ... ) syntax, giving the error "unknown group at position 10".
Is it at all possible to pre-define named patterns like this in Python?
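For comparison, here is a rough sketch of one fallback that avoids (?(DEFINE) ... ) entirely: keep each sub-pattern in a Python string and compose them with f-strings before compiling in verbose mode. It uses only the standard re module. The variable names (userpart, domainpart, tld, email, to_line) are just illustrative and mirror the PCRE example above. This is plain text substitution rather than true named sub-pattern calls, so it would not cover recursive patterns.

```python
import re

# Each sub-pattern lives in its own string, wrapped in a non-capturing
# group so alternations stay contained when the strings are combined.
# Names are illustrative; they mirror the PCRE example, not any library API.
userpart = r"(?: thomas | richard | harold )"
domainpart = r"(?: gmail | yahoo | hotmail )"
tld = r"(?: com | net | co\.uk )"
email = rf"(?: {userpart} @ {domainpart} \. {tld} )"

# re.VERBOSE ignores the unescaped whitespace (like /x in PCRE),
# and re.IGNORECASE corresponds to /i.
to_line = re.compile(
    rf"^ To: \s+ .* \s+ < {email} > $",
    re.IGNORECASE | re.VERBOSE,
)

if __name__ == "__main__":
    print(bool(to_line.match("To: Tom Selleck <thomas@gmail.com>")))
```

This keeps each piece readable and editable on its own, but the composed pattern is expanded inline every time a piece is reused, which is the main difference from pre-defined named patterns.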