I've been trying to match the phrases between hyphens. I realise that I can simply split on the hyphen to get the phrases, but my equivalent regex is not working as expected and I want to understand why:
([^-,]+(?:(?: - )|$))+
[^-,]+
is just my definition of a phrase
(?: - )
is just the non-capturing, space-delimited hyphen
so (?:(?: - )|$)
matches either a space-delimited hyphen or the end of the line, without capturing it
Finally, the whole thing is surrounded in parentheses with a +
quantifier so that it matches one or more phrases.
What I get if I perform regex.match("A - B - C").groups()
is ('C',)
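For reference, here is a minimal reproduction of the above. It is only a sketch: it assumes the pattern is compiled with re.compile, and the variable names (pattern, m, whole, groups) are illustrative rather than taken from my real code.

```python
import re

# Pattern copied from above; the variable names are illustrative only.
pattern = re.compile(r"([^-,]+(?:(?: - )|$))+")

m = pattern.match("A - B - C")

whole = m.group(0)   # the full text consumed by the match
groups = m.groups()  # one entry per capturing group in the pattern

print(whole)   # A - B - C
print(groups)  # ('C',)
```

So the whole string is consumed by the match, but groups() still only has one entry, holding the last phrase.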
I've also tried the much simpler regex ([^,-]+)+
with similar results.
I'm using re.match
because I wanted to use pandas.Series.str.extract
to apply this to a very long list.
To reiterate: I'm now using an easy split
on a hyphen, but why isn't this regex returning multiple groups?
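The split I'm using at the moment looks roughly like this (a sketch with a hypothetical sample string; stripping the whitespace around each piece is my own assumption about what counts as a phrase):

```python
# Hypothetical sample input; the real data is a long list of such strings.
text = "A - B - C"

# Split on the hyphen, then trim the surrounding spaces from each piece.
phrases = [part.strip() for part in text.split("-")]

print(phrases)  # ['A', 'B', 'C']
```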
Thanks