I am not looking for a regex to match phone numbers; that is simply my use case. I want to know why my regex isn't including an optional non-capturing group in the capture.
Some background on my use case: I am trying to match phone numbers, and my regex works except when an extension is present.
My regex (a bit long, but comprehensive):
((?:\+{0,2}\d{1,3})?[-.()\/* ]*?\d{3}[-.()\/* ]*?\d{3}[-.()\/* ]*?\d{4}[-.()\/* ]*?(?:(?:x|ext)[:]?[ ]*\d+)?)
A shortened version to illustrate my issue:
(\d{4}[-.()\/* ]*?(?:(?:x|ext)[:]?[ ]*\d+)?)
Where:
(...) is my capture group
\d{4} matches four digits
[-.()\/* ]*? matches various separators, 0 to unlimited times (non-greedy)
(?:...) is a non-capturing group
x|ext is the extension identifier
[:]? matches ":" 0 or 1 time
[ ]* matches " " 0 to unlimited times
\d+ matches a digit 1 to unlimited times
(?:...)? makes the non-capturing group match 0 or 1 time
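The breakdown above can also be written as a commented pattern. This is only a sketch, assuming Python's re module and its re.VERBOSE flag; the names COMPACT, ANNOTATED and capture are made up for illustration and are not part of the original regex.

```python
import re

# Shortened pattern from the question, unchanged.
COMPACT = re.compile(r"(\d{4}[-.()\/* ]*?(?:(?:x|ext)[:]?[ ]*\d+)?)")

# The same pattern spelled out piece by piece. In re.VERBOSE mode,
# whitespace outside character classes is ignored, while the literal
# space inside [...] is kept.
ANNOTATED = re.compile(r"""
    (                   # capture group
      \d{4}             # four digits
      [-.()\/* ]*?      # separators, 0 or more times, non-greedy
      (?:               # non-capturing group
        (?:x|ext)       # extension identifier
        [:]?            # ":" 0 or 1 time
        [ ]*            # " " 0 or more times
        \d+             # digit, 1 or more times
      )?                # the whole extension group, 0 or 1 time
    )
""", re.VERBOSE)


def capture(pattern, text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = pattern.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for sample in ["1234 ext 567", "1234ext 567", "1234x:9", "1234", "abc"]:
        print(repr(sample), capture(COMPACT, sample), capture(ANNOTATED, sample))
```

Both patterns should give identical captures for every sample, so the annotated form can be used to step through the pieces one at a time.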
So 1234 ext 567 should match in full, but only 1234 is matched.
Regex101 link: regex101.com/r/NRQhTl/1
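The behavior can be reproduced outside regex101 with a minimal Python sketch. It assumes Python's re module treats this pattern the same way as the flavor used in the link; the helper name first_capture is hypothetical.

```python
import re

# Shortened pattern from the question, unchanged.
PATTERN = re.compile(r"(\d{4}[-.()\/* ]*?(?:(?:x|ext)[:]?[ ]*\d+)?)")


def first_capture(text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = PATTERN.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # With a space before "ext", only the four digits are captured.
    print(first_capture("1234 ext 567"))   # 1234
    # With no separator before "ext", the extension is captured too.
    print(first_capture("1234ext 567"))    # 1234ext 567
```

The second input suggests the optional group itself can match; the extension is only lost when a separator sits between the digits and the extension.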
If I remove the trailing ? so that the group is no longer optional, it works just fine:
(\d{4}[-.()\/* ]*?(?:(?:x|ext)[:]?[ ]*\d+))
It seems like the ? is making the group lazy, but without it the regex no longer matches numbers that do not have an extension.
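For comparison, here is the non-optional variant run against the same inputs. This is again a Python sketch under the assumption that the re module matches the regex101 flavor for this pattern; REQUIRED and capture_required are illustrative names only.

```python
import re

# Variant from the question with the trailing ? removed,
# so the extension group must be present.
REQUIRED = re.compile(r"(\d{4}[-.()\/* ]*?(?:(?:x|ext)[:]?[ ]*\d+))")


def capture_required(text):
    """Return group 1 of the first match, or None if nothing matches."""
    match = REQUIRED.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # The extension is now included in the capture.
    print(capture_required("1234 ext 567"))  # 1234 ext 567
    # A number without an extension no longer matches at all.
    print(capture_required("1234"))          # None
```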
Any help or insights would be greatly appreciated.