Let's first consider how it would be done with a lookbehind.
We just check that what precedes the capture is either the start of the line or a whitespace character:
(?<=^|\s)(\.\d{5,})
We could simply change that lookbehind to a normal capture group.
That means a preceding whitespace also gets captured, but in a replace we can choose whether or not to put capture group 1 back.
(^|\s)(\.\d{5,})
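As a minimal sketch of such a replace in Ruby (the sample string and the `wrap_decimals` helper name are made up for illustration), group 1 is written back so the whitespace survives, and group 2 is the part that gets changed:

```ruby
# Hypothetical helper: wraps each matched fraction in square brackets.
# \1 puts the captured start-of-line/whitespace back; \2 is the number itself.
def wrap_decimals(text)
  text.gsub(/(^|\s)(\.\d{5,})/, '\1[\2]')
end

puts wrap_decimals('.12345 .123456 NOT.1234567')
# prints: [.12345] [.123456] NOT.1234567
```

Leaving `\1` out of the replacement string would drop the space between the two numbers.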
In the PCRE regex engine we have \K:
\K : resets the starting point of the reported match. Any previously
consumed characters are no longer included in the final match
So by using that \K in the regex, the preceding space isn't included in the match
(?:^|\s)\K(\.\d{5,})
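As far as I know, Ruby's own regex engine (Onigmo, used since Ruby 2.0) also accepts \K, so the same idea can be tried there. A small sketch, where the `mask_decimals` helper name, the `<num>` placeholder and the sample input are only illustrative:

```ruby
# Hypothetical helper: replaces each matched fraction with a placeholder.
# Whatever is consumed before \K (start of line or a whitespace) is kept out
# of the reported match, so only the number itself is replaced.
def mask_decimals(text)
  text.gsub(/(?:^|\s)\K\.\d{5,}/, '<num>')
end

puts mask_decimals('.12345 .123456 NOT.1234567')
```

Since the whitespace is not part of the match, it does not have to be written back in the replacement.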
However, if you use Ruby's scan with a regex that has capture groups, then it seems to output only the capture groups (...), not the non-capture groups (?:...) or anything else that's outside a capture group.
For example:
m = '.12345 .123456 NOT.1234567'.scan(/(?:^|\s)(\.\d{5,})/)
=> [[".12345"], [".123456"]]
m = 'ab123cd'.scan(/[a-z]+(\d+)(?:[a-z]+)/)
=> [["123"]]
So when you use scan with a capture group around the part you want, the lookarounds aren't needed.
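Because scan returns one inner array per match when the pattern has capture groups, getting a plain list of strings takes one more step. A sketch, with a hypothetical `find_decimals` helper:

```ruby
# Hypothetical helper: returns the matched fractions as a flat array.
# scan gives [[".12345"], [".123456"]] for the sample below;
# flatten unwraps the inner one-element arrays.
def find_decimals(text)
  text.scan(/(?:^|\s)(\.\d{5,})/).flatten
end

p find_decimals('.12345 .123456 NOT.1234567')
# => [".12345", ".123456"]
```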