>>> src = ' pkg.subpkg.submod.thing pkg2.subpkg.submod.thing '
>>> re.search(r'\s*(\w+\.)+', src).groups()
('submod.',)
This regex seems to put every non-space character into the group, so nothing should be lost before the match ends.
Why does the group hold only the last repetition of the "+" here, and not ('pkg.subpkg.submod.',)?
Or, alternatively, ('pkg.',): an early stop after the first iteration, which would at least mean no "loss of information" in another sense?
(As a workaround I had to wrap the repetition in an outer group and make the inner one non-capturing with (?:...), as in r'\s((?:\w+\.)+)'.)
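For comparison, a minimal sketch of that workaround next to the original pattern. The names last_only, whole_run and parts are only illustrative, and the final findall step is an extra assumption on my side: it is just one possible way to get at the individual repetitions.

```python
import re

src = ' pkg.subpkg.submod.thing pkg2.subpkg.submod.thing '

# Capturing group directly under "+": only the last repetition is kept.
last_only = re.search(r'\s*(\w+\.)+', src).groups()

# Outer group around a non-capturing repeated group: the whole run is kept.
whole_run = re.search(r'\s((?:\w+\.)+)', src).groups()

# Splitting the captured run afterwards yields each repetition separately.
parts = re.findall(r'\w+\.', whole_run[0])

print(last_only)  # ('submod.',)
print(whole_run)  # ('pkg.subpkg.submod.',)
print(parts)      # ['pkg.', 'subpkg.', 'submod.']
```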
Even stranger:
>>> src = ' pkg.subpkg.submod.thing pkg2.subpkg.submod.thing '
>>> re.search(r'\s(\w+\.)*', src).groups()
(None,)
Edit: the case above is actually less strange than I thought, as @Avinash Raj pointed out: contrary to what I intended, the match simply ends before the group is reached. So
>>> re.search(r'\s+(\w+\.)*', ' pkg.subpkg.submod.thing').groups()
('submod.',)
... then shows the same puzzling behavior as with "+": only the last repetition ends up in the group, and everything before it seems lost.
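To make the point of the edit concrete, here is a small sketch. The input with two leading spaces is my own assumption, chosen so that a single \s cannot reach the first word; the names padded, m_star and m_plus are likewise only illustrative.

```python
import re

# Hypothetical input with two leading spaces (an assumption, see above).
padded = '  pkg.subpkg.submod.thing'

# \s consumes one space; the group then matches zero times at the second
# space, so the match ends there and the group never participates.
m_star = re.search(r'\s(\w+\.)*', padded)
print(repr(m_star.group(0)), m_star.groups())  # ' ' (None,)

# \s+ consumes both spaces; the group repeats three times and keeps the last.
m_plus = re.search(r'\s+(\w+\.)*', padded)
print(repr(m_plus.group(0)), m_plus.groups())  # '  pkg.subpkg.submod.' ('submod.',)
```

So a group that took part in zero iterations reports None, while a group that took part in several reports only its final iteration.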