I have a regexp:
import re
regexp = re.compile(r'^(?P<parts>(?:[\w-]+/?)+)/$')
It matches a string like foo/bar/baz/ and puts foo/bar/baz in a group named parts (the /? combined with the /$ supports this).
This works perfectly fine until you match against a string that doesn't end in a slash. Then matching gets slower at a seemingly exponential rate with each character you add to the string.
Example
# This is instant (trailing slash)
regexp.match('this-will-take-no-time-at-all/')
# This is slow
regexp.match('this-takes-about-5-seconds')
# This will not finish
regexp.match('this-probably-will-not-finish-until-the-day-star-turns-black')
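To put numbers on the slowdown, here is a minimal timing sketch. It assumes the same pattern as above; the helper name time_failed_match and the chosen lengths are only for illustration. If the growth really is exponential, each extra character should roughly double the time of a failed match.

```python
import re
import time

regexp = re.compile(r'^(?P<parts>(?:[\w-]+/?)+)/$')

def time_failed_match(length):
    """Return the seconds spent on a failing match against `length` word characters."""
    subject = 'a' * length  # no trailing slash, so the match cannot succeed
    start = time.perf_counter()
    result = regexp.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None
    return elapsed

# Lengths are kept small on purpose so the whole loop finishes quickly.
timings = {n: time_failed_match(n) for n in range(12, 21, 2)}
for n, seconds in timings.items():
    print(f'{n:2d} chars: {seconds:.4f}s')
```

The absolute times depend on the machine and Python version; the ratio between consecutive rows is the interesting part.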
I'm trying to understand why this specific recursion issue only happens when the /$ (trailing slash) isn't in the string (i.e., a non-match). Could you help me understand the control flow of the underlying algorithm in both the trailing-slash and the non-trailing-slash cases?
Note
I'm not looking for a solution for my desired pattern. I'm trying to understand the specific regexp.