
I've memorized the "unrolling the loop" pattern described at the link below, and I use it whenever I have to match text that contains escape characters, such as a file path or a string with escapes:

http://www.softec.lu/site/RegularExpressions/UnrollingTheLoop

The pattern is:

normal* ( special normal* )*

Or, if there is a known start/end (such as a string having quotes on either side):

start normal* ( special normal* )* end
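As a minimal sketch of how that template maps onto a real regex, assuming Python's `re` module: the helper name `unrolled` and the double-quoted sample text are hypothetical and exist only for illustration.

```python
import re

def unrolled(normal, special, start="", end=""):
    """Build `start normal* ( special normal* )* end` as a regex string.

    Hypothetical helper: each argument is already a regex fragment.
    """
    # Non-capturing groups keep the fragments from interfering with each other.
    return f"{start}(?:{normal})*(?:(?:{special})(?:{normal})*)*{end}"

# Example: a double-quoted string that may contain backslash escapes.
pattern = re.compile(unrolled(r'[^"\\]', r'\\.', start='"', end='"'))
match = pattern.search(r'say "a \"quoted\" word" now')
print(match.group(0))
```

The escaped quotes inside the string are consumed by the `special` fragment, so the match only ends at the first unescaped closing quote.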

For a very basic example, let's say I want to capture a single-quoted string, '<string>', whose contents can contain escapes. Using the pattern, I have:

start   = '
normal  = [^'\\]    (anything except a quote or escape)
special = \\.       (an escape and then any character)
end     = '

Making the substitutions and adding a capturing group, I get:
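The regex itself is not preserved in the text here. A plausible reconstruction from the four definitions above, assuming Python's `re` syntax and a non-capturing inner group (both are assumptions, as is the sample input), would be:

```python
import re

# start = '    normal = [^'\\]    special = \\.    end = '
# Reconstructed from the definitions above; the exact original regex is an assumption.
unrolled = re.compile(r"'([^'\\]*(?:\\.[^'\\]*)*)'")

# Sample input: a quoted string containing an escaped quote and an escaped backslash.
m = unrolled.fullmatch(r"'it\'s a \\ path'")
print(m.group(1))

# A string whose closing quote is escaped is never terminated, so it does not match.
unterminated = unrolled.fullmatch(r"'abc\'")
print(unterminated)
```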

My question is why that cannot just be shortened to:

start (normal | special )* end

For example:
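The example regex is not preserved in the text here either. A sketch of what the shortened form might look like, again assuming Python's `re`, placed next to the unrolled form so the two can be checked against the same inputs (the sample strings are hypothetical):

```python
import re

# start (normal | special)* end, with one capturing group around the body.
simplified = re.compile(r"'((?:[^'\\]|\\.)*)'")
# start normal* (special normal*)* end, for comparison.
unrolled = re.compile(r"'([^'\\]*(?:\\.[^'\\]*)*)'")

samples = [r"'plain'", r"'it\'s'", r"'C:\\dir\\file'", r"'unterminated\'", "''"]

agree = True
for s in samples:
    a = simplified.fullmatch(s)
    b = unrolled.fullmatch(s)
    # Both forms should accept or reject the same strings and capture the same body.
    same = (a is None) == (b is None) and (a is None or a.group(1) == b.group(1))
    agree = agree and same
    print(s, "->", a.group(1) if a else None)

print("both forms agree:", agree)
```

On these samples the two forms accept the same strings and capture the same text, so any difference between them lies in how much work the engine does, not in what is matched.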

It seems much less repetitious to implement. What advantages, if any, does the first technique have over this simplified form?

David542
  • 59 vs 21 steps hint at the answer: the version with alternation involves many more backtracking steps. Cf. [regex debugger #1](https://regex101.com/r/pb182e/1/debugger) and [regex debugger #2](https://regex101.com/r/LGP0Iz/2/debugger) – Wiktor Stribiżew Apr 18 '21 at 21:11
  • @WiktorStribiżew I see. Would you want to post an example with a bit more detail? Hopefully it can help people in the future who are trying to use this pattern (often in its simplified version) understand why the longer version is much more efficient than the other one. – David542 Apr 18 '21 at 21:12
  • I do not think the title is right, you are not asking how to optimize the unroll-the-loop pattern, but about its advantages over non-unrolled patterns. – Wiktor Stribiżew Apr 18 '21 at 21:15
  • @WiktorStribiżew yes that's correct: please feel free to edit it if you think that would make it more clear. – David542 Apr 18 '21 at 21:15
  • I think it is a dupe of [Regular Expression composition](https://stackoverflow.com/questions/39137509/regular-expression-composition/39142688#39142688) – Wiktor Stribiżew Apr 18 '21 at 21:17
  • @WiktorStribiżew sure, but that title is even worse! Maybe it can be edited to make it easier to find. – David542 Apr 19 '21 at 04:10
  • Done. Still, this question can be a good signpost that will redirect traffic to that post. – Wiktor Stribiżew Apr 19 '21 at 08:23

0 Answers