
I have seen regex experts say several times that using `(.|\n)*?` is a really, really bad idea.

I understand that it's better to replace it with `.*` and use the /s flag. But sometimes flags are not available, for example when using regex within a text editor or other software with limited regex functionality. In that case, something like `(.|\n)*?` might be the only option for multi-line matching.

So, what are the reasons to always avoid `(.|\n)*?`?

Ildar Akhmetov
  • Use `[\s\S]` if flags are not available. Or inline modifiers like `(?s)`. You are almost never forced to use an inefficient `(.|\n)*?` that will kill the performance with excessive backtracking. – Wiktor Stribiżew Apr 16 '19 at 08:19
  • So performance is the main reason to avoid `(.|\n)*?`, right? – Ildar Akhmetov Apr 16 '19 at 08:21
  • @Wiktor The linked duplicate does not answer the OPs question. *Why* is it a bad idea? – Tomalak Apr 16 '19 at 08:22
  • @Tomalak That solves the problem (if there is any). I will find one that is better. The answer is: excessive backtracking. Actually, it is one of the "what does this mean" questions. – Wiktor Stribiżew Apr 16 '19 at 08:23
  • @IldarAkhmetov Compared to `[\s\S]` (or `[\d\D]` or any such X|not-X construct), the `(.|\n)*` is much more wasteful. It's a capturing group, so it needlessly remembers what is being matched, it builds the necessary structures for backtracking that are never really being used, and it matches only one character at a time. This variant is more efficient: `(?:.*\n?)*` but it's way more to write and remember, and `[\s\S]` *still* beats it significantly. Compare how many steps each takes on https://regex101.com/ – Tomalak Apr 16 '19 at 08:32
  • @Wiktor I'm not sure if "excessive backtracking" is happening. The regex engine goes along the string character-by-character and never really backtracks, as the alternatives in the group always provide a way forward. But it builds everything it needs if it were ever forced to backtrack, which is wasteful. – Tomalak Apr 16 '19 at 08:38
  • @Tomalak It does backtrack, be the quantifier greedy or non-greedy, that is backtracking. – Wiktor Stribiżew Apr 16 '19 at 08:41
  • If something follows the `(.|\n)*`, it will. Otherwise I would assume it simply keeps alternating until it's at the end of the string? – Tomalak Apr 16 '19 at 08:49
  • @Tomalak Thanks, it clearly answers my question! Compared the number of steps for the options, got it now. – Ildar Akhmetov Apr 16 '19 at 08:50
  • @Ildar The second duplicate explains the issue exhaustively, if you haven't seen it being added. – Tomalak Apr 16 '19 at 09:45
  • @Tomalak yes, read and upvoted it already :) Thanks loads for helping to find out! – Ildar Akhmetov Apr 16 '19 at 09:48
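
For anyone who wants to check the comments above locally instead of on regex101, here is a minimal Python sketch. It assumes Python's `re` engine, which may behave differently from the engine in your editor. The sample `text` and the `START`/`END` markers are made up for illustration. It confirms that the three variants mentioned in the comments match the same span, then prints rough timings for each.

```python
import re
import timeit

# Hypothetical sample input: many lines between two made-up markers.
text = "START\n" + "some line of text\n" * 200 + "END"

# The three ways of matching "anything, including newlines" from the comments.
patterns = {
    "(.|\\n)*?": r"START(.|\n)*?END",   # capturing group with alternation
    "[\\s\\S]*?": r"START[\s\S]*?END",  # character class, no group
    "(?s).*?": r"(?s)START.*?END",      # inline "dot matches newline" modifier
}

# All three should match exactly the same span of the input.
matches = {name: re.search(rx, text).group(0) for name, rx in patterns.items()}
all_equal = len(set(matches.values())) == 1


def time_pattern(rx, repeats=20):
    """Return the total seconds taken by `repeats` searches with this pattern."""
    compiled = re.compile(rx)
    return timeit.timeit(lambda: compiled.search(text), number=repeats)


timings = {name: time_pattern(rx) for name, rx in patterns.items()}
for name, seconds in timings.items():
    print(f"{name:12} {seconds:.4f}s")
```

The absolute numbers depend on the machine and the regex engine, so treat them only as a relative comparison; the step counts on regex101 tell the same story in a more engine-independent way.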
