
I planned to offer regex support in my service: users configure a regex, and a String is processed only if it matches that regex.

Then I stumbled upon these articles:

OWASP ReDOS

blog.makensi.es

From them I learned that even a simple regex can be disastrous for my servers.

I need only basic matching abilities.

I'm planning to just strip parentheses from the regex and, if the result is a valid regex, process it. I believe stripping parentheses alone will be enough to save my servers from those attacks.

Am I right about this, or am I missing something?

Cœur
Vigneshwaran
  • What, just yank all parentheses without even looking? If the parens are actually needed, and you remove them, the result may be syntactically valid, but it will be a different regex. And what will you do about parens that are escaped with backslashes, or in character classes, or both? Are you using a regex flavor that supports non-capturing groups, atomic groups, lookaheads, lookbehinds, branch-reset groups, conditionals...? – Alan Moore Mar 10 '16 at 15:39
  • I am trying to provide pattern matching support. I thought I could simply use regex but found out about evil regexes. I just wanted to know whether not supporting grouping alone would prevent any evilness, but I learnt from the accepted answer that it won't be enough. – Vigneshwaran Mar 11 '16 at 11:03

2 Answers


Yes, that would be a naive approach, and it would leave your server susceptible to DoS attacks.

The first link you posted is actually a good starting point. As a complement to it, have a look at: How can I recognize an evil regex?

However, detecting such evil regexes appears to be a difficult task, so it comes down to how much risk you are willing to take. One solution is to build a mechanism that spawns a process or thread (depending on your platform) to evaluate each input string against the given regex, with a timeout. If the evaluation takes longer than you can afford, kill it.
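The timeout idea above could look roughly like this in Python. This is only a sketch under several assumptions: the helper name `match_with_timeout`, the child script, and its exit codes are all made up for illustration, and the match runs in a separate interpreter process because Python threads cannot be killed from outside.

```python
import subprocess
import sys

# Hypothetical child script: compiles argv[1] and searches argv[2].
# Exit codes are an assumption of this sketch:
# 10 = match, 11 = no match, 12 = invalid regex.
_CHILD_SCRIPT = """
import re, sys
try:
    compiled = re.compile(sys.argv[1])
except re.error:
    sys.exit(12)
sys.exit(10 if compiled.search(sys.argv[2]) else 11)
"""


def match_with_timeout(pattern, text, timeout_s=1.0):
    """Return True/False for match/no match, or None if the time budget ran out."""
    try:
        # subprocess.run kills the child process itself when the timeout expires.
        result = subprocess.run(
            [sys.executable, "-c", _CHILD_SCRIPT, pattern, text],
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return None
    if result.returncode == 12:
        raise ValueError("invalid regex: %r" % (pattern,))
    if result.returncode not in (10, 11):
        raise RuntimeError("regex worker failed with code %d" % result.returncode)
    return result.returncode == 10
```

Starting an interpreter per match is expensive, and passing the text on the command line limits its size, so a real service would more likely keep long-lived worker processes around. The point here is only that a runaway match gets cut off instead of pinning a CPU.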

Community
Ozgun Alan

The risk you run depends heavily on your specific regexp library. The classic "runaway" RE is (essentially) aa? repeated N times, matched against a string of N a's. This takes approximately exponential time in the default libraries of PHP, Python and Perl, and roughly linear time in Common Lisp's CL-PPCRE (Perl-compatible) and Go's regexp package (RE2 syntax, which does not rely on backtracking).

Note that aa?aa?aa? has no parentheses.
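To make the growth concrete, here is a small Python sketch. It builds the pattern and input described above and counts the attempts made by a toy greedy backtracking matcher. The functions `runaway_case` and `backtracking_steps` are hypothetical helpers written for this illustration; the counter is a simplified model of backtracking, not the internals of any real regex engine.

```python
import re


def runaway_case(n):
    """Build the pattern (aa? repeated n times) and the text (n copies of 'a')."""
    return "aa?" * n, "a" * n


def backtracking_steps(n):
    """Count attempts made by a toy greedy backtracking matcher on runaway_case(n)."""
    steps = 0

    def attempt(items_left, chars_left):
        nonlocal steps
        steps += 1
        if items_left == 0:
            return True  # every 'aa?' item has been matched
        # Greedy: try 'aa' first, then fall back to a single 'a'.
        for width in (2, 1):
            if chars_left >= width and attempt(items_left - 1, chars_left - width):
                return True
        return False

    assert attempt(n, n)  # the only match gives each item exactly one 'a'
    return steps


for n in (10, 15, 20):
    pattern, text = runaway_case(n)
    # The pattern contains no parentheses, yet it still matches the text.
    assert "(" not in pattern and re.search(pattern, text)
    print(n, backtracking_steps(n))
```

In this model the step count grows roughly like the Fibonacci numbers as N increases, because the only successful assignment (one a per item) is the last one a greedy matcher tries.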

Vatine
  • This answers my question. I also looked at this question http://stackoverflow.com/questions/12841970/how-can-i-recognize-an-evil-regex?lq=1 and found that `a{0,1000}a{0,1000}` and `a*b*[ac]*$` are evil regexes too, even though they don't have parentheses. – Vigneshwaran Mar 11 '16 at 10:55