7

Say that I want to let a user input whichever regular expression he wants, and a string to match, and I will check whether it matches using Python's re.compile. Is that secure? Is there a way for a malicious user to crash or get remote execution by passing in specially-crafted strings?

Maroun
  • 94,125
  • 30
  • 188
  • 241
Ram Rachum
  • 84,019
  • 84
  • 236
  • 374
  • 2
    As a side-note, depending your needs, maybe worth considering a simple _glob_ expression, rather than full-fledged regex. For most user it is easier to understand. And it will require much less power to process. But again, it will depend on your needs... – Sylvain Leroux Aug 31 '14 at 11:09

1 Answers1

9

I don't think that re.compile() is going to be a problem. Of course it can throw an exception with invalid regexes, but you can easily catch those. Python regexes don't allow code callouts (unlike Perl, for example), so I don't see a mechanism that an attacker could use to inject malicious code into a regex.

Actually running the regex (via re.search() etc.) can be a problem, though, because Python doesn't take any precautions against catastrophic backtracking which may cause the regex's runtime to skyrocket.

It may be a good idea to run the regex in a dedicated process and kill that if it doesn't finish within a second or so.

Tim Pietzcker
  • 328,213
  • 58
  • 503
  • 561