1

I need to make sure that regex, that is passed as user input will not accidentally be terminated and turn into arbitrary Perl code, but at the same time work for basic filtering purposes.

Important! This part of the code is run in user-jailed mode, meaning that potentially, it can only be self-exploited. Apart from this, UI is only exposed to particular user, and potentially run against limited number of files, thus potential DoS risks are very minimal.

In order to reach my goal, I created custom function that would first quotemeta all, and later un-escape needed only for regex to run characters.

Example:

# Allow short range of special chars to be left unescaped
# to let regex work, while at the same time prevent possible
# command injection or premature regex termination
my $mask = $in{'mask'};
sub quotemeta_dangerous
{    
    my ($string) = @_;
    $string = quotemeta($string);
    $string =~ s/\\\\/\\/g;
    $string =~ s/\\\+/+/g;
    $string =~ s/\\\*/*/g;
    $string =~ s/\\\$/\$/g;
    $string =~ s/\\\^/\^/g;
    $string =~ s/\\\(/\(/g;
    $string =~ s/\\\)/\)/g;
    $string =~ s/\\\{/\{/g;
    $string =~ s/\\\}/\}/g;
    $string =~ s/\\\[/\[/g;
    $string =~ s/\\\]/\]/g;
    $string =~ s/\\\?/?/g;
    $string =~ s/\\\././g;
    $string =~ s/\\\-/-/g;
    return $string;
}

my $sanitized_mask = quotemeta_dangerous($mask);
if ($filename =~ /$sanitized_mask/) {
    # matched
}

Questions:

  1. Whether my solution above will help me to achieve my goals safely, considering mentioned, important side notes. What are the potential risks that I don't see here?

  2. As side, but familiar question, when further running substitutions, does the replace part can be injected/exploited as well, and if it is, how to safely run substitutions in contents on matched files?

Example:

$file_contents =~ s/\Q$text_to_find\E/$text_to_replace_with/g;

Is $text_to_replace_with can be avoided here as security risk, when passed from user as it is?

Ilia Ross
  • 13,086
  • 11
  • 53
  • 88

1 Answers1

3
  1. I'm not sure what you mean by terminated. As for running arbitrary Perl code, you can't do that from user input (unless the program enables it explicitly with e.g. eval() or use re 'eval'). If you could just inject Perl code from user input, your function wouldn't protect against it: It lets through e.g. (?{system+qq(rm -rf ~)}) in runnable form (runnable, that is, if it were part of the code, not input data).

    What you can do with a user input regex is create a DoS: Make it consume a lot of CPU and hang the program. Your function does not protect against that. For example, try:

    'aaaaaaaaaa' =~ /(((\1?[a-z]*)*)*)*[b-z]/
    

    Or with an even longer chain of a's. (There are probably ways to shorten this code; I was just throwing random bits together to see whether they finished matching quickly.)

    If you want to guard against that, have a look at RE2:

    RE2 was designed and implemented with an explicit goal of being able to handle regular expressions from untrusted users without risk.

    You can use it in your code by doing

    {
        use re::engine::RE2 -strict => 1;
        # now regexes compiled in this scope will use the RE2 engine
        ...
    }
    
  2. That's easy to answer. There's no danger here; $text_to_replace_with is simply treated as a string.

    (If you want to create danger, you need either

    • /e and eval(), or
    • /ee, which is the same thing.

    Technically you don't need /e, but that still leaves a very visible eval() in your code. Again, you can't attack this as a user; you have to code it in.)

melpomene
  • 84,125
  • 8
  • 85
  • 148
  • Thank you for your nice explanations. :) Thinking in terms of File Manager and searching for string, let's say, `a` in `/` (for `root` user) could already cause significant load on the system without using regex. However, due to environment and application design, it's acceptable as inevitable part of the app. – Ilia Ross May 01 '19 at 09:55
  • Yes, I am interested in protection against Perl code injection via regex. For this purpose, I don't allow literal spaces and backticks, which I think is enough, correct? – Ilia Ross May 01 '19 at 09:56
  • @IliaRostovtsev Perl automatically protects against that. You don't have to do anything. Spaces and backticks aren't special in regexes anyway. – melpomene May 01 '19 at 09:57
  • I thought that I could just terminate the regex with substitution, like `//` and go `; \`shutdown -h now\``. Are you sure about that you can not do it? Is my `quotemeta_dangerous` good in your opinion? – Ilia Ross May 01 '19 at 10:00
  • @IliaRostovtsev I am sure. That's why I wrote "*you can't do that from user input*" in my answer. I don't think your function is good because 1) it doesn't increase the security of your code (code injection is not possible and DoS is possible, both with and without your function), so 2) at best it provides a false sense of security, and 3) I find it hard to think about. It's a chain of string substitutions on code, which always make me feel uneasy. I don't trust it if it's not obviously correct. – melpomene May 01 '19 at 10:04
  • @IliaRostovtsev What about [this comment](https://stackoverflow.com/questions/2159355/how-can-i-safely-use-regexes-from-user-input/2159523#comment2104292_2159523): "*Fortunately, `(?{ code })` causes a compile time error if the regex includes variable interpolation unless you say `use re 'eval'` (for exactly this reason).*" – melpomene May 01 '19 at 10:12
  • ..and one more. From my example, `$text_to_find` has to be `quotemeta` using `\Q$text_to_find\E` or it's okay to go without it for matching and then replacing text in string? :) – Ilia Ross May 01 '19 at 10:25
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/192664/discussion-between-melpomene-and-ilia-rostovtsev). – melpomene May 01 '19 at 10:26