23

I'm currently improving my knowledge about security holes in HTML, PHP, JavaScript etc. A few hours ago, I stumbled across the /e modifier in regular expressions and I still don't get how it works. I've taken a look at the documentation, but that didn't really help.

What I understood is that this modifier can be manipulated to give someone the opportunity to execute PHP code (for example, in preg_replace()). I've seen the following example describing a security hole, but it wasn't explained. Could someone please explain to me how to call phpinfo() in the following code?

$input = htmlentities("");
if (strpos($input, 'bla'))
{
   echo preg_replace("/" .$input ."/", $input ."<img src='".$input.".png'>", "bla");
}
Yami
  • 3
    http://php.net/manual/en/reference.pcre.pattern.modifiers.php – Fabio Jun 07 '13 at 14:18
  • 7
    The `e` modifier is deprecated as of PHP 5.5. It is recommended to use [`preg_replace_callback`](http://www.php.net/manual/en/function.preg-replace-callback.php) instead. – Felix Kling Jun 07 '13 at 14:19
  • 2
    I think the OP should have accepted [Sweetie Belle's](http://stackoverflow.com/a/16986549/1438393) answer instead. – Amal Murali Aug 23 '13 at 15:05

4 Answers

54

The e Regex Modifier in PHP with example vulnerability & alternatives

What e does, with an example...

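The e modifier is a deprecated regex modifier (deprecated as of PHP 5.5 and removed in PHP 7.0) that makes preg_replace() treat the replacement string as PHP code: once the backreferences have been substituted, the result is evaluated. This means that whatever you pass in can end up being executed as part of your program.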

For example, we can use something like this:

$input = "Bet you want a BMW.";
echo preg_replace("/([a-z]*)/e", "strtoupper('\\1')", $input);

This will output BET YOU WANT A BMW.

Without the e modifier, we get this very different output:

strtoupper('')Bstrtoupper('et')strtoupper('') strtoupper('you')strtoupper('') strtoupper('want')strtoupper('') strtoupper('a')strtoupper('') strtoupper('')Bstrtoupper('')Mstrtoupper('')Wstrtoupper('').strtoupper('')
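The empty strtoupper('') fragments appear because [a-z]* also matches the empty string at every position where no lowercase letter starts. A minimal sketch, using the callback API rather than the e modifier, that counts how often the replacement logic fires; the helper name countReplacements is made up for illustration:

```php
<?php
// Hypothetical helper: uppercase every match and report how many matches occurred.
function countReplacements(string $pattern, string $subject): array
{
    $count = 0;
    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            return strtoupper($m[0]);
        },
        $subject,
        -1,      // no limit on the number of replacements
        $count   // filled in by PHP with the number of replacements made
    );
    return [$result, $count];
}

$input = "Bet you want a BMW.";
[$withStar, $starCount] = countReplacements('/([a-z]*)/', $input);
[$withPlus, $plusCount] = countReplacements('/([a-z]+)/', $input);

echo "$withStar ($starCount matches)\n";
echo "$withPlus ($plusCount matches)\n";
```

Switching * to + leaves the output unchanged but skips the empty matches, so far fewer replacements have to be computed.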

Potential security issues with e...

The e modifier is deprecated for security reasons. Here's an example of an issue you can run into very easily with e:

$password = 'secret';
...
$input = $_GET['input'];
echo preg_replace('|^(.*)$|e', '"\1"', $input);

If I submit $password as my input, this call outputs secret. It is therefore very easy for me to read session variables and any other variable used on the back end, and even to take deeper control of your application (for example, by getting a call such as system('cat /etc/passwd') evaluated) through this simple piece of poorly written code.

As with the similarly deprecated mysql libraries, this doesn't mean you cannot write safe code using e, just that it is more difficult to do so.
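For contrast, here is a sketch of the same replacement done with a callback. The input is hard-coded as a stand-in for $_GET['input'] so the snippet runs on its own; the point is that the matched text is handled as plain data and never evaluated.

```php
<?php
$password = 'secret';

// Stand-in for $_GET['input']; this is the payload that leaked the password above.
$input = '$password';

$output = preg_replace_callback(
    '|^(.*)$|',
    function (array $matches): string {
        // $matches[1] is an ordinary string; nothing in it is executed.
        return $matches[1];
    },
    $input
);

echo $output, "\n"; // prints the literal text $password, not "secret"
```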

What you should use instead...

You should use preg_replace_callback in nearly all places where you would consider using the e modifier. The code is not as brief in this case, but don't let that fool you; in my test it ran about twice as fast:

$input = "Bet you want a BMW.";
echo preg_replace_callback(
    "/([a-z]*)/",
    function ($matches) {
        // $matches[0] is the full match, $matches[1] the first capture group
        return strtoupper($matches[1]);
    },
    $input
);

On performance, there's no reason to use e...

Unlike the mysql libraries (which were also deprecated for security reasons), e is not faster than its alternatives for most operations. For the example given, it is about twice as slow: preg_replace_callback took 0.14 sec for 50,000 operations, versus 0.32 sec for the e modifier.

Mech
Glitch Desire
  • Your example does not include nor does it require the `e` modifier. – Felix Kling Jun 07 '13 at 14:39
  • 2
    Fixed, not sure why I went off on that tangent. – Glitch Desire Jun 07 '13 at 14:41
  • 2
    +1, for providing a good clear example. This also demonstrates why it's so bad for performance -- 14 `eval()` calls in a single line of code. Ouch. `eval()` is slow enough as it is already. – Spudley Jun 07 '13 at 14:45
  • 2
    @Spudley: Performance comparison of [preg_replace_callback](https://eval.in/33014) (0.14 sec for 50,000 operations) vs [e modifier](https://eval.in/33015) (0.32 sec for 50,000 operations) – Glitch Desire Jun 07 '13 at 14:48
6

The e modifier is a PHP-specific modifier that triggers PHP to run the resulting string as PHP code. It is basically an eval() wrapped inside a regex engine.

eval() on its own is considered a security risk and a performance problem; wrapping it inside a regex amplifies both those issues significantly.

It is therefore considered bad practice, and is being formally deprecated as of the soon-to-be-released PHP v5.5.

For several versions now, PHP has provided an alternative in the form of preg_replace_callback(), which uses callback functions instead of eval(). This is the recommended way of doing this kind of thing.

With specific regard to the code you've quoted:

I don't see an e modifier in the sample code you've given in the question. It has a slash at each end as the regex delimiter; the e would have to be outside of that, and it isn't. Therefore I don't think the code you've quoted is likely to be directly vulnerable to having an e modifier injected into it.

However, if $input contains any / characters, the pattern will break entirely (i.e. PHP will throw an error due to an invalid regex). The same applies to anything else in $input that makes it an invalid regular expression.

Because of this, it is a bad idea to use an unvalidated user input string as part of a regex pattern - even if you are sure that it can't be hacked to use the e modifier, there's plenty of other mischief that could be achieved with it.
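One common mitigation, if user input really must appear inside a pattern, is to escape it with preg_quote(), passing the delimiter as the second argument. A minimal sketch with a made-up input value:

```php
<?php
// Made-up user input containing the delimiter and regex metacharacters.
$input = 'a/b.c*';

// Escape regex metacharacters and the "/" delimiter so the input matches literally.
$pattern = '/' . preg_quote($input, '/') . '/';

$result = preg_replace($pattern, '[match]', 'x a/b.c* y aXbYc z');

echo $pattern, "\n";
echo $result, "\n";
```

Note that this treats the input as literal text, so it only fits cases where users are not supposed to supply regex syntax themselves.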

Spudley
  • 1
    "I don't see an e modifier in the sample code you've given.." Yes, the e modifier should be injected with the $input variable. That's where I'm stuck: how to manipulate this input to inject an e modifier. – Yami Jun 07 '13 at 14:50
  • 1
    @Yami - yeah, it looks like I'm the only one who actually tried to address that part of your question :-) As I said, I honestly can't see a way to get the `e` modifier into it, since the modifier would have to be outside of the delimiters, and you've got the delimiters fixed in place. If you hadn't hard-coded the delimiters and were expecting `$input` to include them, then yes, that would be easily exploitable. As I say, even with the delimiters specified, it's still easy for a hacker to cause errors here, but I don't think he can get the `e` modifier into play. – Spudley Jun 07 '13 at 15:32
3

As explained in the manual, the /e modifier makes preg_replace() evaluate the replacement string as PHP code, after the backreferences have been substituted. The example given in the manual is:

$html = preg_replace(
    '(<h([1-6])>(.*?)</h\1>)e',
    '"<h$1>" . strtoupper("$2") . "</h$1>"',
    $html
);

This matches any "<hX>XXXXX</hX>" text (i.e. headline HTML tags), replaces this text with "<hX>" . strtoupper("XXXXX") . "</hX>", then executes "<hX>" . strtoupper("XXXXX") . "</hX>" as PHP code, then puts the result back into the string.

If you run this on arbitrary user input, any user has a chance to slip in something that will be evaluated as PHP code. Done correctly, this lets the user execute any code they want. In the above example, imagine if in the second step the text were "<hX>" . strtoupper("" . shell_exec('rm -rf /') . "") . "</hX>".
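The same headline transformation can be written without any code evaluation. This is a sketch using preg_replace_callback(); the sample $html value is invented and includes text shaped like the injection attempt described above.

```php
<?php
// Invented sample input, including text that would be dangerous if it were evaluated.
$html = '<h1>hello</h1><p>text</p><h2>" . phpinfo() . "</h2>';

$html = preg_replace_callback(
    '(<h([1-6])>(.*?)</h\1>)',
    function (array $m): string {
        // $m[1] is the heading level, $m[2] the heading text; both are plain strings.
        return "<h{$m[1]}>" . strtoupper($m[2]) . "</h{$m[1]}>";
    },
    $html
);

echo $html, "\n";
```

The heading text is only uppercased; the phpinfo() text stays inert because nothing passes it to eval().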

deceze
2

It's evil, that's all you need to know :p

More specifically, it generates the replacement string as normal, but then runs it through eval.

You should use preg_replace_callback instead.

Niet the Dark Absol