2

Let me preface this by saying that I'm well aware that running user supplied code in a server environment is risky. Humour me - my question is specific to string evaluation and the subset of the language that can be executed in that context.

So I'm building a template generation system right now, and I want it to be fast. Super, super, super fast. This thing is going to get thrashed for mass email mailouts and that kind of thing. The approach I'm using is for users to supply user-entered template tags, which get turned into PHP variable substitutions via regexes before storing. Assuming my regexes are bulletproof, do you feel like the security of this process is acceptable?

  • template is input by user, with tags like [[contact.name]] and similar.
  • upon saving, regex transforms these into PHP variables, so the above wildcard becomes {$contact['name']} within the template string.
  • We also check for the presence of anything that could be transformed into an accessible variable from superglobal scope, so [[_SERVER]], [[GLOBALS]] etc as well as [[this are all disallowed and logged as hack attempts.
  • Other characters which have special meaning within a double quoted string ($, " and \) are escaped as well.
  • the generation process goes like this:
    • generation is a class method that gets run. The only variable passed in is $contact, which is an array.
    • the template string is read out into another local variable (in this case, $__templateString). Users could theoretically access this variable in their templates, but it doesn't really matter if they do - not a security risk, just dumb.
    • The code to generate the template is then simply eval('return "' . $__templateString . '";');

Any holes I'm missing here? I am pretty sure the only potential risks are matters of scope access, and I think I've covered all my bases there.

pospi
  • 3,540
  • 3
  • 27
  • 26
  • i don't know, but places like codepad do it. I have often wondered how. –  Mar 22 '12 at 02:01
  • @Dagon https://github.com/Viper-7/Deployable-PHP-Codepad chrooting and locked down system permissions for the user that the script is executed as. It's much more elaborate then `eval`. – deceze Mar 22 '12 at 06:24
  • If you want something fast and secure then you should use [handelbars](http://handlebarsjs.com/) and compile your PHP with [hiphop](https://github.com/facebook/hiphop-php/), we really don't need another smarty. – rook Mar 22 '12 at 06:51

2 Answers2

1

So what if I enter this template:

" . mysql_query('DROP TABLE users') . "

It's nice that you are guarding against possible access to variables you don't want people to access, but eval evaluates all code, not just variables. And try to find a regex to filter that out...

deceze
  • 510,633
  • 85
  • 743
  • 889
  • well that would be fine, cos I'm escaping double quotes out of the string before placing it back inbetween them. So all the above would do is output exactly as written into the template. – pospi Mar 22 '12 at 06:06
  • Ok, `{$(mysql_query()}}` then. [That's valid in strings.](http://www.php.net/manual/en/language.types.string.php#language.types.string.parsing) I guess the main point is that trying to lock down `eval` is an uphill battle. You'll have to be aware of *all* possible ways in which code can be executed (and there are a lot of them) and you have to try to lock each one down with a regex which will have exceptions to exceptions. You're only interested in a tiny aspect of `eval`, but need to fight every last one of the undesired side effects. That's not a sane way to develop. – deceze Mar 22 '12 at 06:14
  • ooh, nice catch. I wasn't aware that syntax was possible. But really, i know it's not a particularly *sane* way to go, but the page on string parsing should theoretically be the only information needed to get it secure (and I should have read all the way to the bottom before coming here, bad me). I know people always avoid this sort of thing on principle, but I feel like if you're a language veteran and confident enough to attempt it then why not? You just gotta be prepared to eat your words later when it turns out you shouldn't have been so cocky. – pospi Mar 22 '12 at 09:59
  • As a language veteran (;-)) my advice would be to avoid such uphill battle. It just means you have to write a lot of code about which you have to be very very confident just to lock `eval` down, when you could go the other way and write a little code which either works or it doesn't. And I bet you won't even see a major difference in performance. – deceze Mar 22 '12 at 11:15
  • fair call.. i'll test it against using simple str_replace and see how it goes. But i think I'll probably end up leaving it for now and head to C-land later on and implement it as a module before release. If you'd like to update your original answer with that function syntax and a link to the string parsing page I'll mark it as accepted/upvote (: – pospi Mar 25 '12 at 04:52
1

Anecdotal drivel: When I was a security contact for a Linux distribution, the PHP developers asked us to stop calling interpreter crashes on malformed input "security vulnerabilities". They were adamant that whoever supplied the scripts was 100% trusted, and I would fully expect eval() to be handled the same way.

You can try to patch over problems but I certainly wouldn't open up input to unlimited users. The chances of you overlooking an interpreter-crash bug is simply too high.

Further, consider deploying with a mandatory access control system such as AppArmor, SELinux, TOMOYO, or SMACK. This way you can restrict the potential damage from a hacked input to the minimal amount of resources necessary to do the work in the first place. (I've worked on AppArmor since 2000, so it would be my preferred choice for many environments. But consider the others, they are all high-quality products designed to solve different problems and one or another might be a better fit for your environment.)

sarnold
  • 102,305
  • 22
  • 181
  • 238