23

In the CodeIgniter PHP framework, there is a function that automatically runs on each request that, among other things, filters the GET/POST/COOKIE array keys, and kills the application if it encounters characters it deems unsafe.

To prevent malicious users from trying to exploit keys we make sure that keys are only named with alpha-numeric text and a few other items.

Something like:

// foreach GET/POST/COOKIE keys as $str...
if ( ! preg_match("/^[a-z0-9:_\/-]+$/i", $str))
{
    exit('Disallowed Key Characters.');
}

For example, this will trigger if you accidentally post something like <input name="TE$T"> or have a query string like ?name|first=1.

I can see this being a good way to enforce common sense key names, or catch mistakes while developing an application, but I don't understand: How can a malicious user possibly "exploit keys" in $_POST data for instance? Especially since (I would assume) input values are just as exploitable, what is this actually preventing?

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Wesley Murch
  • 101,186
  • 37
  • 194
  • 228
  • Perhaps it is not a protection against a direct attack, but something that would make it much harder to exploit any other security vulnerabilities, like for example XSS or SQL injections. Just a thought... – d_inevitable May 11 '12 at 22:50
  • 12 upvotes in 17 minutes. You seem to have struck a chord :) – AlienWebguy May 11 '12 at 23:02
  • 2
    It's kinda weird that they let `/` pass. – Alix Axel May 11 '12 at 23:13
  • I just took a look at the code, after that snippet you have `if (UTF8_ENABLED === TRUE) { $str = $this->uni->clean_string($str); }` which makes a lot of sense I must say. – Alix Axel May 11 '12 at 23:18
  • 1
    Another weird thing is that they don't let `[]` pass... This is just dumb. – Alix Axel May 11 '12 at 23:28
  • @AlienWebguy i'm guessing they're just *making it rain* with their rep because its a friday...and they're at a party as we speak... all standing around with their phones in hand. lol – Kristian May 11 '12 at 23:31
  • @AlixAxel Why should they allow []? PHP automatically parses `` as an array, so allowing [] would only let people write `` which shouldn't be allowed anyway. – Shinhan Jun 27 '12 at 15:39
  • @Shinhan: Ah - yes, good point. – Alix Axel Jun 28 '12 at 00:52
  • 1
    fyi - the preg pattern they use doesn't properly validate for "alphanumeric". It will allow a trailing newline character lol. This is a VERY common mistake found in php code. [proof](http://codepad.viper-7.com/nkXMzT). They need to use the [D modifier](http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php) – goat Aug 08 '12 at 19:27

5 Answers5

18

You see this junk often in noob code:

$_SESSION = $_POST;

A seldom known secret is that $_SESSION is "special" and can't handle the pipe character, |, as a top level array key. php fails to save the session variables during shutdown/session_write_close, instead wiping the entire array.

session_start();

if (!isset($_SESSION['cnt'])) {
    $_SESSION['cnt'] = 0;
}

$_SESSION['cnt']++;

/* prior to php 5.4, it will never increment, because it never successfuly saves
unless you comment line below
*/
$_SESSION['a|b'] = 1;

print_r($_SESSION);

I'm not saying that's why CodeIgniter scrubs the keys, but it's one of many "ways input keys are exploitable".

Maybe a more likely reason though is because people certainly do stuff like this:

if ($formPostDidntValidate) {
    echo "Please fix the form submission errors and try again\n";
    foreach ($_POST as $key => $value) {
        echo "<p>$key<br>
              <input name='$key' value='$value'></p>";
    }
}

Outputting request variables without doing proper context-specific escaping, such as escaping for html contexts, html attribute contexts, or even sql contexts:

$sql = "select * from myTable where 1=1";
foreach ($_POST as $key => $value) {
    $sql .= " and $key = '$value'";
}

I've seen plenty of people escape the value, but not the key when building sql and/or html.

If you don't escape everything, you are easily hacked. Scrubbing values isn't as good as escaping, but it's much better than nothing, and considering how many developers aren't yet sophisticated enough to understand when and how to escape, I can see the attraction to adding automatic request variable scrubbing.

goat
  • 31,486
  • 7
  • 73
  • 96
  • BTW, where is that pipe behavior documented? – Alix Axel May 11 '12 at 23:42
  • 1
    Interesting, I didn't know that. I also learned today that PHP magically turns a bunch of characters in keys (including space characters) to underscores. So strange. In this case though, they don't seem to run the filter on session data (actually CI has their own session implementation and doesn't use PHP native sessions). I don't consider $_SESSION to be input, but if you do something like assign $_POST to it I guess you deserve your fate. – Wesley Murch May 11 '12 at 23:43
  • the reason for this, and how php modifies certain characters in request keys dates back to when register_globals was on by default. None of it seemed to be terribly consistent, but every language has dirty baggage it carries for backwards compatibility. – goat May 11 '12 at 23:45
  • @AlixAxel, I don't think it's documented, but there were bug reports on it many years back. – goat May 11 '12 at 23:46
  • 2
    For reference/curiosity, this comment on php.net is where I learned about the character replacement thing: http://www.php.net/manual/en/language.variables.external.php#81080 According to the poster: *"The full list of field-name characters that PHP converts to _ (underscore) is the following (not just dot): chr(32) ( ) (space) chr(46) (.) (dot) chr(91) ([) (open square bracket) chr(128) - chr(159) (various). PHP irreversibly modifies field names containing these characters in an attempt to maintain compatibility with the deprecated register_globals feature."* – Wesley Murch May 12 '12 at 00:02
  • @connec really? I tried on 5.4RC3 before posting this to make sure it still worked. – goat May 15 '12 at 21:56
  • @chris That's strange, definitely not happening on my RC1 (that is, the second `print_r` produces `Array ( [var] => 1 [a|b] => 1 )`) - must be some other cause? I just assumed they changed it with the removal of `register_globals`. – connec May 16 '12 at 06:41
9

Your question, in itself, brings up a good point: it's unclear what exactly you're being protected against. But there are some popular items it could be addressing:

  • magic quotes
  • things that could eventually lead to SQL injection
  • strings that could be executed by way of shell commands
  • things that could conflict with your URL
  • things that could conflict with HTML
  • things that resemble a directory traversal attack
  • cross site scripting (XSS) attacks

But other than those, I really can't think of why you'd always why you'd want to generally protect via preg_match("/^[a-z0-9:_\/-]+$/i", $str).

I've got the feeling that they're overprotecting simply because CodeIgniter is so widely used that they need to protect against things they themselves haven't thought of yet for the sake of their users who may be even less-aware of such attacks than CodeIgniter's developers.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Kristian
  • 21,204
  • 19
  • 101
  • 176
  • The OP is asking about the **keys** not the values. – Alix Axel May 11 '12 at 23:08
  • @AlixAxel the keys can still be used to generate SQL statements, shell commands, urls, html etc. – d_inevitable May 11 '12 at 23:10
  • Mind giving a simple example? – Alix Axel May 11 '12 at 23:11
  • If this is really all it's preventing, it's seems a bit overkill to *exit* the application. It's not even a config setting, it's hard-coded into the Codeigniter core. I've had a few (rare) instances where having pipe characters in keys were useful. There's always a way around, but I actually thought maybe CI was trying to prevent something specific, my hunch was cookie-related. – Wesley Murch May 11 '12 at 23:14
  • Ya, my point was that it really, really could be anything AND everything. – Kristian May 11 '12 at 23:15
  • @AlixAxel well how about this very stupid code: `foreach( $_GET as $key => value) mysql_query("INSERT INTO table SET $key = '$value'");` – d_inevitable May 11 '12 at 23:15
  • @d_inevitable you're right. AlixAxel, anything that can be output as a string can be read into something else that can be applied toward something. so, simply accessing an index in the POST array should be enough to demonstrate that fact. – Kristian May 11 '12 at 23:16
  • 1
    @d_inevitable: True, but that's not in any way limited to input keys, the values in that case are just as dangerous. Signs are piling up that this filtering isn't anything special, or designed to prevent anything specific. – Wesley Murch May 11 '12 at 23:19
  • Yeah I know, but you get the point, right? I could come up with someone realistic scenarios, that would be hard to fit in a comment. Actually just put a `mysql_real_escape_string` around the value. I've actually come across code like this a few times before where the coder remembered to escape the value, but not the field name. – d_inevitable May 11 '12 at 23:20
  • @d_inevitable: If your code looks like that, you have bigger problems that the array keys. If you use `$value` escape `$value`, if you use `$key`, escape `$key`. – Alix Axel May 11 '12 at 23:24
  • @AlixAxel It doesn't matter what my code looks like. This is about CI and what they were trying to do and Kristan's answer is very valid in that context. – d_inevitable May 11 '12 at 23:25
  • @Kristian: I disagree, that example is just forced. If you only escape `$value` and trust that the regex will do its thing with `$key` there're still plenty of ways to mess up with application / DB. – Alix Axel May 11 '12 at 23:25
  • 1
    @AlixAxel i agree with you, but we're trying to address why we think the CI devs did it -- it was probably to cover their arses...probably... maybe – Kristian May 11 '12 at 23:27
  • @AlixAxel I think the point that you are missing is that this is not some sort of utility for the user of CI. Instead it is a security precaution such that if the user of CI messes up, there is still something in place to make it harder for an attacker to do a SQL injection etc. – d_inevitable May 11 '12 at 23:28
  • @d_inevitable: ...or use multi-part forms / query strings for instance. There's just one way to do it, and this is not the right one. I agree with some of the comments here, but I don't agree with the answer. It doesn't address any of those attack vectors, at least correctly anyway. – Alix Axel May 11 '12 at 23:32
  • 2
    I guess this could be just as well `unset`ing the variable or something, and they kill the program because they *assume* it's malicious. This answer works for me, it seems like this function isn't designed to prevent any specific type of attack, or one related to input keys specifically. @AlixAxel: If you don't agree, what is your counterpoint? – Wesley Murch May 11 '12 at 23:33
  • @WesleyMurch: This function seems stupid to me, if you notice they even try to clean UTF-8 chars **after exiting** if any other character besides the ASCII ones `a-z0-9:_\/-` exist... How logical is that? – Alix Axel May 11 '12 at 23:35
  • @WesleyMurch: My counterpoint is the same as yours: "it seems like this function isn't designed to prevent any specific type of attack" while the answer states that "some popular items (attacks) it could be addressing". – Alix Axel May 11 '12 at 23:37
  • @AlixAxel if that did completely eliminate these kind of attacks, then wow, id wonder why we got through all the hassle of escaping string etc. This is just a way of reducing the probability of a security vulnerability at the expense of flexibility. I would also opt against this feature, but the question is not whether we like it, but what it may be good for. And it cannot be argued that in very certain badly implemented applications it could not prevent a particular vulnerability. – d_inevitable May 11 '12 at 23:37
  • 1
    @AlixAxel: I had to accept one of the answers that said "it's pointless" and I thought this was the best one, regardless of the irrelevant bullet points. I agree, it's stupid - I thought there was a specific reason for it. – Wesley Murch May 11 '12 at 23:38
  • @d_inevitable: Ok, but the fact that it exists, gives poor programmers (the same ones that this function is for) the false sense of security that all the keys are white-listed, while at the same time restricting the correct usage of them. – Alix Axel May 11 '12 at 23:40
  • @AlixAxel lets don't tell the poor programmers about this feature, shall we? :D – d_inevitable May 11 '12 at 23:41
  • @AlixAxel: I think you have it backwards, they do the UTF8 conversion *after* the string has passed inspection - no point in doing it any earlier is there?. – Wesley Murch May 12 '12 at 02:14
  • @WesleyMurch: But how can the string have bad UTF-8 chars if the only characters allowed is that limited ASCII set? – Alix Axel May 12 '12 at 03:21
  • @AlixAxel: LOL you are totally right, I wasn't even thinking of that - good point. I can't make any sense of it, I guess it's just more useless paranoia. – Wesley Murch May 12 '12 at 03:44
4

Say in a console I change your form's field name from name="email" to name="email\"); DROP TABLE users;

It's not likely to be a successful XSS attack but I can see something like that doing damage to poorly coded PHP. CI is probably just trying to cover all their bases so they can have a claim to be as XSS-protected as possible.

AlienWebguy
  • 76,997
  • 17
  • 122
  • 145
  • So like everyone seems to be saying, you think it's just a security blanket for people who, I guess, implicitly trust input keys and do dangerous things with them without escaping or filtering? – Wesley Murch May 11 '12 at 23:09
  • Ya basically - consider what a 2012 Douglas Crockford fanboy could do with a 2003 PHP4 site. CI probably considered that and said, "just in case..." – AlienWebguy May 11 '12 at 23:18
3

That kind of check is a waste of time. You only access keys that you expect anyway - and if for some reason you iterate over all elements you will most likely escape them properly for whatever you are going to do with them.

However, looking at the skills of an average newbie PHP programmer (even though members of this species most likely don't use a framework at all) it makes some sense, you can never know what nasty things he's going to do with the code which might even kill his cat without such a check.

The same thing applies to rejecting posts containing e.g. "delete from" as a anti-SQL-injection measure. It can easily cause false positives and if you e.g. use parametrized queries you are safe anyway.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
ThiefMaster
  • 310,957
  • 84
  • 592
  • 636
  • "You only access keys that you expect anyway" -- the "you" in that sentence doesn't necessarily apply to those who are trying to screw with your code maliciously. The point is that the collective knowledge of a bunch of *hackers* is larger than that of a single developer. better to over-protect than to under-protect, don't you think? – Kristian May 11 '12 at 23:09
  • +1, I believe this is the right answer. @Kristian: Actually I agree with ThiefMaster, this just adds a false sense of security which might do more harm than good IMO. As a rule, **all data must be sanitized**, might that be a key or a value. Also, I'm pretty sure XSS is still possible with that "security" mechanism. – Alix Axel May 11 '12 at 23:22
  • @Kristian: Personally I think that over-protection can lead to the developer's ignorance and implicit trust of data that is *probably* "safe" according to the framework. No amount of safety nets can stop a developer from doing something stupid, I guess that's all this is. – Wesley Murch May 11 '12 at 23:23
  • "That kind of check is a waste of time." + "it makes some sense, you can never know what nasty things he's going to do with the code which might even kill his cat without such a check." = flip flop. – Maverick May 11 '12 at 23:25
  • @WesleyMurch you're absolutely correct about your point regarding the trust / ignorance of the developer. To that I will say that anyone who uses a framework for something in a language they have just begun to learn *before* building one themselves is technically at risk of misunderstanding where the language ends and the framework begins. Its a difficult balance to strike without reinventing the wheel. – Kristian May 11 '12 at 23:26
2

Perhaps it is trying to prevent this attack.

The attack works by using knowledge of how PHP builds its hashing structures to make keys in $_POST which take an arbitrarily long time to process.

I suspect it is probably just trying to prevent the more mundane SQL injection attacks though.

Nick Craig-Wood
  • 52,955
  • 12
  • 126
  • 132
  • 1
    Ah I've heard of that before sort of, but what would the value of the key have to contain to actually invoke that bug? Couldn't a collision be caused with *any* key value? (sorry, the details are a bit over my head) In other words, would that method of input key filtering actually prevent that specific attack? – Wesley Murch May 11 '12 at 23:04
  • I'm not 100% sure as I haven't studies the PHP hashing algorithm in depth. It would seem likely to me that restricting the character set would mitigate the exploit though. The exploit relies on making strings which all hash to the same value and you'll have a lot less choice making those strings from a more limited character set. – Nick Craig-Wood May 12 '12 at 08:11