11

A co-worker today made a bet with me that he knows of a way to supply a specially formatted string that could pass the following regex check and still supply a file name with extension .php or .jsp or .asp:

if (preg_match('/\.(jpeg|jpg|gif|png|bmp|jpe)$/i', $var) && preg_match('/\.(asp|jsp|php)$/i', $var) == false) 
{
    echo "No way you have extension .php or .jsp or .asp after this check.";
}

As hard as I tried myself and searched the net, I was unable to find a flaw that would make such thing possible. Could I be overlooking something? Given that "null byte" vulnerability is dealt with, what else might be the issue here?

Note: In no way am I implying that this code is a full-proof method of checking the file extension, there might be a flaw in preg_match() function or the file contents could be of different format, I just ask the question in terms of regex syntax itself.

EDIT - actual code:

if (isset($_FILES["image"]) && $_FILES["image"]["name"] && preg_match('/\.(jpeg|jpg|gif|png|bmp|jpe)$/i', $_FILES["image"]["name"]) && preg_match('/\.(asp|jsp|php)$/i', $_FILES["image"]["name"]) == false) {
    $time = time();
    $imgname = $time . "_" . $_FILES["image"]["name"];
    $dest = "../uploads/images/";

    if (file_exists($dest) == false) {
        mkdir($dest);
    }

    copy($_FILES['image']['tmp_name'], $dest . $imgname);
    
}else{
    echo "Invalid image file";
}
    

PHP version: 5.3.29

EDIT: epilogue

Turned out the 'vulnerability' only presents itself on Windows. Nevertheless, it did exactly what my coworker told me it would - passed the regex check and saved the file with executable extension. Following was tested on WampServer 2.2 with PHP 5.3.13:

Passing the following string to the regex check above test.php:.jpg (note the ":" colon symbol at the end of desired extension) will validate it and the function copy() seems to omit everything after the colon symbol including the symbol itself. Again, this is only true for windows. On linux the file will be written exactly with the same name as passed to the function.

mickmackusa
  • 43,625
  • 12
  • 83
  • 136
astralmaster
  • 2,344
  • 11
  • 50
  • 84
  • For better illustration, this is not an actual production code – astralmaster Nov 17 '15 at 16:40
  • Why ask us? If your coworker really knows of a way, let him speak up :P – Jonnix Nov 17 '15 at 16:40
  • It's a bet :) I'm not willing to give up 20 bucks.. yet. – astralmaster Nov 17 '15 at 16:41
  • `preg_match` returns `1` when it matches, `0` if it doesn't, and `false` if there is an error – Sean Bright Nov 17 '15 at 16:41
  • Yep, conversion happening there – astralmaster Nov 17 '15 at 16:42
  • @SeanBright: indeed but it's not a problem since he uses a non strict comparison `==`. – Casimir et Hippolyte Nov 17 '15 at 16:43
  • @CasimiretHippolyte presisely – astralmaster Nov 17 '15 at 16:43
  • Relying on PHP's implicit conversions is always a problem – Sean Bright Nov 17 '15 at 16:43
  • 1
    @SeanBright True. Anything about the issue itself? – astralmaster Nov 17 '15 at 16:44
  • Yes, I voted to close as off-topic. – Sean Bright Nov 17 '15 at 16:44
  • You just check, if the fileextension is on the end (...$) of the string $var, maybe the "special" in his string is, that the keywords are not at the end. – Paladin Nov 17 '15 at 16:44
  • @Paladin76 No, I asked him that. It would violate the terms of the bet. :) – astralmaster Nov 17 '15 at 16:45
  • Ah, would this fail if the filename contained a newline? – Jonnix Nov 17 '15 at 16:45
  • @JonStirling Tried that, still wouldn't pass. – astralmaster Nov 17 '15 at 16:45
  • @astralmaster Bah. I wanna know! – Jonnix Nov 17 '15 at 16:46
  • @josnidin's answer to this question might be useful - I look forward to seeing the outcome. His answer - it has a link to further discussion. http://stackoverflow.com/questions/14867475/php-regex-find-vulnerability-within-email-validation-pattern?rq=1 – Steve Nov 17 '15 at 16:51
  • @Steve Very interesting, indeed. Reading it now. – astralmaster Nov 17 '15 at 16:55
  • Sorry it is slightly off topic as it relates to email addresses but it was a good read! – Steve Nov 17 '15 at 16:57
  • Could he confuse it by using double byte characters? – Steve Nov 17 '15 at 16:59
  • No, also a violation of "rules". – astralmaster Nov 17 '15 at 16:59
  • Also @jon's answer here http://stackoverflow.com/questions/17725993/php-regex-delimiter?rq=1 but I guess they don't count as they are like double bytes – Steve Nov 17 '15 at 17:10
  • The first `preg` only allows `\.(jpeg|jpg|gif|png|bmp|jpe)$` everything else is `0`/`false`. So, for what the second `preg` ? It looks comletely useless. – bobble bubble Nov 17 '15 at 17:25
  • @bobblebubble Precaution, in case there really is a vulnerability similar to "null byte". – astralmaster Nov 17 '15 at 17:27
  • @AlanMoore Not sure about any specific OS, but space is treated like any other ASCII character in terms of preg_match(). – astralmaster Nov 17 '15 at 17:37
  • Perhaps you could add a preg_replace to avoid filename.p#hp – Steve Nov 18 '15 at 09:59
  • @Steve "#" is treated no different from the rest of ASCII characters. filename.p#hp will not pass this check. – astralmaster Nov 18 '15 at 12:39
  • What if the filename were encoded using url encoding and/or eval() such that it would pass the regex, and afterwards would be eval'ed and/or reencoded to output the original filename? – Bardicer Nov 19 '15 at 16:51
  • It is vulnerable. I have discovered a truly marvellous proof of this, which this margin is too narrow to contain. – Alfwed Nov 19 '15 at 16:51
  • @Nick eval() is not used in the code. URL encoding does not pass this regex. – astralmaster Nov 19 '15 at 16:55
  • `$var = "script.php\n#image.jpg"` ? – Alex Blex Nov 19 '15 at 17:30
  • @AlexBlex This has a .jpg ending and therefore will pass the regex. What you are implying (newline character) has been fixed in php long time ago. This would not work in any of the following: copy(), file_put_contents(), include(), etc. – astralmaster Nov 19 '15 at 17:43
  • @AlexBlex That would work only if `m` flag is on. – revo Nov 19 '15 at 17:46
  • `PCRE_MULTILINE` is off by default in php – astralmaster Nov 19 '15 at 17:48
  • @astralmaster it still works with `system` & Co, if you brave enough to pass unescaped parameters =) – Alex Blex Nov 19 '15 at 18:06
  • @AlexBlex using `system` is dangerous for multiple reasons. In my case it's not applicable. – astralmaster Nov 19 '15 at 18:11
  • Am I missing something? $var = "test.php?bypass=test.jpg"; – hendr1x Nov 19 '15 at 18:56
  • @hendr1x That will simply produce error complaining about '?' character if used in `copy()` or `include()` or `file_put_contents()` – astralmaster Nov 19 '15 at 19:07
  • You know, no use chasing ghost's. Give the string and code to a third party. Verify it will work. There is no reason to use 2 regex, so that has nothing to do with it. If the first is _true_ the next can't be true unless `$_FILES["image"]["name"]` changes on the stack. Also, always use `\A` and `\z` anchors in production code. –  Nov 19 '15 at 20:40
  • @sln Having used this regex pattern in hundreds of cases and dozens of projects and then being told it is vulnerable and can be bypassed.. you can understand my concern. – astralmaster Nov 19 '15 at 20:49
  • @astralmaster - You can trust after 30 years of regex, I'm not too much .. –  Nov 19 '15 at 20:54
  • @sln Why did you say `always use \A and \z anchors in production code`? Anything special? – revo Nov 19 '15 at 21:05
  • @revo - Mostly it un-ambiguously means absolute start/end of string even in multi-line mode. –  Nov 19 '15 at 21:10
  • @astralmaster I have some insights but first - which PHP version are you using? – Shlomi Hassid Nov 20 '15 at 03:40
  • @Shlomi Hassid 5.3.29 – astralmaster Nov 20 '15 at 05:25
  • @CasimiretHippolyte Do we have `R` modifier in PCRE? What is that? – revo Nov 20 '15 at 07:29
  • 1
    @revo It's `D (PCRE_DOLLAR_ENDONLY)` http://php.net/manual/en/reference.pcre.pattern.modifiers.php – Mariano Nov 20 '15 at 08:29
  • @revo: oops indeed, It's D, Mariano is right. – Casimir et Hippolyte Nov 20 '15 at 09:18
  • @JonStirling: I don't know why you ask this question, but in PCRE by default (i.e. without the m modifier) `$` matches the end of the string or the position before the newline when the newline is the last character (at the end of the string). In other words, the pattern `a$` matches the string `"a"` or the string `"a\n"` but not the string `"a\nb"`. To strictly match the end of the string with $, you must add the `D` modifier. – Casimir et Hippolyte Nov 20 '15 at 09:19
  • So, did your co-worker come up with the same as Shlomi Hassid or something different? Is the bet won or lost? – Steve Dec 05 '15 at 16:11
  • @Steve Turned out the 'vulnerability' only presents itself on Windows. Nevertheless, it did exactly what my coworker told me it would - passed the regex check and saved the file with executable extension. I added the actual method to my post above. – astralmaster Dec 08 '15 at 06:50
  • @astralmaster - twenty dollars well spent then! Thought it was something like that with the # idea but that was as far as I got. Nothing is ever quite safe in this game is it? – Steve Dec 08 '15 at 10:49
  • @astralmaster - so putting the filename through a process to remove `#` and `:` before the regex wouldn't help? ` str_replace(array('#', ':', "\0"), ' ', $imgname ); ` – Steve Dec 11 '15 at 14:45
  • @Steve It would in this particular case, but there could also be other ASCII characters in question. To be on a safe side I recommend re-processing the file (image) with GD and saving the processed file with randomly generated name. – astralmaster Dec 29 '15 at 12:26
  • @astralmaster - belt and bracers is always good! Great, lively discussion! Thanks. – Steve Dec 29 '15 at 12:36
  • @anubhava I think you misread the script. The two `preg_match()` calls are different. The first is checking for a match, the second it checking for no match. Condensing the two patterns into a single alternation is not the desired logic. – mickmackusa Apr 04 '23 at 00:21

4 Answers4

11

There is not a single step or a full direct way to exploit your code but here are some thoughts.

You are passing it to copy() in this example but you have mentioned that you have been using this method to validate file ext awhile now so I assume you had other cases that may have used this procedure with other functions too on different PHP versions.

Consider this as a test procedure (Exploiting include, require):

$name = "test.php#.txt";
if (preg_match('/\.(xml|csv|txt)$/i', $name) && preg_match('/\.(asp|jsp|php)$/i', $name) == false) {
    echo "in!!!!";
    include $name;
} else {
    echo "Invalid data file";
}

This will end up by printing "in!!!!" and executing 'test.php' even if it is uploaded it will include it from the tmp folder - of course that in this case you are already owned by the attacker but let's consider this options too. It's not a common scenario for an uploading procedure but it's a concept that can be exploited by combining several methods:

Let's move on - If you execute:

//$_FILES['image']['name'] === "test.php#.jpg";
$name = $_FILES['image']['name'];
if (preg_match('/\.(jpeg|jpg|gif|png|bmp|jpe)$/i', $name) && preg_match('/\.(asp|jsp|php)$/i', $name) == false) {
    echo "in!!!!";
    copy($_FILES['image']['tmp_name'], "../uploads/".$name);
} else {
    echo "Invalid image file";
}

Again perfectly fine. The file is copied into "uploads" folder - you can't access it directly (since the web server will strip down the right side of the #) but you injected the file and the attacker might find a way or another weak point to call it later.

An example for such execution scenario is common among sharing and hosting sites where the files are served by a PHP script which (in some unsafe cases) may load the file by including it with the wrong type of functions such as require, include, file_get_contents that are all vulnerable and can execute the file.

NULL byte The null byte attacks were a big weakness in php < 5.3 but was reintroduced by a regression in versions 5.4+ in some functions including all the file related functions and many more in extensions. It was patched several times but it's still out there and alot of older versions are still in use. In case you are handling with an older php version you are definitely Exposed:

//$_FILES['image']['name'] === "test.php\0.jpg";
$name = $_FILES['image']['name'];
if (preg_match('/\.(jpeg|jpg|gif|png|bmp|jpe)$/i', $name) && preg_match('/\.(asp|jsp|php)$/i', $name) == false) {
    echo "in!!!!";
    copy($_FILES['image']['tmp_name'], "../uploads/".$name);
} else {
    echo "Invalid image file";
}

Will print "in!!!!" and copy your file named "test.php".

The way php fixed that is by checking the string length before and after passing it to more deeper C procedure that creates the actual char array and by that if the string is truncated by the null byte (which indicates end of string in C) the length will not match. read more

Strangely enough even in patched and modern PHP releases it's still out there:

$input = "foo.php\0.gif";
include ($input); // Will load foo.php :)

My Conclusion: Your method of validating file extensions can be improved significantly - Your code allows a PHP file called test.php#.jpg to pass through while it shouldn't. Successful attacks are mostly executed by combining several vulnerabilities even minor ones - you should consider any unexpected outcome and behavior as one.

Note: there are many more concerns about file names and pictures cause they are many time included in pages later on and if they are not filtered correctly and included safely you expose yourself to many more XSS stuff but that's out of topic.

Shlomi Hassid
  • 6,500
  • 3
  • 27
  • 48
  • I got you already. `copy` is fine. most people do like this. But still was thinking when can anyone `include` image file in PHP code? (Mainly that could execute the PHP code) – Somnath Muluk Nov 20 '15 at 08:56
  • References for `but was reintroduced by a regression in versions 5.4+ in some functions` ? – astralmaster Nov 20 '15 at 09:08
  • if you restrict it to images than mostly not but its a common beginner mistakes that may serve files back to user of any type by require or include instead of `readfile` - or in a template base system once you loaded the file you can find a page that loads its template from the based on a url param and than call your injected file... use your imagination :) – Shlomi Hassid Nov 20 '15 at 09:08
  • @astralmaster php internals and mailing list - but there are some other blogs give me a sec – Shlomi Hassid Nov 20 '15 at 09:10
  • Best null byte regression ref: [bugs.php](http://php.net/ChangeLog-5.php) hit Ctrl+F and type null bytes - all the path related bug fixes are related to this issue exactly. – Shlomi Hassid Nov 20 '15 at 09:15
  • Why would the attacker use `test.php\0.jpg` if he can just use `test.php.jpg`? Sorry, newb here.. – choz Nov 20 '15 at 09:19
  • @choz read the "read more" I added - its interesting and will get you on track - with `test.php\0.jpg` regex will accept and `copy()` will only see the test.php part the rest is truncated - that's relevant only for older versions and for unmaintained copies. – Shlomi Hassid Nov 20 '15 at 09:22
  • @ShlomiHassid Wow.. That's dangerous shit dude.. Thanks for pointing out for me.. +1 – choz Nov 20 '15 at 09:24
  • 1
    @choz if you wand dangerous keep track of the bug reports and revisions and maybe you can even contribute. :) – Shlomi Hassid Nov 20 '15 at 09:28
  • `/\.(jpe?g?|gif|png|bmp)$/i` and `/\.([aj]sp|php)$/i`, but I don't understand how `preg_match('/\.(asp|jsp|php)$/i', $name)` could ever be true if the first `preg_match()` call found a match. – mickmackusa Apr 03 '23 at 22:30
2

Try this code.

$allowedExtension = array('jpeg','png','bmp'); // make list of all allowed extension

if(isset($_FILES["image"]["name"])){
     $filenameArray = explode('.',$_FILES["image"]["name"]);
     $extension = end($filenameArray);
     if(in_array($extension,$allowedExtension)){
        echo "allowed extension";
     }else{
          echo "not allowed extension";
     }
}
vivek s vamja
  • 1,001
  • 10
  • 11
0

preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred.

$var = "test.php";
if (preg_match('/\.(jpeg|jpg|gif|png|bmp|jpe)$/i', $var) === 1 
    && preg_match('/\.(asp|jsp|php)$/i', $var) !== 1) 
{
    echo "No way you have extension .php or .jsp or .asp after this check.";
} else{
    echo "Invalid file";
}

So when you are going to check with your code, use === 1.

Ideally you should use.

function isImageFile($file) {
    $info = pathinfo($file);
    return in_array(strtolower($info['extension']), 
                    array("jpg", "jpeg", "gif", "png", "bmp"));
}
Community
  • 1
  • 1
Somnath Muluk
  • 55,015
  • 38
  • 216
  • 226
0

I remember that in certains version in PHP < 5.3.X, PHP allows strings to contain 0x00, this char is considered as the end of string
So, for exemple, if your string contains : myfile.exe\0.jpg, so preg_match() will match jpg, but other PHP functions will stop in myfile.exe, like include() or copy() functions

Halayem Anis
  • 7,654
  • 2
  • 25
  • 45