76

Thank you in advance for your time in helping with this issue.

preg_match(): Compilation failed: invalid range in character class at offset 20 session.php on line 278

This code worked for months and then suddenly stopped working after a PHP upgrade on our server.

Here is the code

    else{
     /* Spruce up username, check length */
     $subuser = stripslashes($subuser);
     if(strlen($subuser) < $config['min_user_chars']){
        $form->setError($field, "* Username below ".$config['min_user_chars']."characters");
     }
     else if(strlen($subuser) > $config['max_user_chars']){
        $form->setError($field, "* Username above ".$config['max_user_chars']."characters");
     }


     /* Check if username is not alphanumeric */
    /* PREG_MATCH CODE */

     else if(!preg_match("/^[a-z0-9]([0-9a-z_-\s])+$/i", $subuser)){        
        $form->setError($field, "* Username not alphanumeric");
     }


    /* PREG_MATCH CODE */


     /* Check if username is reserved */
     else if(strcasecmp($subuser, GUEST_NAME) == 0){
        $form->setError($field, "* Username reserved word");
     }
     /* Check if username is already in use */
     else if($database->usernameTaken($subuser)){
        $form->setError($field, "* Username already in use");
     }
     /* Check if username is banned */
     else if($database->usernameBanned($subuser)){
        $form->setError($field, "* Username banned");
     }
  }
miken32
user3841888

5 Answers

136

The problem is really old, but there are some new developments related to PHP 7.3 and newer versions that need to be covered. The PHP PCRE engine migrated to PCRE2, and the PCRE library version used in PHP 7.3 is 10.32, and that is where the Backward Incompatible Changes originate from:

  • Internal library API has changed
  • The 'S' modifier has no effect, patterns are studied automatically. No real impact.
  • The 'X' modifier is the default behavior in PCRE2. The current patch reverts the behavior to the meaning of 'X' how it was in PCRE, but it might be better to go with the new behavior and have 'X' turned on by default. So currently no impact, too.
  • Some behavior change due to the newer Unicode engine was sighted. It's Unicode 10 in PCRE2 vs Unicode 7 in PCRE. Some behavior change can be sighted with invalid patterns.

According to the PCRE2 10.33 changelog:

  1. With PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL set, escape sequences such as \s which are valid in character classes, but not as the end of ranges, were being treated as literals. An example is [_-\s] (but not [\s-_] because that gave an error at the start of a range). Now an "invalid range" error is given independently of PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL.

Before PHP 7.3, you could use the hyphen in a character class in any position if you escaped it, or if you put it "in a position where it cannot be interpreted as indicating a range". In PHP 7.3, it seems the PCRE2_EXTRA_BAD_ESCAPE_IS_LITERAL was set to false. So, from now on, to put a literal hyphen into a character class, either escape it or place it at the start or end of the class.
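The hyphen placements described above can be sketched as follows. This is a minimal illustration, not taken from the PHP documentation; the sample string and the array labels are made up for the example.

```php
<?php
// Three ways to write a class that matches word characters, a dot,
// or a literal hyphen. None of them leaves an unescaped hyphen
// between two class members, so no range is implied.
$patterns = [
    'hyphen first'   => '/^[-\w.]+$/',
    'hyphen last'    => '/^[\w.-]+$/',
    'hyphen escaped' => '/^[\w\-.]+$/',
];

$sample = 'file-name.v2'; // hypothetical input

foreach ($patterns as $label => $pattern) {
    $matched = preg_match($pattern, $sample) === 1;
    echo $label, ': ', $matched ? 'match' : 'no match', PHP_EOL;
}
```

All three forms should behave the same way; putting the hyphen last is the form recommended in the comments below as the most portable across regex flavors.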

See also this reference:

In simple words,

PCRE2 is more strict in the pattern validations, so after the upgrade, some of your existing patterns could not compile anymore.

Here is the simple snippet used in php.net

preg_match('/[\w-.]+/', ''); // this will not work in PHP7.3
preg_match('/[\w\-.]+/', ''); // the hyphen needs to be escaped

As you can see from the example above there is a little but substantial difference between the two lines.

Wiktor Stribiżew
  • 1
    http://sandbox.onlinephpfunctions.com/code/7e98237a86c7c0822ce3fbd5323b3b849e43ae4e – Edmunds22 Jan 06 '20 at 11:32
  • @Wiktor Stribizew Can you please help me why this expression is not working in PHP 7.4.5? It has badly paused us not migrating our system from PHP 5.6 to PHP 7.x. I'll be thankful If you can please help. :( /([\w-:\*]*)(?:\#([\w-]+)|\.([\w-]+))?(?:\[@?(!?[\w-:]+)(?:([!*^$]?=)["']?(.*?)["']?)?\])?([\/, ]+)/is – Saeed Afzal Apr 26 '20 at 04:48
  • @SaeedAfzal You have unmatched parentheses, `)?` has no first opening `(`. Also, all your `[\w-:*]` must be replaced with `[\w:*-]`. Please consider asking a new question. Ah, I see it, you asked [How do I convert PHP 5 regular expression to PHP 7 standards?](https://stackoverflow.com/q/61435992/3832970). Let's move the conversation there. – Wiktor Stribiżew Apr 26 '20 at 09:20
  • We have upgraded from PHP 7.2 to PHP7.3 recently and encounter similar issues, it is a pity that php upgrades in minor versions have as much non backward compatibilities... – ElLocoCocoLoco May 29 '20 at 13:08
  • awesome, you made it for me on php 7.4 – saleh asadi Nov 13 '20 at 20:17
34

A character class range is defined by using - between two values in a character class ([] in regex). [0-9] means everything between 0 and 9, inclusive. In the regular expression in your code, you have several character class ranges: a-z and 0-9. There is also one range that you probably didn't mean to put there, namely _-\s.

"/^[a-z0-9]([0-9a-z_-\s])+$/i"
                   ^^^^ 

This is apparently not considered an invalid character range in some (most?) versions of PCRE (the regular expression library PHP uses), but it might have changed recently, and if the PCRE library was upgraded on the server, that might be the reason.
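To illustrate how the position of the hyphen changes its meaning, here is a small sketch using made-up one-character inputs. The variable names are only for illustration.

```php
<?php
// Between two single characters the hyphen forms a range:
// [a-c] means "a, b, or c".
$inRange = preg_match('/^[a-c]$/', 'b');

// At the end of the class the hyphen is a literal character:
// [ac-] means "a, c, or a hyphen", so 'b' no longer matches.
$bAsLiteralClass = preg_match('/^[ac-]$/', 'b');
$hyphenItself    = preg_match('/^[ac-]$/', '-');

var_dump($inRange, $bAsLiteralClass, $hyphenItself);
```

In `_-\s` the hyphen sits between `_` and `\s`, so it reads as a range whose end is a shorthand class rather than a single character, which is why stricter PCRE versions reject it.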

Debuggex is a nice tool that can help debug errors (well, the error message from PHP told you both the line and the character where the error was, so..) like this (I'm not affiliated, just a fan).

MatsLindh
  • 10
    ...or PHP itself was upgraded. According to RegexBuddy, PHP 5.5 requires the hyphen to be escaped or moved to the end of the list if you want it to match a literal hyphen. Before that, apparently, it just assumed you meant that because `_-\s` makes no sense as a range. – Alan Moore Jul 15 '14 at 20:06
  • Yes, PHP bundles a version of PCRE as well, so that would end up with the same issue. Good catch. – MatsLindh Jul 15 '14 at 20:55
  • @AlanMoore: a too little know possibility is to put the hyphen immediatly after a shorthand character class: `\s-_` – Casimir et Hippolyte Jul 16 '14 at 01:16
  • @Cas: Yes, and PHP accepts that, but why? It thinks it's safe to assume `\s-_` isn't meant as a range, and it used to feel the same about `_-\s`, but now it's not sure. WTF? I say just put the hyphen at the end; that works in every version of every flavor I know of. – Alan Moore Jul 16 '14 at 11:37
  • WOW thank you all for the comments and suggestions etc. My first experience with this site has been great. Hopefully I can contribute in some way in the future. This "/^[a-z0-9]([0-9a-z_-\s])+$/i" -------- minus _-\s worked exactly the way it should have.. not sure how that got in there.. "/^[a-z0-9]([0-9a-z])+$/i" – user3841888 Jul 16 '14 at 13:30
  • 1
    @AlanMoore: you can write this too: `[a-z-0-9]` or `[a-z-1]`. Then the "rule" with PCRE seems to be: *In a character class, you need to escape the literal hyphen except at the begining of the class or after the negation caret, at the end, after a range or after and before a shorthand character class.* In other words, you don't need to escape the hyphen when the situation is not ambiguous, except for invalid ranges. – Casimir et Hippolyte Jul 16 '14 at 14:37
  • Also interesting to note: our web servers experienced this problem with PHP 5.5.13. I tested this on my Mac and a vagrant VM (PHP 5.5.9 and 5.5.18, respectively.) and the issue did not occur. Perhaps it was introduced in a maintenance release and subsequently removed? – KellyHuberty Jan 19 '15 at 23:41
  • Correction: I am seeing that behavior on 5.5.18. I'm a little confused, because the issue happens on the command line but not in apache. – KellyHuberty Jan 20 '15 at 18:01
  • 4
    Found the same problem here... in the production server that has not been updated with the latest PHP the code works as always, by in the testing server I got the error. In my situation, I needed to keep reference for the empty space [\s] so I escaped the hypen [\-\s] and solve the problem and works as expected also. Just an idea. – raphie Aug 21 '15 at 22:31
  • 1
    I just encountered this problem with the dash after a shorthand character class (`\d-.`), so apparently PHP no longer accepts that as of version 7.3.1. – Brilliand Mar 13 '19 at 21:30
29

Your error is dependent on your regex interpreter.

You should escape the hyphen to make it clear that it is a literal character, so use \- instead of -.

Your final code:

/^[a-z0-9]([0-9a-z_\-\s])+$/i
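As a rough sketch of how the escaped-hyphen pattern behaves, the snippet below wraps it in a helper and runs it against a few sample usernames. The function name `is_valid_username` and the sample names are hypothetical and not part of the original session.php.

```php
<?php
// Hypothetical helper wrapping the corrected pattern.
function is_valid_username(string $name): bool
{
    // The hyphen is escaped, so it is a literal and not a range operator.
    // Single quotes keep the backslashes intact for the regex engine.
    return preg_match('/^[a-z0-9]([0-9a-z_\-\s])+$/i', $name) === 1;
}

$candidates = ['john_doe', 'john-doe', 'John Doe', '_john', 'john!'];

foreach ($candidates as $candidate) {
    $verdict = is_valid_username($candidate) ? 'accepted' : 'rejected';
    echo $candidate, ' => ', $verdict, PHP_EOL;
}
```

The first three names should be accepted; `_john` fails because the first character must be a letter or digit, and `john!` fails because `!` is not in the class.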
Moradnejad
7

Maybe this answer can save somebody with Arabic/Farsi Slug creation:

For PHP 7.3, use \- instead of - in:

[^a-z0-9_\s-

and

"/[\s-_]+/"

So the Arabic make_slug function for PHP 7.3 becomes:

function make_slug($string, $separator = '-')
{
    $string = trim($string);
    $string = mb_strtolower($string, 'UTF-8');

    // Make alphanumeric (removes all other characters)
    // this makes the string safe especially when used as a part of a URL
    // this keeps latin characters and Persian characters as well
    $string = preg_replace("/[^a-z0-9_\s\-ءاآؤئبپتثجچحخدذرزژسشصضطظعغفقكکگلمنوهی]/u", '', $string);

    // Remove multiple dashes or whitespaces or underscores
    $string = preg_replace("/[\s\-_]+/", ' ', $string);

    // Convert whitespaces and underscore to the given separator
    $string = preg_replace("/[\s_]/", $separator, $string);

    return $string;
}
Tarek Adra
-3

I had this error and solved it by doing this:

Route::get('{path}','HomeController@index')->where( 'path', '([A-z]+)?' );

and it works for me.

cs95
draw134