81

I'd like to test the validity of a regular expression in PHP, preferably before it's used. Is the only way to do this actually trying a preg_match() and seeing if it returns FALSE?

Is there a simpler/proper way to test for a valid regular expression?

alex
Ross McFarlane
  • 7
    Do you mean something like: http://stackoverflow.com/questions/172303/is-there-a-regular-expression-to-detect-a-valid-regular-expression ? – Jon Oct 16 '12 at 23:10
  • Why don't you want to check preg_match() against false? – rubo77 Oct 19 '12 at 18:31
  • Some answers do not consider that MAYBE the regex to be validated comes from the input of an admin user of an app... MAYBE the app has a "contact_types" table with a "regex" field... – J. Bruni Mar 01 '14 at 05:25

12 Answers

156
// This is valid, both opening ( and closing )
var_dump(preg_match('~Valid(Regular)Expression~', '') === false);
// This is invalid, no opening ( for the closing )
var_dump(preg_match('~InvalidRegular)Expression~', '') === false);

As the user pozs said, also consider putting @ in front of preg_match() (@preg_match()) in a testing environment to prevent warnings or notices.

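To validate a RegExp, just run it against an empty string (there is no need to know the data you want to test against up front). Older code often passes null instead, but that is deprecated as of PHP 8.1 and fails under strict_types. If the call returns an explicit false (=== false), the pattern is broken. Otherwise it's valid, though it need not match anything.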

So there's no need to write your own RegExp validator. It's wasted time...
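The check above can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name `is_valid_regex` is made up for illustration, and it assumes that suppressing the warning with `@` is acceptable in your environment.

```php
<?php
/**
 * Hypothetical helper: returns true when PCRE can compile $pattern.
 */
function is_valid_regex(string $pattern): bool
{
    // "@" hides the compile warning; an empty subject is enough to
    // force the pattern to be compiled.
    return @preg_match($pattern, '') !== false;
}

var_dump(is_valid_regex('~Valid(Regular)Expression~'));  // bool(true)
var_dump(is_valid_regex('~InvalidRegular)Expression~')); // bool(false)
```

Note that preg_match() also returns false for run-time failures, so the helper reports "could not be used" rather than strictly "does not compile"; with an empty subject the two should rarely differ.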

chris
CodeAngry
  • 1
    Just a guess: The OP said "I'd like to test the validity of a regular expression in PHP, *preferably before it's used.*" – JDB Oct 18 '12 at 14:17
  • 7
    @Cyborgx37 So what? I gave him the solution with a NULL. You don't need to know the string you'll use it against. You just need to know the pattern to see if it's correct. If it'll match or not... that's a different story and depends on your target string. What did I say wrong? – CodeAngry Oct 18 '12 at 15:09
  • 1
    I didn't downvote you... just guessing why someone might have. I think your answer is fine. – JDB Oct 18 '12 at 16:02
  • 5
    Note that in the case of an invalid regexp, your code will emit a warning, which is unfortunate when you only want to test the expression. You should protect your `preg_match()` call with `@` – pozs Oct 23 '12 at 11:31
  • 2
    The error suppression operator isn't a good solution because unit testing frameworks and other tools may disable the "@" operator for testing. As a workaround you can call "set_error_handler" before and "restore_error_handler" after that test ;) – mabe.berlin Oct 23 '12 at 15:33
  • @mabe I don't use it either. But it's good for beginners. Rumor has it that there's also a small performance penalty with it. But beginners are terrified by errors so it's good to mention it, until they learn to properly suppress/log/handle the errors. – CodeAngry Oct 23 '12 at 15:36
  • 1
    Maybe worth mentioning, you can use any string for `$subject` in `preg_match()` as `null`s are converted to empty strings too (`preg_match('~.?~', null) === 1`), so that's just a method to test a regular expression, not a method to test it before it's used (w/ or w/o *real data*) – pozs Mar 11 '14 at 09:31
  • Hey, I used this solution for check regex `[0-9]{4}` and it is throwing warning of `Warning: preg_match(): Unknown modifier '{'`. Warning was gone with `@` but the `preg_match` returns `false` as a result although regex is valid. Can anyone please help me out for this? – Binal Gajjar Sep 05 '19 at 05:50
  • @BinalGajjar Try capturing it in a ([0-9]{4}). – CodeAngry Sep 22 '19 at 16:36
  • 3
    The second argument of preg_match is string not null. Passing null will fail when strict types are declared. – ya.teck Jan 15 '21 at 08:06
  • 1
    @ya.teck According to my discoveries and your comment - just use an empty string instead. – codekandis Jul 11 '21 at 12:52
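The "Unknown modifier '{'" warning reported in the comments above is what a pattern without delimiters typically produces: in `[0-9]{4}` the brackets are taken as the delimiter pair and `{4}` is read as modifiers. The following sketch (variable names are illustrative only) compares the bare expression with two delimited forms:

```php
<?php
// '[' and ']' are taken as delimiters, so '{4}' is parsed as modifiers.
$withoutDelimiters = @preg_match('[0-9]{4}', '2024');
// Explicit delimiters make the same expression compile.
$withSlashes = preg_match('/[0-9]{4}/', '2024');
// Parentheses are also an accepted delimiter pair, which is why
// wrapping the expression in ( ) appears to fix it.
$withParens = preg_match('([0-9]{4})', '2024');

var_dump($withoutDelimiters === false); // bool(true)
var_dump($withSlashes);                 // int(1)
var_dump($withParens);                  // int(1)
```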
27

I created a simple function that can be called to check for preg errors:

function is_preg_error()
{
    $errors = array(
        PREG_NO_ERROR               => 'Code 0 : No errors',
        PREG_INTERNAL_ERROR         => 'Code 1 : There was an internal PCRE error',
        PREG_BACKTRACK_LIMIT_ERROR  => 'Code 2 : Backtrack limit was exhausted',
        PREG_RECURSION_LIMIT_ERROR  => 'Code 3 : Recursion limit was exhausted',
        PREG_BAD_UTF8_ERROR         => 'Code 4 : Malformed UTF-8 data',
        PREG_BAD_UTF8_OFFSET_ERROR  => 'Code 5 : The offset didn\'t correspond to the beginning of a valid UTF-8 code point',
        PREG_JIT_STACKLIMIT_ERROR   => 'Code 6 : JIT stack limit was exhausted (PHP 7+)',
    );

    return $errors[preg_last_error()];
}

You can call this function using the following code:

preg_match('/(?:\D+|<\d+>)*[!?]/', 'foobar foobar foobar');
echo is_preg_error();

Alternative - Regular Expression Online Tester

Wahyu Kristianto
  • 2
    This is just a wrapper around `preg_last_error` specific for the English language. – Alin Purcaru Oct 22 '12 at 20:39
  • By the way, this will not really tell you if regex is valid or not. Consider this: php > preg_match("/aaa",""); php > echo preg_last_error(); 0 – Alex N. Oct 06 '13 at 05:41
  • PHP 7 added `PREG_JIT_STACKLIMIT_ERROR`. See the [docs](https://www.php.net/manual/en/function.preg-last-error.php). – mbomb007 Apr 01 '19 at 21:11
15

If you want to dynamically test a regex, preg_match(...) === false seems to be your only option. PHP doesn't have a mechanism for compiling regular expressions before they are used.

Also, you may find preg_last_error a useful function.

On the other hand if you have a regex and just want to know if it's valid before using it there are a bunch of tools available out there. I found rubular.com to be pleasant to use.
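Because preg_match() can return 1, 0, or false, the strict comparison matters: a loose check such as `!preg_match(...)` would treat "no match" and "error" the same way. A minimal sketch of telling the three outcomes apart (the function name `describe_preg_result` is hypothetical):

```php
<?php
/**
 * Hypothetical helper: classifies the outcome of a preg_match() call.
 */
function describe_preg_result(string $pattern, string $subject): string
{
    // "@" suppresses the warning raised for patterns that do not compile.
    $result = @preg_match($pattern, $subject);

    if ($result === false) {
        return 'error';      // invalid pattern or run-time failure
    }
    return $result === 1 ? 'match' : 'no match';
}

echo describe_preg_result('/foo/', 'foobar'), PHP_EOL; // match
echo describe_preg_result('/foo/', 'bar'), PHP_EOL;    // no match
echo describe_preg_result('/foo', 'bar'), PHP_EOL;     // error
```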

Alin Purcaru
6

You can check to see if it is a syntactically correct regex with this nightmare of a regex, if your engine supports recursion (PHP should).

You cannot, however, algorithmically tell if it will give the results you want without running it.

From: Is there a regular expression to detect a valid regular expression?

/^((?:(?:[^?+*{}()[\]\\|]+|\\.|\[(?:\^?\\.|\^[^\\]|[^\\^])(?:[^\]\\]+|\\.)*\]|\((?:\?[:=!]|\?<[=!]|\?>)?(?1)??\)|\(\?(?:R|[+-]?\d+)\))(?:(?:[?+*]|\{\d+(?:,\d*)?\})[?+]?)?|\|)*)$/
Community
2

Without actually executing the regex you have no way to be sure whether it is valid. I've recently implemented a similar RegexValidator for Zend Framework. It works just fine.

<?php
class Nuke_Validate_RegEx extends Zend_Validate_Abstract
{
    /**
     * Error constant
     */
    const ERROR_INVALID_REGEX = 'invalidRegex';

    /**
     * Error messages
     * @var array
     */
    protected $_messageTemplates = array(
        self::ERROR_INVALID_REGEX => "This is a regular expression PHP cannot parse.");

    /**
     * Runs the actual validation
     * @param string $pattern The regular expression we are testing
     * @return bool
     */
    public function isValid($pattern)
    {
        if (@preg_match($pattern, "Lorem ipsum") === false) {
            $this->_error(self::ERROR_INVALID_REGEX);
            return false;
        }
        return true;
    }
}
ChrisR
1

You can validate a regular expression with another regular expression, but only up to a certain limit. Check out this Stack Overflow answer for more info.

Note: a "recursive regular expression" is not a regular expression, and this extended version of regex doesn't match extended regexes.

A better option is to use preg_match and match against NULL as @Claudrian said

Community
rajukoyilandy
1

So in summary, for all those coming to this question: you can validate regular expressions in PHP with a function like this.

preg_match() returns 1 if the pattern matches given subject, 0 if it does not, or FALSE if an error occurred. - PHP Manual

/**
 * Return an error message if the regular expression is invalid
 *
 * @param string $regex string to validate
 * @return string
 */
function invalidRegex($regex)
{
    if(preg_match($regex, '') !== false) // empty subject; passing null is deprecated since PHP 8.1
    {
        return '';
    }

    $errors = array(
        PREG_NO_ERROR               => 'Code 0 : No errors',
        PREG_INTERNAL_ERROR         => 'Code 1 : There was an internal PCRE error',
        PREG_BACKTRACK_LIMIT_ERROR  => 'Code 2 : Backtrack limit was exhausted',
        PREG_RECURSION_LIMIT_ERROR  => 'Code 3 : Recursion limit was exhausted',
        PREG_BAD_UTF8_ERROR         => 'Code 4 : Malformed UTF-8 data',
        PREG_BAD_UTF8_OFFSET_ERROR  => 'Code 5 : The offset didn\'t correspond to the beginning of a valid UTF-8 code point',
    );

    return $errors[preg_last_error()];
}

Which can be used like this.

if($error = invalidRegex('/foo//'))
{
    die($error);
}
Xeoncross
1

I am not sure if it supports PCRE, but there is a Chrome extension over at https://chrome.google.com/webstore/detail/cmmblmkfaijaadfjapjddbeaoffeccib called RegExp Tester. I have not used it as yet myself so I cannot vouch for it, but perhaps it could be of use?

Tash Pemhiwa
1

Just use the easy way and check whether preg_match() returns false:

// check whether $look is a valid regex
$look = "your_regex_string";

if (preg_match("/".$look."/", "test_string") !== false) {
    //regex_valid
} else {
    //regex_invalid
}
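One caveat with wrapping the input in "/" is that an expression which itself contains a slash will end the pattern early. A possible workaround, sketched here under the assumption that the input is a bare expression without delimiters, is to pick a delimiter that does not occur in it. The helper name `wrap_in_delimiters` and the candidate list are illustrative only.

```php
<?php
/**
 * Hypothetical helper: wraps a bare expression in the first candidate
 * delimiter that does not occur in it; returns null if all are taken.
 */
function wrap_in_delimiters(string $expression): ?string
{
    foreach (['/', '#', '~', '%', '@', '!'] as $delimiter) {
        if (strpos($expression, $delimiter) === false) {
            return $delimiter . $expression . $delimiter;
        }
    }
    return null;
}

// The expression contains "/", so "#" is chosen instead.
$pattern = wrap_in_delimiters('^https?://example\.com/');
echo $pattern, PHP_EOL;                        // #^https?://example\.com/#
var_dump(@preg_match($pattern, '') !== false); // bool(true)
```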
Ensai Tankado
0

I'd be inclined to set up a number of unit tests for your regex. This way not only would you be able to ensure that the regex is indeed valid but also effective at matching.

I find using TDD is an effective way to develop regex and means that extending it in the future is simplified as you already have all of your test cases available.

The answer to this question gives a great explanation of how to set up your unit tests.

Community
Rob Forrest
  • Thanks Rob. I'm a fan of TDD, but the code in question needs to be able to validate regex as an input (I'm validating JSON schemas, that can contain regex patterns). – Ross McFarlane May 13 '14 at 08:39
0

You should try to match the regular expression against NULL. If the result is FALSE (=== FALSE), there was an error.

In PHP >= 5.5, you can use the following to automatically get the built-in error message, without needing to define your own function to get it:

// For PHP >= 8, use the built-in str_ends_with() instead of this function.
// Taken from https://www.php.net/manual/en/function.str-ends-with.php#126551
if (!function_exists('str_ends_with')) {
    function str_ends_with(string $haystack, string $needle): bool {
        $needle_len = strlen($needle);
        return ($needle_len === 0 || 0 === substr_compare($haystack, $needle, - $needle_len));
    }
}

function test_regex($regex) {
    preg_match($regex, ''); // empty subject; passing NULL is deprecated since PHP 8.1
    $constants = get_defined_constants(true)['pcre'];
    foreach ($constants as $key => $value) {
        if (!str_ends_with($key, '_ERROR')) {
            unset($constants[$key]);
        }
    }
    return array_flip($constants)[preg_last_error()];
}

Attempt This Online

Note that the call to preg_match() will still emit a warning for invalid regular expressions. The warning can be caught with a custom error handler using set_error_handler().

See Can I try/catch a warning?.
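A minimal sketch of that approach follows. The function name `regex_error` and the fallback message are assumptions made for illustration; the idea is to install a temporary handler, run the pattern against an empty string, and return the warning text if compilation failed.

```php
<?php
/**
 * Hypothetical helper: returns null when $pattern can be used, otherwise
 * the text of the warning raised by preg_match() (or a generic message).
 */
function regex_error(string $pattern): ?string
{
    $message = null;

    // Temporarily route warnings into $message instead of the output.
    set_error_handler(function (int $severity, string $text) use (&$message): bool {
        $message = $text;
        return true; // mark the warning as handled
    });

    try {
        $result = preg_match($pattern, '');
    } finally {
        // Always put the previous handler back.
        restore_error_handler();
    }

    if ($result !== false) {
        return null;
    }
    return $message ?? 'preg_match() failed without a warning';
}

var_dump(regex_error('/foo/'));      // NULL
echo regex_error('/foo//'), PHP_EOL; // prints the captured warning text
```

Unlike the `@` operator, this keeps the warning text available for logging or for showing to whoever supplied the pattern.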

mbomb007
  • `array_flip` isn't safe to use on the output from `get_defined_constants(true)['pcre']` in later versions of PHP since one of the constants has a value of `TRUE` (boolean), and that will result in a warning being emitted. – GuyPaddock Apr 16 '21 at 00:16
  • @GuyPaddock You are correct; thanks for pointing that out. I have improved the code to account for that by filtering the defined constants first. See the link to try the code in an online interpreter. – mbomb007 Apr 16 '21 at 17:12
-3

According to the PCRE reference, there is no way to test the validity of an expression before it's used. But I think that if someone uses an invalid expression, it's a design error in that application, not a run-time one, so you should be fine.

pozs
  • -1 as the OP's question was not before it's used but before it's used against real data. One thing is testing with 5MB of scraped data and another with an empty string that just verifies that the RegExp compiles. SO: In PHP you can test it, in C++ you can test it. In C++11 you get an exception thrown if it's invalid, in PHP you get the explicit false. Any RegExp is compiled before it's executed and that's where the errors occur as its compilation fails when it's illegal regardless if the data you'll use it against. Sheesh... – CodeAngry Oct 23 '12 at 17:00
  • @Claudrian: i agree, but the question was not *before it's used against real data*, but *preferably before it's used* - and there is no such way (as an explicit / dedicated function). But i agree, if one really want to test it, it should be tested against null / empty value. – pozs Oct 23 '12 at 17:10