I want to match a PHP regex string.

From what I know, they are always in the format (correct me if I am wrong):

/                 One opening forward slash
the expression    Any regular expression
/                 One closing forward slash
[imsxe]           Any number of the modifiers NOT REPEATING

My expression for this was:

^/.+/[imsxe]{0,5}$

Written as a PHP string (with the opening and closing forward slashes, and the inner forward slashes escaped), it is this:

$regex = '/^\/.+\/[imsxe]{0,5}$/';

which is:

^                 From the beginning
/                 Literal forward slash
.+                Any character, one or more
/                 Literal forward slash
[imsxe]{0,5}      Any of the chars i,m,s,x,e, 0-5 times (only 5 to choose from)
$                 Until the end

This works; however, it allows repeated modifiers, e.g.:

This: ^/.+/[imsxe]{0,5}$

Allows this: '/blah/ii'
Allows this: '/blah/eee'
Allows this: '/blah/eise'
etc...

When it should not.

I personally use RegexPal to test, because it's free and simple.

If you would like to do the same (in order to help me), visit http://regexpal.com and paste my expression in the top text box:

^/.+/[imsxe]{0,5}$

Then paste my tests in the bottom textbox

/^[0-9]+$/i
/^[0-9]+$/m
/^[0-9]+$/s
/^[0-9]+$/x
/^[0-9]+$/e

/^[0-9]+$/ii
/^[0-9]+$/mm
/^[0-9]+$/ss
/^[0-9]+$/xx
/^[0-9]+$/ee

/^[0-9]+$/iei
/^[0-9]+$/mim
/^[0-9]+$/sis
/^[0-9]+$/xix
/^[0-9]+$/eie

Make sure you tick the second checkbox at the top, labelled '^$ match at line breaks (m)', to enable multi-line testing.

Thanks for the help

Edit

After reading the comments about regexes often having different delimiters, e.g.

/[0-9]+/  == #[0-9]+#

This is not a problem and can be factored into my regex solution.
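As a rough sketch of how the delimiter could be factored in, a backreference can require the closing delimiter to be the same character as the opening one. The helper name `splitRegexString` and the delimiter set `/`, `#`, `~` are assumptions for illustration only; PHP accepts more delimiters than these, and bracket-style pairs such as `{...}` would need separate handling.

```javascript
// Hypothetical helper: splits a PHP-style regex string into its parts.
// The delimiter set [/#~] is an assumption, not the full set PHP allows.
function splitRegexString(str) {
  // ([\/#~]) captures the opening delimiter; \1 requires the same
  // character again as the closing delimiter.
  const parts = /^([\/#~])(.+)\1([imsxe]*)$/.exec(str);
  if (parts === null) {
    return null; // not in delimiter + pattern + delimiter + modifiers form
  }
  return { delimiter: parts[1], pattern: parts[2], modifiers: parts[3] };
}

console.log(splitRegexString('/[0-9]+/i'));
console.log(splitRegexString('#[0-9]+#'));
console.log(splitRegexString('USER_FORM'));
```

This only recognises the overall shape; it does not check the modifiers for duplicates.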

All I really need to know is how to prevent duplicate characters!

Edit

This bit isn't so important but it provides context

The need for such a feature is simple...

I'm using jQuery UI MultiSelect Widget written by Eric Hynds.

Simple demo found here

Now, in my application, I'm extending the plugin so that certain options pop up a little menu on the right when hovered. The menu that pops up can be ANY HTML element.

I wanted multiple options to be able to show the same element. So my API works like this:

$('#select_element_id')
// Eric's MultiSelect API
.multiselect({
    // MultiSelect options
})
// My API
.multiselect_side_pane({
    menus: [
        {
            // This means, when an option with value 'MENU_1' is hovered,
            // the element '#my_menu_1' will be shown. This makes attaching
            // menus to options REALLY SIMPLE
            menu_element: $('#my_menu_1'),
            targets: ['MENU_1']
        },
        // However, let's say we have option value 'USER_ID_132'; I need the
        // target name to be dynamic. What better way to be dynamic than regex?
        {
            menu_element: $('#user_details_box'),
            targets: ['USER_FORM', '/^USER_ID_[0-9]+$/'],
            onOpen: function(target)
            {
                // here the TARGET can be interrogated, and the correct
                // user info can be displayed

                // Target will be 'USER_FORM' or 'USER_ID_3' or 'USER_ID_234'
                // so if it is USER_FORM I can clear the form ready for input,
                // and if the target starts with 'USER_ID_', I can parse out
                // the user id, and display the correct user info!  
            }
        }
    ]
});

So as you can see, the whole reason I need to know whether a string is a regex is so that, in the widget code, I can decide whether to treat the TARGET as a plain string (e.g. 'USER_FORM') or as an expression (e.g. '/^USER_ID_[0-9]+$/' for 'USER_ID_234').
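A minimal sketch of that decision in plain JavaScript might look like the following. The names `toRegExp` and `targetMatches` are hypothetical and not part of the MultiSelect widget. It also assumes only the flags `i`, `m` and `s` are accepted, since the PHP modifiers `x` and `e` have no JavaScript flag equivalent.

```javascript
// Hypothetical helper, not part of the MultiSelect widget.
// Returns a RegExp when `target` looks like '/pattern/flags', otherwise null.
function toRegExp(target) {
  const parts = /^\/(.+)\/([ims]*)$/.exec(target);
  if (parts === null) {
    return null;
  }
  const flags = parts[2];
  // Reject repeated flags such as 'ii'; a Set keeps each letter only once.
  if (new Set(flags).size !== flags.length) {
    return null;
  }
  try {
    return new RegExp(parts[1], flags);
  } catch (e) {
    return null; // the pattern itself was not valid
  }
}

// Treats the target as an expression when it parses as one,
// and as a literal string otherwise.
function targetMatches(target, value) {
  const re = toRegExp(target);
  return re !== null ? re.test(value) : target === value;
}

console.log(targetMatches('USER_FORM', 'USER_FORM'));
console.log(targetMatches('/^USER_ID_[0-9]+$/', 'USER_ID_234'));
console.log(targetMatches('/^USER_ID_[0-9]+$/', 'USER_FORM'));
```

One consequence of this design is that a literal option value which happens to look like `/something/` would be treated as an expression.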

AlexMorley-Finch
  • regex patterns can use almost any delimiter. / seems to be very common but so is ~ and #. – AbraCadaver Dec 12 '13 at 15:39
  • The '/^\/.+\/[imsxe]{0,5}$/' pattern does match '/blah/eise', '/blah/eee' and '/blah/ii', so I don't see the problem. All the strings begin with a slash and contain a few characters, followed by a slash and any of the characters 0-5 times. – user202172 Dec 12 '13 at 15:44
  • Matching regexes is a difficult job. You might take a look at this [answer](http://stackoverflow.com/a/172316). Also, there are more modifiers than the ones you specified, say for example `U`, so make sure to check this [list](http://php.net/manual/en/reference.pcre.pattern.modifiers.php). Also it might not be surprising if there was one not listed in there. Another issue, is if there is an escaped delimiter like `/fail\/test/`. TL;DR There's a lot to take into account ... – HamZa Dec 12 '13 at 15:48
  • I'm terribly curious to know the use case driving this question... – Alex Howansky Dec 12 '13 at 17:01
  • @AlexHowansky Please read my edits:) – AlexMorley-Finch Dec 13 '13 at 11:26
  • From the PHP documentation: "...it is also possible to use bracket style delimiters where the opening and closing brackets are the starting and ending delimiter, respectively. `{this is a pattern}`" – mcrumley Dec 13 '13 at 18:57

2 Answers

Unfortunately, the regexp string can be "anything". The forward slashes you mention are only one possible delimiter; many other characters work too, e.g. a hash (#).

Secondly, matching up to 5 characters without duplicates could probably be done with lookahead/lookbehind, but it would create such a complex regexp that it's faster to post-process the modifiers.
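The post-processing idea could be sketched roughly as follows, in JavaScript since that is where the widget code runs. The helper name `isRegexString` is hypothetical, and the sketch assumes the forward-slash delimiter and the five modifiers from the question.

```javascript
// Hypothetical helper: true when `str` has the shape /pattern/modifiers
// and no modifier letter occurs more than once.
function isRegexString(str) {
  // Loose match first: capture whatever modifiers follow the closing slash.
  const parts = /^\/.+\/([imsxe]*)$/.exec(str);
  if (parts === null) {
    return false;
  }
  const modifiers = parts[1];
  // Post-process: a Set drops duplicates, so the sizes differ on any repeat.
  return new Set(modifiers).size === modifiers.length;
}

console.log(isRegexString('/^[0-9]+$/i'));
console.log(isRegexString('/blah/ii'));
console.log(isRegexString('/blah/eise'));
```

Keeping the duplicate check outside the expression leaves the regex itself short and easy to read.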

It is possibly faster to search the code for the regular expression functions (preg_match, preg_replace, etc.) in order to deduce where regular expressions are used.

$var = '#placeholder#';

is a valid regular expression in PHP, but doesn't have to be one, whereas:

const ESCAPECHAR = '#';
$var = 'text';
$regexp = ESCAPECHAR . $var . ESCAPECHAR;

is also valid, but might not be recognized as such.

Ronald Swets

To prevent duplicates in the modifier section, I'd use a negative lookahead with a backreference, which rejects the string if any character after the closing slash occurs twice:

^/.+/(?!.*(.).*\1)[imsxe]*$
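A negative lookahead with a backreference is one way to express "no repeated modifier" inside the expression itself. The sketch below runs such a pattern against some of the test strings from the question; the pattern and the variable names are assumptions for illustration, written as a JavaScript regex literal with the slashes escaped.

```javascript
// Assumed pattern, for illustration only. After the closing slash, the
// negative lookahead (?!.*(.).*\1) fails as soon as any character in the
// remaining text occurs a second time.
const uniqueModifiers = /^\/.+\/(?!.*(.).*\1)[imsxe]*$/;

const samples = [
  '/^[0-9]+$/i',
  '/^[0-9]+$/ii',
  '/^[0-9]+$/iei',
  '/blah/imsxe',
];

// Print each sample alongside whether the pattern accepts it.
for (const sample of samples) {
  console.log(sample, uniqueModifiers.test(sample));
}
```

Of these samples, only the first and the last are accepted, because the other two repeat a modifier.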
Toto