
I have the following regex in PHP:

/(?<=\')[^\'\s][^\']*+(?=\')|(?<=")[^"\s][^"]*+(?=")|[^\'",\s]+/

and I would like to port it to javascript like:

var regex = new RegExp('/(?<=\')[^\'\s][^\']*+(?=\')|(?<=")[^"\s][^"]*+(?=")|[^\'",\s]+/');

var match = regex.exec("hello,my,name,is,'mr jim'")

for( var z in match) alert(match[z]);

There is something that JavaScript doesn't like here, but I have no idea what it is. I've tried looking for differences between PHP and JS regex via regular-expressions.info, but I can't see anything obvious.

Any help would be greatly appreciated

Thank you again

Edit: The problem seems to lie with the positive lookbehinds, but does this mean it cannot be ported?

Jamie Bicknell

3 Answers


Correct - the positive lookbehinds will not work.

But, as some general information about regex in JavaScript, here are a couple of pointers for you.

You don't have to use the RegExp object - you can use pattern literals instead:

var regex = /^[a-z\d]+$/i;

But if you use the RegExp object, you have to escape your backslashes since your pattern is now locked in a string.

var regex = new RegExp( '^[a-z\\d]+$', 'i' );

The primary benefit of the RegExp object is if there is a dynamic bit to your pattern, for example:

var max = 4;
var regex = new RegExp( '\\d{1,' + max + '}' );
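
To make the escaping point concrete, here is a small sketch comparing the string forms. The helper name `digitRun` is made up for illustration and is not part of any library. It also shows that the string passed to RegExp takes no / delimiters, which is a second problem with the pattern in the question.

```javascript
// In a string literal, '\d' is just 'd': the backslash is consumed by the
// string parser before RegExp ever sees the pattern.
var wrong = new RegExp('\d{1,4}');   // pattern becomes d{1,4}
var right = new RegExp('\\d{1,4}');  // pattern becomes \d{1,4}

// Slashes inside the string are matched literally; they are not delimiters.
var slashed = new RegExp('/ab/');

// Hypothetical helper: build an anchored "1 to max digits" pattern.
function digitRun(max) {
    return new RegExp('^\\d{1,' + max + '}$');
}

console.log(wrong.source);               // d{1,4}
console.log(right.source);               // \d{1,4}
console.log(slashed.test('ab'));         // false
console.log(slashed.test('/ab/'));       // true
console.log(digitRun(4).test('123'));    // true
console.log(digitRun(4).test('12345'));  // false
```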
Peter Bailey
  • Thank you Peter, as you can probably tell I normally stick to PHP, and very rarely use regex in either language. Thank you for your advice, will note this down for future reference. – Jamie Bicknell Sep 11 '09 at 15:09

It's the positive lookbehind, (?<=...), that JavaScript doesn't support. Also be aware that JavaScript implementations vary significantly between browsers.

Edit: there is an SO question devoted to workarounds.

SilentGhost

You don't get lookbehind (and lookahead has problems in IE, so is best avoided too). But it's easy to just let those ' and " characters be part of the match, and throw them out afterwards:

var value= "hello,my,name,is,'mr jim'";
var match;
var r= /'[^'\s][^']*'|"[^"\s][^"]*"|[^'",\s]+/g;

while(match= r.exec(value)) {
    var text= match[0];
    if ('"\''.indexOf(text.charAt(0))!=-1) // starts with ' or "?
        text= text.substring(1, text.length-1);
    alert(text);
}

Or, use capturing parentheses to isolate the quotes from the text:

var r= /'([^'\s][^']*)'|"([^"\s][^"]*)"|([^'",\s]+)/g;

while (match= r.exec(value)) {
    var text= match[1] || match[2] || match[3];
    alert(text);
}

(I'm guessing your for(var z in match) was supposed to loop over each pattern match in the string. Unfortunately JavaScript doesn't quite work that easily.)
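
If the goal is an array of every field rather than an alert per match, the capturing-group loop could be wrapped up roughly like this. This is only a sketch; `splitFields` is a hypothetical name, not something from the question.

```javascript
// Collect every field by calling exec() repeatedly on a /g regex.
// Each call resumes from r.lastIndex and returns null when no match is left.
function splitFields(value) {
    var r = /'([^'\s][^']*)'|"([^"\s][^"]*)"|([^'",\s]+)/g;
    var fields = [];
    var match;
    while ((match = r.exec(value)) !== null) {
        // Exactly one of the three groups is set for any given match.
        fields.push(match[1] || match[2] || match[3]);
    }
    return fields;
}

console.log(splitFields("hello,my,name,is,'mr jim'"));
// [ 'hello', 'my', 'name', 'is', 'mr jim' ]
```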

This may not be the best way to parse a comma-separated list; it seems a bit ill-defined for cases where you have a space or quote in the middle of a field. A simple string-indexing parser might be a better bet.
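
As a rough idea of what such a string-indexing parser might look like, here is a minimal sketch. The rules it follows are assumptions of mine, since the question doesn't pin the format down: fields are separated by commas, a field starting with ' or " runs to the next matching quote and may contain commas, and unquoted fields are trimmed. `parseList` is a hypothetical name.

```javascript
// Assumed format (not specified in the question): comma-separated fields,
// optionally wrapped in ' or "; quoted fields may contain commas.
function parseList(value) {
    var fields = [];
    var i = 0;
    while (i < value.length) {
        // Skip whitespace before the field (charAt past the end gives '').
        while (/\s/.test(value.charAt(i))) i++;
        if (i >= value.length) break;

        var ch = value.charAt(i);
        var end;
        if (ch === "'" || ch === '"') {
            // Quoted field: take everything up to the matching quote.
            end = value.indexOf(ch, i + 1);
            if (end === -1) end = value.length; // unterminated: take the rest
            fields.push(value.substring(i + 1, end));
            // Ignore anything between the closing quote and the next comma.
            i = value.indexOf(',', end + 1);
            if (i === -1) break;
            i++;
        } else {
            // Unquoted field: take everything up to the next comma.
            end = value.indexOf(',', i);
            if (end === -1) end = value.length;
            fields.push(value.substring(i, end).trim());
            i = end + 1;
        }
    }
    return fields;
}

console.log(parseList("hello,my,name,is,'mr jim'"));
// [ 'hello', 'my', 'name', 'is', 'mr jim' ]
console.log(parseList('a, "b,c" ,d'));
// [ 'a', 'b,c', 'd' ]
```

Unlike the regex, this makes the handling of odd input explicit (an unterminated quote swallows the rest of the string, a trailing comma yields no extra field), so those choices can be changed to suit the data.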

bobince
  • This is perfect. I stripped out the regex earlier of the lookbehinds and then used a replace function to remove the quote marks, your methods are far better! Thank you! – Jamie Bicknell Sep 11 '09 at 16:15