
What is the difference between these two regular expressions? Are they interchangeable?

((?:[^\"])*)


([^\"]*)

Background to this question:

The JavaScript WYSIWYG editor (TinyMCE) fails to parse my HTML code in Firefox (23.0.1 and 25.0a2) but works in Chrome.

I found the regular expression to blame:

attrRegExp = /([\w:\-]+)(?:\s*=\s*(?:(?:\"((?:[^\"])*)\")|(?:\'((?:[^\'])*)\')|([^>\s]+)))?/g;

which I modified, replacing

((?:[^\"])*) 

with

([^\"]*)

and

((?:[^\'])*) 

with

([^\']*)

The resulting regular expression works in both browsers for my test case:

attrRegExp = /([\w:\-]+)(?:\s*=\s*(?:(?:\"([^\"]*)\")|(?:\'([^\']*)\')|([^>\s]+)))?/g

Can someone shed some light on that?

My test data, which only works with the modified regular expression, is a large image (>700 kB), for example:

var testdata = '<img alt="" src="...5PmDk4FOGOHy6S3JW120W1uCJ5M0PBa54edOFAc8ePX/2Q==">'

I test it with something like this:

testdata.match(attrRegExp);

The unmodified regex is especially likely to fail in Firefox when the test data is large.
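A minimal sketch of that test, assuming the image payload can be stood in for by filler characters. The helper names `makeTestData` and `tryMatch` are hypothetical and not part of TinyMCE; the two patterns are the ones quoted above. Whether the unmodified pattern actually fails depends on the engine, so the sketch only reports what happens rather than assuming an outcome.

```javascript
// Unmodified TinyMCE pattern and the modified variant from the question.
const unmodified = /([\w:\-]+)(?:\s*=\s*(?:(?:\"((?:[^\"])*)\")|(?:\'((?:[^\'])*)\')|([^>\s]+)))?/g;
const modified = /([\w:\-]+)(?:\s*=\s*(?:(?:\"([^\"]*)\")|(?:\'([^\']*)\')|([^>\s]+)))?/g;

// Hypothetical helper: builds an <img> tag whose src attribute carries
// `size` filler characters in place of real base64 image data.
function makeTestData(size) {
  return '<img alt="" src="data:image/jpeg;base64,' + 'A'.repeat(size) + '">';
}

// Hypothetical helper: runs String.prototype.match and reports the outcome
// as text, covering a null result as well as a thrown error.
function tryMatch(re, input) {
  try {
    const m = input.match(re);
    return m === null ? 'no match' : m.length + ' matches';
  } catch (e) {
    return 'threw: ' + e.message;
  }
}

// Raise the sizes toward 700 kB and beyond to probe a particular engine.
for (const size of [1000, 100000]) {
  const data = makeTestData(size);
  console.log(size, 'unmodified:', tryMatch(unmodified, data));
  console.log(size, 'modified:  ', tryMatch(modified, data));
}
```

Running the same loop in each browser's console shows at which input size, if any, the two patterns start to behave differently.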

You can find the jsfiddle example here:

key_

1 Answer


There should be no difference in the result. So you should be fine.
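As a quick sanity check of that claim, the sketch below runs both patterns from the question over a small sample and compares every match and capture group. The sample string and the `collect` helper are made up for illustration; this only demonstrates agreement on one input, not a proof of equivalence.

```javascript
// Original TinyMCE pattern and the simplified variant from the question.
const original = /([\w:\-]+)(?:\s*=\s*(?:(?:\"((?:[^\"])*)\")|(?:\'((?:[^\'])*)\')|([^>\s]+)))?/g;
const simplified = /([\w:\-]+)(?:\s*=\s*(?:(?:\"([^\"]*)\")|(?:\'([^\']*)\')|([^>\s]+)))?/g;

// Hypothetical helper: collects each match together with its capture groups.
function collect(re, input) {
  return Array.from(input.matchAll(re), m => Array.from(m));
}

// Made-up sample covering double-quoted, single-quoted and unquoted values.
const sample = '<img alt="" src="data:image/png;base64,AAAA" width=10 title=\'x\'>';

// Serialize both result sets so they can be compared in one step.
const sameResult =
  JSON.stringify(collect(original, sample)) ===
  JSON.stringify(collect(simplified, sample));

console.log(sameResult);
```

The non-capturing group `(?:[^\"])` wraps a single character class, so dropping it changes how the pattern is written but not which strings it matches or what lands in the capture groups.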

However, there might be a big difference in how RegExp engines will process these two expressions, and in the case of Firefox/Safari you just proved there actually is ;)

Firefox makes use of WebKit/JavaScriptCore YARR. YARR imposes an arbitrary, artificial limit, which the non-capturing group variant hits:

// The below limit restricts the number of "recursive" match calls in order to
// avoid spending exponential time on complex regular expressions.
static const unsigned matchLimit = 1000000;

As such Safari is affected as well.

See the relevant WebKit bug and relevant Firefox bug and the nice test case comparing different expression types somebody put together.

nmaier