25

I'm defining a regex object and then matching it in a loop. It only matches sometimes; to be precise, every second time. So I created a minimal working example of the problem.

I tried this code in Opera and Firefox. The behavior is the same in both:

>>> domainRegex = /(?:\.|^)([a-z0-9\-]+\.[a-z0-9\-]+)$/g;
/(?:\.|^)([a-z0-9\-]+\.[a-z0-9\-]+)$/g 
>>> domainRegex.exec('mail-we0-f174.google.com');
Array [".google.com", "google.com"]
>>> domainRegex.exec('mail-we0-f174.google.com');
null
>>> domainRegex.exec('mail-we0-f174.google.com');
Array [".google.com", "google.com"]
>>> domainRegex.exec('mail-we0-f174.google.com');
null
>>> domainRegex.exec('mail-we0-f174.google.com');
Array [".google.com", "google.com"]
>>> domainRegex.exec('mail-we0-f174.google.com');
null

Why is this happening? Is this behaviour documented? Is there a way around it, other than defining the regex inside the loop body?

GDR
  • https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec#Finding_successive_matches – Passerby Aug 27 '13 at 10:22
  • possible duplicate of [Why RegExp with global flag in Javascript give wrong results?](http://stackoverflow.com/questions/1520800/why-regexp-with-global-flag-in-javascript-give-wrong-results) – Bergi Aug 27 '13 at 10:23
  • @GDR this is happening because of [RegExp.lastIndex](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex?redirectlocale=en-US&redirectslug=JavaScript%2FReference%2FGlobal_Objects%2FRegExp%2FlastIndex)(_read "description" section_). – Mr_Green Aug 27 '13 at 10:37
  • @Jack I thought the answer was the same in the related question? – Ozzy Aug 27 '13 at 12:41
  • @Ozzy same explanation why it works that way, but different answer in the sense that the wrong approach is made :) – Ja͢ck Aug 27 '13 at 15:32
  • Because it's called `exec` and not `getMatches`! – Simon_Weaver Dec 22 '20 at 20:32

5 Answers

25

exec() works in the manner you have described; with the /g modifier present, each invocation searches from lastIndex and returns the next match until there are no more matches, at which point it returns null and the value of lastIndex is reset to 0.

However, because you have anchored the expression using $ there won't be more than one match, so you can use String.match() instead and lose the /g modifier:

var domainRegex = /(?:\.|^)([a-z0-9\-]+\.[a-z0-9\-]+)$/;
'mail-we0-f174.google.com'.match(domainRegex); // [".google.com", "google.com"]
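
As a rough sketch of how this behaves in the loop from the question: without the /g flag the regex carries no position between calls, so one regex object can be reused for every string. The hostnames array below is made up for illustration; only its first entry comes from the question.

```javascript
// Hypothetical input list; only the first hostname appears in the question.
var hostnames = ['mail-we0-f174.google.com', 'www.example.org', 'localhost'];

// No /g flag, so the regex keeps no position state between calls.
var domainRegex = /(?:\.|^)([a-z0-9\-]+\.[a-z0-9\-]+)$/;

var domains = hostnames.map(function (host) {
    var match = host.match(domainRegex);
    return match ? match[1] : null; // capture group 1 is the bare domain
});

console.log(domains); // ["google.com", "example.org", null]
```

The regex is created once, outside the loop, and every iteration gets a consistent result.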
Ja͢ck
6

Additional info to Ja͢ck's response:

You can also reset lastIndex yourself:

var myRgx = /test/g;
myRgx.exec(someString);
myRgx.lastIndex = 0;

or just create a new regex for each execution, which I find even cleaner:

new RegExp(myRgx).exec(someString);
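
If you want to keep a single global regex around, one possible way to package the lastIndex reset is a small wrapper. This is only a sketch; execFromStart is a name invented here, not a built-in.

```javascript
// Hypothetical helper: run a (possibly global) regex from the start of the string.
function execFromStart(regex, str) {
    regex.lastIndex = 0; // discard any position left over from an earlier call
    return regex.exec(str);
}

var domainRegexG = /(?:\.|^)([a-z0-9\-]+\.[a-z0-9\-]+)$/g;
var host = 'mail-we0-f174.google.com';

var first = execFromStart(domainRegexG, host);
var second = execFromStart(domainRegexG, host);

console.log(first[1], second[1]); // google.com google.com
```

Both calls match, because the position stored by the first call is cleared before the second one runs.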
Community
xiphe
3

When performing a global search with a RegExp, the exec method starts matching at the index stored in the lastIndex property. After each successful exec call, lastIndex is set to the position following the match just found. If a match fails, lastIndex is reset to 0, which causes the next exec call to match from the start again.

var a = 'asdfeeeasdfeedxasdf'
undefined
var p = /asdf/g
undefined
p.exec(a)
["asdf"]
p.lastIndex
4
p.exec(a)
["asdf"]
p.lastIndex
11
p.exec(a)
["asdf"]
p.lastIndex
19
p.exec(a)
null //match failed
p.lastIndex
0 //lastIndex reset. next match will start at the beginning of the string a

p.exec(a)
["asdf"]
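
One consequence worth spelling out: lastIndex is stored on the regex object, not on the string, so a leftover position also applies when the next call receives a different string. A minimal sketch, with strings chosen purely for illustration:

```javascript
var pattern = /asdf/g;

var firstHit = pattern.exec('xxasdf');   // match at index 2
var leftover = pattern.lastIndex;        // 6, the position just past that match

// The search in the new string starts at index 6,
// so the 'asdf' at index 0 is never considered.
var secondHit = pattern.exec('asdfxxasdf');

console.log(firstHit.index, leftover, secondHit.index); // 2 6 6
```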
c.P.u1
2

Each time you run the exec method of your regex it gets you the next match.

Once it reaches the end of the string, it returns null to let you know you've got all of the matches. The next time, it starts again from the beginning.

You only have one match (which returns an array of the full match and the match from the brackets). The first time, the regex starts searching from the start. It finds a match and returns it. The next time, it gets to the end and returns null. So if you had this in a loop, you could do something like this to loop through all matches:

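var match;
// regExpression needs the g flag; without it this loop never ends
while ((match = regExpression.exec(string)) !== null) {
    // do something with match
}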

Then the next time, it starts again from position 0.
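
To make that loop concrete, here is a sketch that gathers every match into an array. collectMatches is a hypothetical helper name, and the pattern in the usage line is a simplified one picked for illustration, not the regex from the question.

```javascript
// Hypothetical helper that gathers every match of a global regex.
function collectMatches(regex, str) {
    if (!regex.global) {
        throw new Error('collectMatches needs a regex with the g flag');
    }
    regex.lastIndex = 0; // start from the beginning, whatever happened before
    var results = [];
    var match;
    while ((match = regex.exec(str)) !== null) {
        results.push(match[0]);
        // An empty match does not advance lastIndex, so step forward by hand.
        if (match[0] === '') {
            regex.lastIndex++;
        }
    }
    return results;
}

var labels = collectMatches(/[a-z0-9\-]+/g, 'mail-we0-f174.google.com');
console.log(labels); // ["mail-we0-f174", "google", "com"]
```

When the loop ends, exec has just returned null, so lastIndex is back at 0 and the regex can be reused straight away.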

"Is there a way around?"

Well, if you know there's only one match, or you only want the first match, you can save the result to a variable. There's no need to reuse .exec. If you are interested in all the matches, then you need to keep going until you get null.

ColBeseder
0

Why don't you use the simple match method of the string, like this:

'mail-we0-f174.google.com'.match(/(?:\.|^)([a-z0-9\-]+\.[a-z0-9\-]+)$/)
dirtydexter
  • Because it has bad performance to create a new instance of Regex object with every iteration of a loop. – GDR Aug 28 '13 at 09:09