147

If I run this:

/([^\/]+)+/g.exec('/a/b/c/d');

I get this:

["a", "a"]

But if I run this:

'/a/b/c/d'.match(/([^\/]+)+/g);

Then I get the expected result of this:

["a", "b", "c", "d"]

What's the difference?

Ry-
Justin Warkentin
  • 4
    you loop with `exec` to get all sub-selections. – zzzzBov Feb 09 '12 at 16:30
  • 3
    Note that the second `+` is not needed since `match` will return all sub-expressions already. `.exec` only returns one each time, so it doesn't need that `+` either. – pimvdb Feb 09 '12 at 16:34
  • 4
    On top of that, nested quantifiers like the two pluses should be used extremely carefully because they easily lead to [catastrophic backtracking](http://www.regular-expressions.info/catastrophic.html). – Marius Schulz Sep 08 '13 at 20:02
  • 1
    @MariusSchulz Thanks for the link. That led me on to learn about possessive quantifiers and atomic grouping. Very nice things to understand. – Justin Warkentin Sep 12 '13 at 19:06

7 Answers

132

exec with a global regular expression is meant to be used in a loop, as it will still retrieve all matched subexpressions. So:

var re = /[^\/]+/g;
var match;

while (match = re.exec('/a/b/c/d')) {
    // match is now the next match, in array form.
}

// No more matches.

String.match does this for you and discards the captured groups.
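As a minimal sketch of that loop, the helper below (the name `collectMatches` is made up for illustration) gathers every match together with its captured groups, which is the information `String.match` drops when the regex is global. It assumes the regex passed in has the `g` flag; without it `lastIndex` never advances and the loop would not terminate, hence the guard.

```javascript
// Hypothetical helper: collects every match of a global regex, captures included.
function collectMatches(re, str) {
    if (!re.global) {
        throw new Error('collectMatches needs a regex with the g flag');
    }
    re.lastIndex = 0; // start from the beginning, whatever state the regex was in
    var results = [];
    var match;
    while ((match = re.exec(str)) !== null) {
        results.push({ text: match[0], groups: match.slice(1), index: match.index });
        // Guard against zero-length matches, which would otherwise loop forever.
        if (match[0] === '') {
            re.lastIndex++;
        }
    }
    return results;
}

var found = collectMatches(/([^\/]+)/g, '/a/b/c/d');
console.log(found.map(function (m) { return m.text; })); // [ 'a', 'b', 'c', 'd' ]
console.log(found[1].groups, found[1].index); // [ 'b' ] 3
```

Because the same regex object is passed through every iteration, its `lastIndex` keeps advancing; a fresh literal inside the loop condition would not have that property.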

Ry-
  • 45
    I have something to add to this answer: one should not place the regular expression literal within the while condition, like this `while (match = /[^\/]+/g.exec('/a/b/c/d'))`, or it will create an infinite loop! This is clearly stated on MDN: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec – yeyo Jan 27 '15 at 03:32
  • 9
    @yeyo: More specifically, it has to be the same regular expression object. A literal doesn’t accomplish that. – Ry- Jan 27 '15 at 05:12
  • 1
    @Ry- I think one should note this behavior was introduced in ES5. Before ES5 `new RegExp("pattern")` and `/pattern/` meant different things. – Robert Jun 21 '20 at 09:49
83

One picture is better, you know...

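var re_once = /([a-z])([A-Z])/;
var re_glob = /([a-z])([A-Z])/g;

var st = "aAbBcC";

// match: without g gives first match + captures, with g gives all matches
console.log("match once="+ st.match(re_once)+ "  match glob="+ st.match(re_glob)); // aA,a,A | aA,bB,cC
// exec: without g always gives the first match, with g advances on every call
console.log("exec once="+ re_once.exec(st) + "   exec glob="+ re_glob.exec(st)); // aA,a,A | aA,a,A
console.log("exec once="+ re_once.exec(st) + "   exec glob="+ re_glob.exec(st)); // aA,a,A | bB,b,B
console.log("exec once="+ re_once.exec(st) + "   exec glob="+ re_glob.exec(st)); // aA,a,A | cC,c,C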

See the difference?

Note: the captured groups (e.g. a, A) are returned after the matched pattern (e.g. aA); the result is not just the matched pattern.

mplungjan
georg
32

If your regex is global and you are capturing, then you must use exec. match won't return your captures.

match works well when you are only matching, not capturing. You run it once and it gives you an array of all the matches. (If the regex is not global, match instead returns the first match followed by its captures.)

exec is what you use when you are capturing: each time it is called it returns the next match followed by its captures. (match returns the full match followed by captures only when the regex is not global.)

Another use for exec is getting the position of a match. When your regex is stored in a variable, you can read its .lastIndex after each call, which is the position just past the end of the match. lastIndex belongs to the regex object, and exec is called on that object. match is called on the string, so you can't step through the matches and read the regex object's lastIndex between them.

A string has the match function, which is passed a regex. A regex has the exec function, which is passed a string.

You run exec multiple times. You run match once.

Use match when you are not capturing. When you are capturing, use exec, which is more powerful because it returns the captures. If you do use match while capturing, note that it shows captures when the regex is not global but not when the regex is global.

> "azb".match(/a(z)b/);
[ "azb", "z" ]

> "azb".match(/a(z)b/g);
[ "azb" ]
>

Another thing to note: exec is called on the regex, so keeping the regex in a variable gives you more power.

If you don't keep the regex in a variable, you won't get the subsequent matches: each regex literal creates a new object whose lastIndex starts at 0, so every call returns the first match. So use a variable for the regex when using exec.

> /./g.exec("abc")
[ "a" ]
> /./g.exec("abc")
[ "a" ]
> /./g.exec("abc")
[ "a" ]
>
> /[a-c]/g.exec("abc")
[ "a" ]
> /[a-c]/g.exec("abc")
[ "a" ]
>

> var r=/[a-c]/g
> r.exec("abc")
[ "a" ]
> r.exec("abc")
[ "b" ]
> r.exec("abc")
[ "c" ]
> r.exec("abc")
null
>

And with exec, you can get the "index" of the match

> var r=/T/g
> r.exec("qTqqqTqqTq");
[ "T" ]
> r.lastIndex
2
> r.exec("qTqqqTqqTq");
[ "T" ]
> r.lastIndex
6
> r.exec("qTqqqTqqTq");
[ "T" ]
> r.lastIndex
9
> r.exec("qTqqqTqqTq");
null
> r.lastIndex
0
>

So if you want positions or captures, use exec. Bear in mind that lastIndex is not the index where the match starts: it is the index just past the end of the match, which is where the next search begins. Here each match is one character long, so subtracting 1 gives the start; in general the start is the index property of the array that exec returns. And as you can see, lastIndex is reset to 0 when no match is found.
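A short sketch of how the two positions relate, using a pattern that can match more than one character so the difference is visible. The array returned by exec has an `index` property holding the start of the match, and `lastIndex` on the regex is where the next search will begin. The string and variable names here are only for illustration.

```javascript
var r = /T+/g;
var s = 'qTTqqT';
var m;
var positions = [];

while ((m = r.exec(s)) !== null) {
    // m.index: where this match starts; r.lastIndex: where the next search begins.
    positions.push({ text: m[0], start: m.index, end: r.lastIndex });
}

positions.forEach(function (p) {
    console.log(p.text + ' starts at ' + p.start + ', next search from ' + p.end);
});
// TT starts at 1, next search from 3
// T starts at 5, next search from 6

console.log(r.lastIndex); // 0, reset by the final exec that found nothing
```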

And if you want to stretch match, you can use it when capturing, but only when the regex is not global. In that case the array doesn't contain all the matches; it contains the first full match followed by its captures.

DecPK
barlop
  • Yes, understanding the working of `r.lastIndex` is the key factor to understand the difference between `exec` and `match`. – runsun Feb 15 '19 at 17:22
  • @barlop "Match won't match all captures", seriously? "a,b,c,aa,bb,cc".match(/(\w+)/g) => ["a", "b", "c", "aa", "bb", "cc"]. How to explain that it cached all of them? – MrHIDEn Sep 25 '19 at 11:16
  • @barlop `If your regex is global, and you are capturing, then you must use exec. Match won't return all your captures.` I got it on the console. Just copy/paste `"a,b,c,aa,bb,cc".match(/(\w+)/g);` Opera, Firefox. – MrHIDEn Sep 25 '19 at 12:33
  • @MrHIDEn I wouldn't use the language you did in your wrong quote. And what matters is what is shown and what we are able to see; whether there is any caching behind the scenes is not relevant. It has been a while since I looked into this, but it doesn't show all captures. Even if you do your example `"a,b,c,aa,bb,cc".match(/(\w+)/g)`, what is happening there is that it shows all matches, and it just so happens that you captured every match, so if it were to show all captures, it would look exactly the same (cntd) – barlop Sep 25 '19 at 12:33
  • (cntd) So maybe you are thinking it is showing the captures, but it is not, it's showing the matches – barlop Sep 25 '19 at 12:34
  • @MrHIDEn compare `"abc".match(/a(.)c/g)` with `"abc".match(/(.)/g)` In the `a(.)c` case, you see it hasn't shown specifically what `\1` is,it just showed the match `a(.)c` which is the match `a.c` – barlop Sep 25 '19 at 12:35
  • @barlop Ok, I got it. `"a,b,c,aa,bb,cc".match(/(\w+),/g);` is wrong `["a,", "b,", "c,", "aa,", "bb,"]` should be `["a", "b", "c", "aa", "bb", "cc"]`. You are right. – MrHIDEn Sep 25 '19 at 12:37
  • @MrHIDEn yes I removed that comment as soon as I noticed you suddenly changed your long find pattern, and removed the comment just before your reply But yeah you agree now with my answer – barlop Sep 25 '19 at 12:41
32

/regex/.exec() returns only one match per call (the first one, unless you call it repeatedly on the same regex object with the g flag), while "string".match() returns all of them if you use the g flag in the regex.

See here: exec, match.

Alex Ciminian
7

The .match() function str.match(regexp) will do the following:

  • if there is a match it will return:
    • if the g flag is used in the regexp: it will return all the substrings (ignoring capture groups)
    • if the g flag is not used in the regexp: it will return the same as regexp.exec(str)
  • if there is no match it will return:
    • null

Examples of .match() using the g flag:

var str = "qqqABApppabacccaba";
var e1, e2, e3, e4, e5;
e1 = str.match(/nop/g); //null
e2 = str.match(/no(p)/g); //null
e3 = str.match(/aba/g); //["aba", "aba"]
e4 = str.match(/aba/gi); //["ABA", "aba", "aba"]
e5 = str.match(/(ab)a/g); //["aba", "aba"] ignoring capture groups as it is using the g flag

And .match() without the g flag is equivalent to .exec():

e1=JSON.stringify(str.match(/nop/))===JSON.stringify(/nop/.exec(str)); //true
//e2 ... e4 //true
e5=JSON.stringify(str.match(/(ab)a/))===JSON.stringify(/(ab)a/.exec(str)); //true

The .exec() function regexp.exec(str) will do the following:

  • if there is a match it will return:
    • if the g flag is used in the regexp: it will return (for each time it is called): [N_MatchedStr, N_Captured1, N_Captured2, ...] of the next N match. Important: it will not advance into the next match if the regexp object is not stored in a variable (it needs to be the same object)
    • if the g flag is not used in the regexp: it will return the same as if it had a g flag and was called for the first time and only once.
  • if there is no match it will return:
    • null

Example of .exec() (stored regexp + using the g flag = it changes with each call):

var str = "qqqABApppabacccaba";
var myexec, rgxp = /(ab)a/gi;

myexec = rgxp.exec(str);
console.log(myexec); //["ABA", "AB"]
myexec = rgxp.exec(str);
console.log(myexec); //["aba", "ab"]
myexec = rgxp.exec(str);
console.log(myexec); //["aba", "ab"]
myexec = rgxp.exec(str);
console.log(myexec); //null

//But in this case you should use a loop:
var mtch, myRe = /(ab)a/gi;
while(mtch = myRe.exec(str)){ //infinite looping with direct regexps: /(ab)a/gi.exec()
    console.log("elm: "+mtch[0]+" all: "+mtch+" indx: "+myRe.lastIndex);
    //1st iteration = elm: "ABA" all: ["ABA", "AB"] indx: 6
    //2nd iteration = elm: "aba" all: ["aba", "ab"] indx: 12
    //3rd iteration = elm: "aba" all: ["aba", "ab"] indx: 18
}

Examples of .exec() when it is not changing with each call:

var str = "qqqABApppabacccaba", myexec, myexec2;

//doesn't go into the next one because no g flag
var rgxp = /(a)(ba)/;
myexec = rgxp.exec(str);
console.log(myexec); //["aba", "a", "ba"]
myexec = rgxp.exec(str);
console.log(myexec); //["aba", "a", "ba"]
//... ["aba", "a", "ba"]

//doesn't go into the next one because direct regexp
myexec2 = /(ab)a/gi.exec(str);
console.log(myexec2); //["ABA", "AB"]
myexec2 = /(ab)a/gi.exec(str);
console.log(myexec2); //["ABA", "AB"]
//... ["ABA", "AB"]
ajax333221
0

Sometimes regex.exec() will take much more time than string.match().

It is worth mentioning that when string.match() and regex.exec() produce the same result (e.g. when not using the g flag), regex.exec() can take somewhere between 2 and 30 times as long as string.match().

Therefore, in such cases the "new RegExp().exec()" approach should be used only when you need a global regex (i.e. to execute more than once).
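One way to check a claim like this on your own engine is a rough timing loop. The sketch below is only that: the test string, pattern, iteration count and the `time` helper are arbitrary choices for illustration, and the numbers will vary a lot between engines and runs, so no particular ratio should be expected.

```javascript
// Rough timing sketch; string, pattern and iteration count are arbitrary.
var str = 'qqqABApppabacccaba';
var re = /(ab)a/i; // no g flag, so match() and exec() return the same thing
var iterations = 100000;

// Hypothetical helper: runs fn `iterations` times and reports elapsed milliseconds.
function time(label, fn) {
    var start = Date.now();
    for (var i = 0; i < iterations; i++) {
        fn();
    }
    var elapsed = Date.now() - start;
    console.log(label + ': ' + elapsed + ' ms');
    return elapsed;
}

// Sanity check first: both calls really do produce the same result here.
var viaMatch = str.match(re);
var viaExec = re.exec(str);
console.log(JSON.stringify(viaMatch) === JSON.stringify(viaExec)); // true

time('string.match', function () { return str.match(re); });
time('regex.exec', function () { return re.exec(str); });
```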

dorony
0

Both do the same thing; the difference is that one is a method of the regular expression object and the other is a method of the string. If you need to perform the same operation repeatedly with a given expression, exec is better because you don't need to create a new expression each time, so the expression's constructor doesn't run more than once. match can be used in a chain of string operations, like:

match chain:
// expr, replacement, pattern and strings are placeholders
some_string.replace(expr, replacement).match(expr); //etc...

With exec:
let re = new RegExp(pattern);
for (const s of strings) {
   re.exec(s);
   //other body of FOR
}