25

I have this text

txt = "Local residents o1__have called g__in o22__with reports...";

in which I need to get the list of numbers between each o and __

If I do

txt.match(/o([0-9]+)__/g);

I will get

["o1__", "o22__"]

But I'd like to have

["1", "22"]

How can I do that?

Pierre de LESPINAY

3 Answers

31

See this question:

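var txt = "Local residents o1__have called g__in o22__with reports...";
var regex = /o([0-9]+)__/g;
var matches = [];
// Each exec() call on a g-flagged regex returns the next match, or null when done
var match = regex.exec(txt);
while (match != null) {
    matches.push(match[1]); // match[1] is the captured number
    match = regex.exec(txt);
}
alert(matches);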
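
For comparison, here is a minimal sketch of the same extraction using `String.prototype.matchAll`. This is a different technique from the `exec()` loop above and assumes an engine that supports ES2020. The variable name `numbers` is only illustrative.

```javascript
var txt = "Local residents o1__have called g__in o22__with reports...";

// matchAll requires the g flag and yields one match array per occurrence.
// Array.from maps each match array to its first capture group.
var numbers = Array.from(txt.matchAll(/o([0-9]+)__/g), function (m) {
    return m[1];
});

console.log(numbers); // ["1", "22"]
```

Because the regex literal is created inline, there is no `lastIndex` state left over from earlier calls to worry about.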
Bobby
18

You need to call .exec() repeatedly on a regular expression object that has the g flag to get successive matches, like this:

var txt = "Local residents o1__have called g__in o22__with reports...";
var re = /o([0-9]+)__/g;
var matches;
while ((matches = re.exec(txt)) != null) {
    alert(matches[1]);
}

The state from the previous match is stored in the regular expression object as the lastIndex and that's what the next match uses as a starting point.
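
To make the role of `lastIndex` concrete, here is a minimal sketch that wraps the loop in a helper and resets `lastIndex` before scanning, so an earlier `.test()` or `.exec()` call on the same regex cannot cause matches to be skipped. The helper name `collectCaptures` is hypothetical and not part of any library.

```javascript
// Hypothetical helper: returns the first capture group of every match.
function collectCaptures(re, text) {
    if (!re.global) {
        // Without the g flag, exec() always returns the first match,
        // so the loop below would never terminate.
        throw new Error("collectCaptures expects a regex with the g flag");
    }
    var results = [];
    var m;
    re.lastIndex = 0; // start from the beginning regardless of earlier calls
    while ((m = re.exec(text)) !== null) {
        results.push(m[1]);
    }
    return results;
}

var txt = "Local residents o1__have called g__in o22__with reports...";
var re = /o([0-9]+)__/g;

re.test(txt); // advances re.lastIndex past the first match
var numbers = collectCaptures(re, txt);
console.log(numbers); // ["1", "22"]
```

This sketch assumes the pattern cannot match an empty string. A pattern that can would need `lastIndex` advanced manually inside the loop.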

You can see it work here: http://jsfiddle.net/jfriend00/UtF6J/

Using the regexp this way is described here: https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/RegExp/exec.

jfriend00
  • I don't know which one to choose: yours is more detailed, but @Soldier.moth's was first :) – Pierre de LESPINAY Sep 02 '11 at 08:08
  • Perhaps a couple of typos, but Soldier's code won't work as it only ever calls `regex.exec()` once (it has to get called multiple times as part of the loop) and it will go into an infinite loop if it matches anything as the value of `match` is never changed once the while loop starts. – jfriend00 Sep 02 '11 at 08:13
  • Whoops! Definitely typo, fixed. – Bobby Sep 02 '11 at 08:18
  • If I call re.test(txt) to validate first and then run the while loop, it starts at the second match and the first match is lost. – fdrv Mar 15 '16 at 04:38
  • @Jek-fdrv - Yes, if you're using the `g` option on your regex, then both `.test()` and `.exec()` advance one match down the target string each time you call them. That state is stored in the regex object itself. You can reset that state if you want by setting the `.lastIndex` property on the regex back to `0`. – jfriend00 Mar 15 '16 at 04:59
4
/o([0-9]+?)__/g

This should work. For background on lazy quantifiers, search for "lazy star".

var rx = /o([0-9]+?)__/g;
var txt = "Local residents o1__have called g__in o22__with reports...";
var mtc = [];
var match; // declared explicitly to avoid creating an implicit global
while ((match = rx.exec(txt)) != null) {
    alert(match[1]);
    mtc.push(match[1]);
}

Jek-fdrv pointed out in the comments that if you call rx.test just before the while loop, some results are skipped. That's because a RegExp object with the g flag has a lastIndex property that records where the last match ended. The next call to test or exec starts searching from lastIndex, so the part of the string before it is skipped. A little example may help:

var rx = /o([0-9]+?)__/g;
var txt = "Local residents o1__have called g__in o22__with reports...";
var mtc = [];
var match; // declared explicitly to avoid creating an implicit global
console.log(rx.test(txt), rx.lastIndex); // outputs "true 20"
console.log(rx.test(txt), rx.lastIndex); // outputs "true 43"
console.log(rx.test(txt), rx.lastIndex); // outputs "false 0": a failed match resets lastIndex
rx.lastIndex = 0; // manual reset; required if you stop calling test() before it returns false
// now the loop starts from the beginning of the string
while ((match = rx.exec(txt)) != null) {
    console.log(match[1]);
    mtc.push(match[1]);
}
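
Another way to sidestep the problem described above is to validate with a separate regex that does not have the g flag, since such a regex does not carry `lastIndex` state between calls. The following is a minimal sketch under that assumption. The names `hasMatch` and `found` are illustrative only.

```javascript
var txt = "Local residents o1__have called g__in o22__with reports...";

// Validation uses its own regex without the g flag,
// so it leaves no lastIndex state behind.
var hasMatch = /o[0-9]+__/.test(txt);

// Extraction uses a fresh g-flagged regex whose lastIndex starts at 0.
var rx = /o([0-9]+)__/g;
var found = [];
var m;
if (hasMatch) {
    while ((m = rx.exec(txt)) !== null) {
        found.push(m[1]);
    }
}

console.log(hasMatch, found); // true ["1", "22"]
```

Keeping the two regexes separate means the order of the validation and extraction steps no longer matters.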
CaNNaDaRk
  • This gives me the same result. Are lazy quantifiers implemented in JavaScript? – Pierre de LESPINAY Sep 02 '11 at 08:03
  • Yes, they are. I edited and added a little code, tested locally and works for me. Shows two alerts with "1" and "22" – CaNNaDaRk Sep 02 '11 at 08:37
  • Now it populates mtc array too. – CaNNaDaRk Sep 02 '11 at 08:42
  • If I call rx.test(txt) to validate first and then run the while loop, it starts at the second match and the first match is lost. – fdrv Mar 15 '16 at 04:38
  • That's right. Every time you call the test method, the lastIndex property of the RegExp object advances to the end of the current match. If you don't reset it (by setting it to 0), the next call to exec or test only analyzes the string from lastIndex onward. Try setting "rx.lastIndex = 0;" right after you call test, then run the loop again. That works in Chrome. I'll edit the answer to add a little example. – CaNNaDaRk Mar 16 '16 at 11:04