In JavaScript, is there an equivalent of String.indexOf() that takes a regular expression instead of a string as its first parameter, while still allowing a second parameter?

I need to do something like

str.indexOf(/[abc]/ , i);

and

str.lastIndexOf(/[abc]/ , i);

While String.search() takes a regexp as a parameter it does not allow me to specify a second argument!

Edit:
This turned out to be harder than I originally thought so I wrote a small test function to test all the provided solutions... it assumes regexIndexOf and regexLastIndexOf have been added to the String object.

function test (str) {
    var i = str.length +2;
    while (i--) {
        if (str.indexOf('a',i) != str.regexIndexOf(/a/,i)) 
            alert (['failed regexIndexOf ' , str,i , str.indexOf('a',i) , str.regexIndexOf(/a/,i)]) ;
        if (str.lastIndexOf('a',i) != str.regexLastIndexOf(/a/,i) ) 
            alert (['failed regexLastIndexOf ' , str,i,str.lastIndexOf('a',i) , str.regexLastIndexOf(/a/,i)]) ;
    }
}

I am testing as follows to make sure that, at least for a one-character regexp, the result is the same as if we used indexOf:

//Look for the a among the xes
test('xxx');
test('axx');
test('xax');
test('xxa');
test('axa');
test('xaa');
test('aax');
test('aaa');

Ned Batchelder
Pat
  • `|` inside `[ ]` matches the literal character `|`. You probably meant `[abc]`. – Markus Jarderot Nov 07 '08 at 22:51
  • yes thanks you are right, I will fix it but the regexp itself is irrelevant... – Pat Nov 07 '08 at 23:37
  • I found a simpler and effective approach is to just use string.match(/[A-Z]/). If there is no match, the method returns null; otherwise you get an object, and you can use match(/[A-Z]/).index to get the index of the first capital letter – Syler Mar 26 '19 at 19:14

22 Answers

234

Instances of the String constructor have a .search() method which accepts a RegExp and returns the index of the first match.

To start the search from a particular position (faking the second parameter of .indexOf()) you can slice off the first i characters:

str.slice(i).search(/re/)

But this will get the index in the shorter string (after the first part was sliced off) so you'll want to then add the length of the chopped off part (i) to the returned index if it wasn't -1. This will give you the index in the original string:

function regexIndexOf(text, re, i) {
    var indexInSuffix = text.slice(i).search(re);
    return indexInSuffix < 0 ? indexInSuffix : indexInSuffix + i;
}
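
A minimal usage sketch of the helper above, repeated here so the snippet runs on its own. The sample string is made up for illustration:

```javascript
// Same helper as in the answer, repeated so this snippet is self-contained.
function regexIndexOf(text, re, i) {
    var indexInSuffix = text.slice(i).search(re);
    return indexInSuffix < 0 ? indexInSuffix : indexInSuffix + i;
}

var sample = "xaxbxc"; // hypothetical sample string

console.log(regexIndexOf(sample, /[abc]/, 0)); // 1
console.log(regexIndexOf(sample, /[abc]/, 2)); // 3
console.log(regexIndexOf(sample, /[abc]/, 6)); // -1
```

Note that slice() treats a negative offset as counting from the end of the string, whereas indexOf() clamps a negative offset to 0, so the two differ for negative values of i.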
Web_Designer
Glenn
  • 2
    from the question: While String.search() takes a regexp as a parameter it does not allow me to specify a second argument! – Pat Nov 07 '08 at 22:17
  • 15
    str.substr(i).search(/re/) – Glenn Nov 08 '08 at 03:01
  • 7
    Great solution, however the output is a bit different. indexOf will return a number from the beginning (regardless of the offset), whereas this will return the position from the offset. So, for parity, you'll want something more like this: `function regexIndexOf(text, offset) { var initial = text.substr(offset).search(/re/); if(initial >= 0) { initial += offset; } return initial; }` – gkoberger Aug 10 '14 at 22:11
  • this code will not work if i is undefined (it is optional). So a bit more fool proof version with input check: regexIndexOf(text, re, i) { let idx = (i && i > 0) ? text.substr(i).search(re) : text.search(re); return idx < 0 ? idx : idx + (i?i:0); } – Stan Sokolov Oct 28 '21 at 14:07
157

Combining a few of the approaches already mentioned (the indexOf is obviously rather simple), I think these are the functions that will do the trick:

function regexIndexOf(string, regex, startpos) {
    var indexOf = string.substring(startpos || 0).search(regex);
    return (indexOf >= 0) ? (indexOf + (startpos || 0)) : indexOf;
}

function regexLastIndexOf(string, regex, startpos) {
    regex = (regex.global) ? regex : new RegExp(regex.source, "g" + (regex.ignoreCase ? "i" : "") + (regex.multiline ? "m" : ""));
    if(typeof (startpos) == "undefined") {
        startpos = string.length;
    } else if(startpos < 0) {
        startpos = 0;
    }
    var stringToWorkWith = string.substring(0, startpos + 1);
    var lastIndexOf = -1;
    var nextStop = 0;
    var result; // declared locally so the loop below does not create a global
    while((result = regex.exec(stringToWorkWith)) != null) {
        lastIndexOf = result.index;
        regex.lastIndex = ++nextStop;
    }
    return lastIndexOf;
}

UPDATE: Edited regexLastIndexOf() so that it seems to mimic lastIndexOf() now. Please let me know if it still fails and under what circumstances.


UPDATE: Passes all tests found in comments on this page, and my own. Of course, that doesn't mean it's bulletproof. Any feedback appreciated.
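
For overlapping matches, one possible variant is to resume each search one character after the start of the previous match. This is only a sketch, not the code from this answer: the name lastIndexOfOverlapping is hypothetical, and it scans the whole string instead of a truncated copy.

```javascript
// Sketch: start of the last match that begins at or before startpos,
// allowing overlapping matches. lastIndexOfOverlapping is a hypothetical name.
function lastIndexOfOverlapping(string, regex, startpos) {
    // Copy the regex with the global flag so exec() honours lastIndex.
    var flags = "g" + (regex.ignoreCase ? "i" : "") + (regex.multiline ? "m" : "");
    var re = new RegExp(regex.source, flags);
    if (typeof startpos === "undefined") {
        startpos = string.length;
    } else if (startpos < 0) {
        startpos = 0;
    }
    var lastIndexOf = -1;
    var result;
    while ((result = re.exec(string)) !== null) {
        if (result.index > startpos) break;
        lastIndexOf = result.index;
        // Resume one character after this match's start, not after its end.
        re.lastIndex = result.index + 1;
    }
    return lastIndexOf;
}

console.log(lastIndexOfOverlapping("aaaaa", /aaa/)); // 2
console.log("aaaaa".lastIndexOf("aaa"));             // 2
console.log(lastIndexOfOverlapping("axx", /a/, 2));  // 0
```

Because each iteration jumps straight to the next match, the loop runs once per match rather than once per character.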

fregante
Jason Bunting
  • Your `regexLastIndexOf` will only return the index of the last *non-overlapping* match. – Markus Jarderot Nov 08 '08 at 03:32
  • Sorry, not a HUGE regex guy - can you give me an example that would make mine fail? I appreciate being able to learn more, but your response doesn't help someone as ignorant as I am. :) – Jason Bunting Nov 08 '08 at 04:40
  • Jason I just added some function to test in the question. this is failing (among other tests) the following 'axx'.lastIndexOf('a',2) != 'axx'.regexLastIndexOf(/a/,2) – Pat Nov 08 '08 at 12:02
  • Okay, I got it to pass that test and spent more time looking up relevant details. – Jason Bunting Nov 08 '08 at 19:33
  • "aaaaa".regexLastIndexOf(/aaa/). It would find the first three a's, then try to match again on the last two a's, which would fail. "aaaaa".lastIndexOf("aaa") finds the last three a's. – Markus Jarderot Nov 08 '08 at 22:37
  • Ah - gotcha. Well, I am done with this for now - I don't have the time to do anything further. :( It's been fun though. – Jason Bunting Nov 08 '08 at 23:03
  • Nevermind, I took another stab at it. :) More feedback appreciated. – Jason Bunting Nov 08 '08 at 23:13
  • I finally got time to benchmark the proposed solutions and yours came out on top so I am accepting it for now. – Pat Nov 01 '09 at 18:27
  • 2
    I think it's more efficient to use `regex.lastIndex = result.index + 1;` instead of `regex.lastIndex = ++nextStop;`. It will proceed to the next match much faster, hopefully without losing any result. – Gedrox May 30 '12 at 09:32
  • @Gedrox Yes, I think it has a quadratic time complexity without your suggestion, when it can have a linear complexity if the RegExp is short enough. – user1537366 Dec 16 '14 at 09:42
  • What about a situation where the string contains multiple JSON objects or multiple parts that fullfill the regex? – TeraTon Apr 16 '15 at 12:56
  • 2
    If you prefer to pull it from npm, these two util functions are now on NPM as: https://www.npmjs.com/package/index-of-regex – Capaj Oct 26 '16 at 19:40
  • The focus here seems to be on `regexLastIndexOf`, but `regexIndexOf` should use `RegExp::lastIndex` too, otherwise something like `/^./` will match anywhere, and lookbehind won't work on the boundary. – Codesmith Oct 04 '21 at 17:23
54

I have a short version for you. It works well for me!

var match      = str.match(/[abc]/gi);
var firstIndex = str.indexOf(match[0]);
var lastIndex  = str.lastIndexOf(match[match.length-1]);

And if you want a prototype version:

String.prototype.indexOfRegex = function(regex){
  var match = this.match(regex);
  return match ? this.indexOf(match[0]) : -1;
}

String.prototype.lastIndexOfRegex = function(regex){
  var match = this.match(regex);
  return match ? this.lastIndexOf(match[match.length-1]) : -1;
}

EDIT : if you want to add support for fromIndex

String.prototype.indexOfRegex = function(regex, fromIndex){
  var str = fromIndex ? this.substring(fromIndex) : this;
  var match = str.match(regex);
  return match ? str.indexOf(match[0]) + (fromIndex || 0) : -1;
}

String.prototype.lastIndexOfRegex = function(regex, fromIndex){
  var str = fromIndex ? this.substring(0, fromIndex) : this;
  var match = str.match(regex);
  return match ? str.lastIndexOf(match[match.length-1]) : -1;
}

To use it, as simple as this:

var firstIndex = str.indexOfRegex(/[abc]/gi);
var lastIndex  = str.lastIndexOfRegex(/[abc]/gi);
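
One caveat with looking up the matched text via indexOf: it finds the first occurrence of that text, which is not necessarily where the regex matched. The sketch below uses a made-up string with a word-boundary pattern, and shows the index property that match() exposes when the regex has no g flag:

```javascript
var str = "aRomeo Romeo";
var re = /\bromeo/i; // no g flag, so the match result carries .index

var match = str.match(re);
console.log(match[0]);              // "Romeo"
console.log(str.indexOf(match[0])); // 1 (first occurrence of the text)
console.log(match.index);           // 7 (where the regex actually matched)
```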
pmrotule
  • This is actually a nice trick. Would be great if you expanded it to also take the `startIndex` parameter as the usual `indexOf` and `lastIndexOf` do. – Robert Koritnik Apr 03 '15 at 14:35
  • @RobertKoritnik - I edited my answer to support `startIndex` (or `fromIndex`). Hope it helps! – pmrotule Apr 06 '15 at 07:19
  • `lastIndexOfRegex` should also add back the value of `fromIndex` to the result. – Peter Mar 20 '18 at 15:49
  • 2
    Your algorithm will break in the following scenario: `"aRomeo Romeo".indexOfRegex(new RegExp("\\bromeo", 'gi'));` The result will be 1 when it should be 7, because indexOf will look for the first time "romeo" appears, no matter whether it is at the beginning of a word or not. – Coral Kashri May 02 '20 at 12:49
  • Excellent trick. To handle the situation CoralK noted, you may replace the return statement of **indexOfRegex** by : `if(match){let list=this.split(regex);match.pop();list.pop();return match.join('').length+list.join('').length+(fromIndex||0);}else return -1;` – yorg Jul 03 '21 at 22:17
  • `lastIndexOfRegex` does not work with character ranges. `let s = 'alpha beta (gamma)', p = lastIndexOfRegex(s, new RegExp('[ ()]')), correct = (p == s.length - 1); ` It looks for the index of the last instance of _the FIRST match_ of the regex provided. That's a different thing than the index of the last match of the regex. – Cheeso Oct 04 '21 at 20:50
  • @Cheeso You need to provide the `g` flag otherwise it will stop searching after the first match. – pmrotule Oct 07 '21 at 10:57
  • @CoralK You can simply use match.index if you don't provide the `g` flag : `'aRomeo Romeo'.match(new RegExp('\\bromeo', 'i')).index` – pmrotule Oct 07 '21 at 11:00
10

Use:

str.search(regex)

See the documentation here.

rmg.n3t
  • 21
    @OZZIE: No, not really. It's basically [Glenn's answer](https://stackoverflow.com/a/273810/2427596) (with ~150 upvotes), except it has **no explanation** whatsoever, does **not support** starting position other than `0`, and was posted... **seven years** later. – ccjmne Jun 03 '18 at 17:00
7

You could use substr.

str.substr(i).match(/[abc]/);
Andru Luvisi
  • From the well-known JavaScript book published by O'Reilly: "substr has not been standardized by ECMAScript and is therefore deprecated." But I like the basic idea behind what you are getting at. – Jason Bunting Nov 07 '08 at 23:50
  • 1
    That's a non-issue. If you're REALLY concerned about it, use String.substring() instead - you just have to do the math a bit differently. Besides, JavaScript should not be 100% beholden to it's parent language. – Peter Bailey Nov 07 '08 at 23:59
  • It's not a non-issue - if you get your code running against an implementation that doesn't implement substr because they want to adhere to the ECMAScript standards, you are going to have problems. Granted, replacing it with substring is not that hard to do, but it is good to be cognizant of this. – Jason Bunting Nov 08 '08 at 00:36
  • 1
    The moment you have problems you have a very very simple solutions. I think the comments are sensible, but the down vote was pedantic. – VoronoiPotato Mar 28 '13 at 13:15
  • Could you please edit your answer to provide a working demo code? – vsync Jun 11 '18 at 22:08
7

Based on BaileyP's answer. The main difference is that these methods return -1 if the pattern can't be matched.

Edit: Thanks to Jason Bunting's answer I got an idea. Why not modify the .lastIndex property of the regex? Though this will only work for patterns with the global flag (/g).

Edit: Updated to pass the test-cases.

String.prototype.regexIndexOf = function(re, startPos) {
    startPos = startPos || 0;

    if (!re.global) {
        var flags = "g" + (re.multiline?"m":"") + (re.ignoreCase?"i":"");
        re = new RegExp(re.source, flags);
    }

    re.lastIndex = startPos;
    var match = re.exec(this);

    if (match) return match.index;
    else return -1;
}

String.prototype.regexLastIndexOf = function(re, startPos) {
    startPos = startPos === undefined ? this.length : startPos;

    if (!re.global) {
        var flags = "g" + (re.multiline?"m":"") + (re.ignoreCase?"i":"");
        re = new RegExp(re.source, flags);
    }

    var lastSuccess = -1;
    for (var pos = 0; pos <= startPos; pos++) {
        re.lastIndex = pos;

        var match = re.exec(this);
        if (!match) break;

        pos = match.index;
        if (pos <= startPos) lastSuccess = pos;
    }

    return lastSuccess;
}
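
Setting lastIndex on the full string and slicing the string first are not interchangeable when the pattern contains an anchor. A small sketch of the difference, with a made-up string and variable names:

```javascript
var text = "abxab";

// Slice-based search: ^ anchors to the start of the slice.
var sliced = text.slice(2).search(/^x/);
sliced = sliced < 0 ? -1 : sliced + 2;
console.log(sliced); // 2

// lastIndex-based search: ^ still means the start of the whole string.
var anchoredRe = /^x/g;
anchoredRe.lastIndex = 2;
var anchoredMatch = anchoredRe.exec(text);
var anchored = anchoredMatch ? anchoredMatch.index : -1;
console.log(anchored); // -1
```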
Markus Jarderot
  • This seems the most promising so far (after a few sytax fixes) :-) Only failing a few tests on the edge conditions. Things like 'axx'.lastIndexOf('a',0) != 'axx'.regexLastIndexOf(/a/,0) ... I am looking into it to see if I can fix those cases – Pat Nov 08 '08 at 12:09
6

RegExp instances have a lastIndex property already (if they are global) and so what I'm doing is copying the regular expression, modifying it slightly to suit our purposes, exec-ing it on the string and looking at the lastIndex. This will inevitably be faster than looping on the string. (You have enough examples of how to put this onto the string prototype, right?)

function reIndexOf(reIn, str, startIndex) {
    var re = new RegExp(reIn.source, 'g' + (reIn.ignoreCase ? 'i' : '') + (reIn.multiline ? 'm' : ''));
    re.lastIndex = startIndex || 0;
    var res = re.exec(str);
    if(!res) return -1;
    return re.lastIndex - res[0].length;
};

function reLastIndexOf(reIn, str, startIndex) {
    var src = /\$$/.test(reIn.source) && !/\\\$$/.test(reIn.source) ? reIn.source : reIn.source + '(?![\\S\\s]*' + reIn.source + ')';
    var re = new RegExp(src, 'g' + (reIn.ignoreCase ? 'i' : '') + (reIn.multiline ? 'm' : ''));
    re.lastIndex = startIndex || 0;
    var res = re.exec(str);
    if(!res) return -1;
    return re.lastIndex - res[0].length;
};

reIndexOf(/[abc]/, "tommy can eat");  // Returns 6
reIndexOf(/[abc]/, "tommy can eat", 8);  // Returns 11
reLastIndexOf(/[abc]/, "tommy can eat"); // Returns 11

You could also prototype the functions onto the RegExp object:

RegExp.prototype.indexOf = function(str, startIndex) {
    var re = new RegExp(this.source, 'g' + (this.ignoreCase ? 'i' : '') + (this.multiline ? 'm' : ''));
    re.lastIndex = startIndex || 0;
    var res = re.exec(str);
    if(!res) return -1;
    return re.lastIndex - res[0].length;
};

RegExp.prototype.lastIndexOf = function(str, startIndex) {
    var src = /\$$/.test(this.source) && !/\\\$$/.test(this.source) ? this.source : this.source + '(?![\\S\\s]*' + this.source + ')';
    var re = new RegExp(src, 'g' + (this.ignoreCase ? 'i' : '') + (this.multiline ? 'm' : ''));
    re.lastIndex = startIndex || 0;
    var res = re.exec(str);
    if(!res) return -1;
    return re.lastIndex - res[0].length;
};


/[abc]/.indexOf("tommy can eat");  // Returns 6
/[abc]/.indexOf("tommy can eat", 8);  // Returns 11
/[abc]/.lastIndexOf("tommy can eat"); // Returns 11

A quick explanation of how I am modifying the RegExp: For indexOf I just have to ensure that the global flag is set. For lastIndexOf I am using a negative look-ahead to find the last occurrence unless the RegExp was already matching at the end of the string.
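
To make the look-ahead rewrite concrete, here is a sketch that builds the modified pattern by hand for a fixed source string. The variable names are illustrative only. The second half shows a case where this technique and lastIndexOf disagree because the matches overlap:

```javascript
var source = "[abc]";
// Match the pattern only where no further match of it follows.
var lastOnly = new RegExp(source + "(?![\\S\\s]*" + source + ")", "g");
var m = lastOnly.exec("tommy can eat");
console.log(m.index); // 11
console.log(m[0]);    // "a"

// With overlapping candidates the look-ahead starts after the match,
// so the result can differ from String.prototype.lastIndexOf.
var overlapping = new RegExp("aa(?![\\S\\s]*aa)", "g");
var overlapIndex = overlapping.exec("aaa").index;
console.log(overlapIndex);            // 0
console.log("aaa".lastIndexOf("aa")); // 1
```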

Prestaul
4

It does not natively, but you certainly can add this functionality

<script type="text/javascript">

String.prototype.regexIndexOf = function( pattern, startIndex )
{
    startIndex = startIndex || 0;
    var searchResult = this.substr( startIndex ).search( pattern );
    return ( -1 === searchResult ) ? -1 : searchResult + startIndex;
}

String.prototype.regexLastIndexOf = function( pattern, startIndex )
{
    startIndex = startIndex === undefined ? this.length : startIndex;
    var searchResult = this.substr( 0, startIndex ).reverse().regexIndexOf( pattern, 0 );
    return ( -1 === searchResult ) ? -1 : this.length - ++searchResult;
}

String.prototype.reverse = function()
{
    return this.split('').reverse().join('');
}

// Indexes 0123456789
var str = 'caabbccdda';

alert( [
        str.regexIndexOf( /[cd]/, 4 )
    ,   str.regexLastIndexOf( /[cd]/, 4 )
    ,   str.regexIndexOf( /[yz]/, 4 )
    ,   str.regexLastIndexOf( /[yz]/, 4 )
    ,   str.lastIndexOf( 'd', 4 )
    ,   str.regexLastIndexOf( /d/, 4 )
    ,   str.lastIndexOf( 'd' )
    ,   str.regexLastIndexOf( /d/ )
    ]
);

</script>

I didn't fully test these methods, but they seem to work so far.

Peter Bailey
  • Updated to handle those cases – Peter Bailey Nov 07 '08 at 22:57
  • everytime i am about to accept this answer i find a new case ! These give different results! alert( [str.lastIndexOf( /[d]/, 4 ), str.regexLastIndexOf( /[d]/, 4 )]); – Pat Nov 07 '08 at 23:20
  • well, of course they are - str.lastIndexOf will do type coercion on the pattern - converting it into a string. The string "/[d]/" most certainly is not found in the input, so the -1 returned is actually accurate. – Peter Bailey Nov 07 '08 at 23:29
  • Got it. After reading the spec on String.lastIndexOf() - I just misunderstood how that argument worked. This new version should handle it. – Peter Bailey Nov 07 '08 at 23:39
  • Something is still not right, but it is getting too late ... I'll try to get a test case, and maybe fix it in the morning. Sorry for the trouble so far. – Pat Nov 07 '08 at 23:55
  • Ya - I see a fatal flaw in my approach for regexLastIndexOf() that MizardX's solution does better. I'll see if I can cobble something together that encapsulates all this – Peter Bailey Nov 08 '08 at 00:05
  • I just added the test function to the question ... this fails this test (among others) 'axx'.lastIndexOf('a',1) != 'axx'.regexLastIndexOf(/a/,1) – Pat Nov 08 '08 at 12:05
3

I also needed a regexIndexOf function for arrays, so I wrote one myself. I doubt that it is optimized, but it should work properly.

Array.prototype.regexIndexOf = function (regex, startpos = 0) {
    var len = this.length;
    for (var x = startpos; x < len; x++) {
        if(typeof this[x] != 'undefined' && (''+this[x]).match(regex)){
            return x;
        }
    }
    return -1;
}

var arr = [];
arr.push(null);
arr.push(NaN);
arr[3] = 7;
arr.push('asdf');
arr.push('qwer');
arr.push(9);
arr.push('...');
console.log(arr);
arr.regexIndexOf(/\d/, 4);
jakov
2

After all the proposed solutions failed my tests one way or another (edit: some were updated to pass the tests after I wrote this), I found the Mozilla implementation of Array.indexOf and Array.lastIndexOf.

I used those to implement my version of String.prototype.regexIndexOf and String.prototype.regexLastIndexOf as follows:

String.prototype.regexIndexOf = function(elt /*, from*/)
  {
    var arr = this.split('');
    var len = arr.length;

    var from = Number(arguments[1]) || 0;
    from = (from < 0) ? Math.ceil(from) : Math.floor(from);
    if (from < 0)
      from += len;

    for (; from < len; from++) {
      if (from in arr && elt.exec(arr[from]) ) 
        return from;
    }
    return -1;
};

String.prototype.regexLastIndexOf = function(elt /*, from*/)
  {
    var arr = this.split('');
    var len = arr.length;

    var from = Number(arguments[1]);
    if (isNaN(from)) {
      from = len - 1;
    } else {
      from = (from < 0) ? Math.ceil(from) : Math.floor(from);
      if (from < 0)
        from += len;
      else if (from >= len)
        from = len - 1;
    }

    for (; from > -1; from--) {
      if (from in arr && elt.exec(arr[from]) )
        return from;
    }
    return -1;
  };

They seem to pass the test functions I provided in the question.

Obviously they only work if the regular expression matches one character but that is enough for my purpose since I will be using it for things like ( [abc] , \s , \W , \D )
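
The single-character limitation can be seen in a small sketch that tests each character separately, as the methods above do. The helper name firstMatchingChar is made up for this example:

```javascript
// Hypothetical helper: tests the regex against one character at a time.
function firstMatchingChar(str, re) {
    var chars = str.split('');
    for (var i = 0; i < chars.length; i++) {
        if (re.test(chars[i])) return i;
    }
    return -1;
}

console.log(firstMatchingChar("xabx", /[ab]/)); // 1
console.log(firstMatchingChar("xabx", /ab/));   // -1, no single character matches "ab"
console.log("xabx".search(/ab/));               // 1
```

The sketch assumes a regex without the g flag, since a global regex keeps lastIndex state between calls to test().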

I will keep monitoring the question in case someone provides a better/faster/cleaner/more generic implementation that works on any regular expression.

Pat
  • Wow, that is a long bit of code. Please check my updated answer and provide feedback. Thanks. – Jason Bunting Nov 08 '08 at 19:38
  • This implementation aims for absolute compatibility with lastIndexOf in Firefox and the SpiderMonkey JavaScript engine, including in several cases which are arguably edge cases. [...] in real-world applications, you may be able to calculate from with less complicated code if you ignore those cases. – Pat Nov 08 '08 at 22:26
  • From the Mozilla page :-) I just took the code and changed two lines, leaving all the edge cases. Since a couple of the other answers were updated to pass the tests, I will try benchmarking them and accept the most efficient when I have time to revisit the issue. – Pat Nov 08 '08 at 22:48
  • I updated my solution and appreciate any feedback or things that cause it to fail. I made a change to fix the overlapping problem pointed out by MizardX (hopefully!) – Jason Bunting Nov 08 '08 at 23:20
2

For a solution that's more concise than most of the other answers posted, you may want to use the String.prototype.replace function which will run a function on every detected pattern. For example:

let firstIndex = -1;
"the 1st numb3r".replace(/\d/,(p,i) => { firstIndex = i; });
// firstIndex === 4

This is especially useful for the "last index" case:

let lastIndex = -1;
"the l4st numb3r".replace(/\d/g,(p,i) => { lastIndex = i; });
// lastIndex === 13

Here, it's important to include the "g" modifier so that all occurrences are evaluated. These versions will also result in -1 if the regular expression was not found.

Finally, here are the more general functions which include a start index:

function indexOfRegex(str,regex,start = 0) {
    regex = regex.global ? regex : new RegExp(regex.source,regex.flags + "g");
    let index = -1;
    str.replace(regex,function() {
        const pos = arguments[arguments.length - 2];
        if(index < 0 && pos >= start)
            index = pos;
    });
    return index;
}

function lastIndexOfRegex(str,regex,start = str.length - 1) {
    regex = regex.global ? regex : new RegExp(regex.source,regex.flags + "g");
    let index = -1;
    str.replace(regex,function() {
        const pos = arguments[arguments.length - 2];
        if(pos <= start)
            index = pos;
    });
    return index;
}

These functions specifically avoid splitting the string at the start index which I feel is risky in the age of Unicode. They don't modify the prototype of common Javascript classes (although you're free to do so yourself). They accept more RegExp flags, for example "u" or "s" and any flags that may be added in the future. And I find it easier to reason about callback functions than for/while loops.
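
For readers unfamiliar with the replacer callback, this sketch shows the arguments it receives. As far as I know, engines that support named capture groups append a groups object as the final argument, in which case the offset is no longer second to last. The helper offsetFromReplaceArgs is a hypothetical workaround, not part of the functions above:

```javascript
// Capture the replacer arguments for a regex with two unnamed groups.
var seen = [];
"ab12".replace(/(\d)(\d)/, function () {
    seen = Array.prototype.slice.call(arguments);
});
console.log(seen); // ["12", "1", "2", 2, "ab12"]

// Hypothetical helper: the offset is the first numeric argument, because
// the match and the captures are always strings or undefined.
function offsetFromReplaceArgs(args) {
    for (var i = 1; i < args.length; i++) {
        if (typeof args[i] === "number") return args[i];
    }
    return -1;
}

var namedOffset = -1;
"ab12".replace(/(?<digit>\d)/, function () {
    namedOffset = offsetFromReplaceArgs(arguments);
});
console.log(namedOffset); // 2
```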

Oliver
1

In certain simple cases, you can simplify your backwards search by using split.

function regexlast(string,re){
  var tokens=string.split(re);
  return (tokens.length>1)?(string.length-tokens[tokens.length-1].length):null;
}

This has a few serious problems:

  1. overlapping matches won't show up
  2. returned index is for the end of the match rather than the beginning (fine if your regex is a constant)

But on the bright side it's way less code. For a constant-length regex that can't overlap (like /\s\w/ for finding word boundaries) this is good enough.
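
A quick sketch of what the function returns, using a made-up string. The function is repeated so the snippet runs on its own:

```javascript
// Same function as in the answer, repeated so this snippet is self-contained.
function regexlast(string, re) {
    var tokens = string.split(re);
    return (tokens.length > 1) ? (string.length - tokens[tokens.length - 1].length) : null;
}

var words = "ab cd ef";
console.log(regexlast(words, /\s\w/));    // 7, the end of the last match " e"
console.log(words.lastIndexOf(" e"));     // 5, the start of that match
console.log(regexlast("abcdef", /\s\w/)); // null when nothing matches
```

Subtracting the known match length (2 here) from the result gives the start index. Note that a miss is reported as null rather than -1.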

amwinter
1

The regexIndexOf from Jason Bunting can be inverted more simply and still support UTF8 characters by doing this:

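function regexLastIndexOf(string, regex, startpos=0) {
    // The original body referred to an undefined variable `text`.
    var index = regexIndexOf([...string].reverse().join(""), regex, startpos);
    // Report a miss as -1 instead of mapping it to string.length.
    return index < 0 ? -1 : string.length - index - 1;
}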
Tyler V.
0

For data with sparse matches, using string.search is the fastest across browsers. It re-slices a string each iteration to :

function lastIndexOfSearch(string, regex, index) {
  if(index === 0 || index)
     string = string.slice(0, Math.max(0,index));
  var idx;
  var offset = -1;
  while ((idx = string.search(regex)) !== -1) {
    offset += idx + 1;
    string = string.slice(idx + 1);
  }
  return offset;
}

For dense data I made this. It's complex compared to the execute method, but for dense data, it's 2-10x faster than every other method I tried, and about 100x faster than the accepted solution. The main points are:

  1. It calls exec on the regex passed in once to verify there is a match or quit early. I do this using (?= in a similar method, but on IE checking with exec is dramatically faster.
  2. It constructs and caches a modified regex in the format '.*(r)(?!.*?r)'
  3. The new regex is executed and the results from either that exec, or the first exec, are returned;

    function lastIndexOfGroupSimple(string, regex, index) {
        if (index === 0 || index) string = string.slice(0, Math.max(0, index + 1));
        regex.lastIndex = 0;
        var lastRegex, index,
        flags = 'g' + (regex.multiline ? 'm' : '') + (regex.ignoreCase ? 'i' : ''),
        key = regex.source + '$' + flags,
        match = regex.exec(string);
        if (!match) return -1;
        if (lastIndexOfGroupSimple.cache === undefined) lastIndexOfGroupSimple.cache = {};
        lastRegex = lastIndexOfGroupSimple.cache[key];
        if (!lastRegex)
            lastIndexOfGroupSimple.cache[key] = lastRegex = new RegExp('.*(' + regex.source + ')(?!.*?' + regex.source + ')', flags);
        index = match.index;
        lastRegex.lastIndex = match.index;
        return (match = lastRegex.exec(string)) ? lastRegex.lastIndex - match[1].length : index;
    };
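
The core of the modified regex can be tried in isolation. This sketch builds the '.*(r)(?!.*?r)' form for a fixed source and reads the position the same way the function above does. The variable names are illustrative:

```javascript
var source = "[abc]";
// Greedy .* runs to the end and backtracks to the last place where (r) matches.
var lastRegex = new RegExp(".*(" + source + ")(?!.*?" + source + ")", "g");
var text = "tommy can eat";
var match = lastRegex.exec(text);
var lastPos = lastRegex.lastIndex - match[1].length;

console.log(match[1]); // "a"
console.log(lastPos);  // 11
```

Since `.` does not match line terminators by default, this form presumably only behaves as intended on single-line input.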
    

jsPerf of methods

I don't understand the purpose of the tests up top. Situations that require a regex are impossible to compare against a call to indexOf, which I think is the point of making the method in the first place. To get the test to pass, it makes more sense to use 'xxx+(?!x)', than adjust the way the regex iterates.

npjohns
0

Jason Bunting's last index of does not work. Mine is not optimal, but it works.

//Jason Bunting's
String.prototype.regexIndexOf = function(regex, startpos) {
    var indexOf = this.substring(startpos || 0).search(regex);
    return (indexOf >= 0) ? (indexOf + (startpos || 0)) : indexOf;
}

String.prototype.regexLastIndexOf = function(regex, startpos) {
    var lastIndex = -1;
    var index = this.regexIndexOf( regex );
    startpos = startpos === undefined ? this.length : startpos;

    while ( index >= 0 && index < startpos )
    {
        lastIndex = index;
        index = this.regexIndexOf( regex, index + 1 );
    }
    return lastIndex;
}
Eli
  • Can you provide a test that causes mine to fail? If you found it doesn't work, provide a test case, why just say "it doesn't work" and provide a non-optimal solution in place? – Jason Bunting Nov 10 '15 at 20:37
  • 1
    Hoo boy. You're totally right. I should have provided an example. Unfortunately I moved on from this code months ago and have no idea what the fail case was. :-/ – Eli Nov 11 '15 at 22:00
  • well, such is life. :) – Jason Bunting Nov 17 '15 at 21:45
0

There are still no native methods that perform the requested task.

Here is the code that I am using. It mimics the behaviour of the String.prototype.indexOf and String.prototype.lastIndexOf methods, but the new methods also accept a RegExp as the search argument in addition to a string representing the value to search for.

Yes, it is quite long as answers go, because it tries to follow current standards as closely as possible and of course contains a reasonable amount of JSDoc comments. However, once minified, the code is only 2.27k, and once gzipped for transmission it is only 1023 bytes.

The 2 methods that this adds to String.prototype (using Object.defineProperty where available) are:

  1. searchOf
  2. searchLastOf

It passes all the tests that the OP posted and additionally I have tested the routines quite thoroughly in my daily usage, and have attempted to be sure that they work across multiple environments, but feedback/issues are always welcome.

/*jslint maxlen:80, browser:true */

/*
 * Properties used by searchOf and searchLastOf implementation.
 */

/*property
    MAX_SAFE_INTEGER, abs, add, apply, call, configurable, defineProperty,
    enumerable, exec, floor, global, hasOwnProperty, ignoreCase, index,
    lastIndex, lastIndexOf, length, max, min, multiline, pow, prototype,
    remove, replace, searchLastOf, searchOf, source, toString, value, writable
*/

/*
 * Properties used in the testing of searchOf and searchLastOf implementation.
 */

/*property
    appendChild, createTextNode, getElementById, indexOf, lastIndexOf, length,
    searchLastOf, searchOf, unshift
*/

(function () {
    'use strict';

    var MAX_SAFE_INTEGER = Number.MAX_SAFE_INTEGER || Math.pow(2, 53) - 1,
        getNativeFlags = new RegExp('\\/([a-z]*)$', 'i'),
        clipDups = new RegExp('([\\s\\S])(?=[\\s\\S]*\\1)', 'g'),
        pToString = Object.prototype.toString,
        pHasOwn = Object.prototype.hasOwnProperty,
        stringTagRegExp;

    /**
     * Defines a new property directly on an object, or modifies an existing
     * property on an object, and returns the object.
     *
     * @private
     * @function
     * @param {Object} object
     * @param {string} property
     * @param {Object} descriptor
     * @returns {Object}
     * @see https://goo.gl/CZnEqg
     */
    function $defineProperty(object, property, descriptor) {
        if (Object.defineProperty) {
            Object.defineProperty(object, property, descriptor);
        } else {
            object[property] = descriptor.value;
        }

        return object;
    }

    /**
     * Returns true if the operands are strictly equal with no type conversion.
     *
     * @private
     * @function
     * @param {*} a
     * @param {*} b
     * @returns {boolean}
     * @see http://www.ecma-international.org/ecma-262/5.1/#sec-11.9.4
     */
    function $strictEqual(a, b) {
        return a === b;
    }

    /**
     * Returns true if the operand inputArg is undefined.
     *
     * @private
     * @function
     * @param {*} inputArg
     * @returns {boolean}
     */
    function $isUndefined(inputArg) {
        return $strictEqual(typeof inputArg, 'undefined');
    }

    /**
     * Provides a string representation of the supplied object in the form
     * "[object type]", where type is the object type.
     *
     * @private
     * @function
     * @param {*} inputArg The object for which a class string representation
     *                     is required.
     * @returns {string} A string value of the form "[object type]".
     * @see http://www.ecma-international.org/ecma-262/5.1/#sec-15.2.4.2
     */
    function $toStringTag(inputArg) {
        var val;
        if (inputArg === null) {
            val = '[object Null]';
        } else if ($isUndefined(inputArg)) {
            val = '[object Undefined]';
        } else {
            val = pToString.call(inputArg);
        }

        return val;
    }

    /**
     * The string tag representation of a RegExp object.
     *
     * @private
     * @type {string}
     */
    stringTagRegExp = $toStringTag(getNativeFlags);

    /**
     * Returns true if the operand inputArg is a RegExp.
     *
     * @private
     * @function
     * @param {*} inputArg
     * @returns {boolean}
     */
    function $isRegExp(inputArg) {
        return $toStringTag(inputArg) === stringTagRegExp &&
                pHasOwn.call(inputArg, 'ignoreCase') &&
                typeof inputArg.ignoreCase === 'boolean' &&
                pHasOwn.call(inputArg, 'global') &&
                typeof inputArg.global === 'boolean' &&
                pHasOwn.call(inputArg, 'multiline') &&
                typeof inputArg.multiline === 'boolean' &&
                pHasOwn.call(inputArg, 'source') &&
                typeof inputArg.source === 'string';
    }

    /**
     * The abstract operation throws an error if its argument is a value that
     * cannot be converted to an Object, otherwise returns the argument.
     *
     * @private
     * @function
     * @param {*} inputArg The object to be tested.
     * @throws {TypeError} If inputArg is null or undefined.
     * @returns {*} The inputArg if coercible.
     * @see https://goo.gl/5GcmVq
     */
    function $requireObjectCoercible(inputArg) {
        var errStr;

        if (inputArg === null || $isUndefined(inputArg)) {
            errStr = 'Cannot convert argument to object: ' + inputArg;
            throw new TypeError(errStr);
        }

        return inputArg;
    }

    /**
     * The abstract operation converts its argument to a value of type string
     *
     * @private
     * @function
     * @param {*} inputArg
     * @returns {string}
     * @see https://people.mozilla.org/~jorendorff/es6-draft.html#sec-tostring
     */
    function $toString(inputArg) {
        var type,
            val;

        if (inputArg === null) {
            val = 'null';
        } else {
            type = typeof inputArg;
            if (type === 'string') {
                val = inputArg;
            } else if (type === 'undefined') {
                val = type;
            } else {
                if (type === 'symbol') {
                    throw new TypeError('Cannot convert symbol to string');
                }

                val = String(inputArg);
            }
        }

        return val;
    }

    /**
     * Returns a string only if the argument is coercible; otherwise throws an
     * error.
     *
     * @private
     * @function
     * @param {*} inputArg
     * @throws {TypeError} If inputArg is null or undefined.
     * @returns {string}
     */
    function $onlyCoercibleToString(inputArg) {
        return $toString($requireObjectCoercible(inputArg));
    }

    /**
     * The function evaluates the passed value and converts it to an integer.
     *
     * @private
     * @function
     * @param {*} inputArg The object to be converted to an integer.
     * @returns {number} If the target value is NaN, null or undefined, 0 is
     *                   returned. If the target value is false, 0 is returned
     *                   and if true, 1 is returned.
     * @see http://www.ecma-international.org/ecma-262/5.1/#sec-9.4
     */
    function $toInteger(inputArg) {
        var number = +inputArg,
            val = 0;

        if ($strictEqual(number, number)) {
            if (!number || number === Infinity || number === -Infinity) {
                val = number;
            } else {
                val = (number > 0 || -1) * Math.floor(Math.abs(number));
            }
        }

        return val;
    }

    /**
     * Copies a regex object. Allows adding and removing native flags while
     * copying the regex.
     *
     * @private
     * @function
     * @param {RegExp} regex Regex to copy.
     * @param {Object} [options] Allows specifying native flags to add or
     *                           remove while copying the regex.
     * @returns {RegExp} Copy of the provided regex, possibly with modified
     *                   flags.
     */
    function $copyRegExp(regex, options) {
        var flags,
            opts,
            rx;

        if (options !== null && typeof options === 'object') {
            opts = options;
        } else {
            opts = {};
        }

        // Get native flags in use
        flags = getNativeFlags.exec($toString(regex))[1];
        flags = $onlyCoercibleToString(flags);
        if (opts.add) {
            flags += opts.add;
            flags = flags.replace(clipDups, '');
        }

        if (opts.remove) {
            // Would need to escape `options.remove` if this was public
            rx = new RegExp('[' + opts.remove + ']+', 'g');
            flags = flags.replace(rx, '');
        }

        return new RegExp(regex.source, flags);
    }

    /**
     * The abstract operation ToLength converts its argument to an integer
     * suitable for use as the length of an array-like object.
     *
     * @private
     * @function
     * @param {*} inputArg The object to be converted to a length.
     * @returns {number} If len <= +0 then +0 else if len is +INFINITY then
     *                   2^53-1 else min(len, 2^53-1).
     * @see https://people.mozilla.org/~jorendorff/es6-draft.html#sec-tolength
     */
    function $toLength(inputArg) {
        return Math.min(Math.max($toInteger(inputArg), 0), MAX_SAFE_INTEGER);
    }

    /**
     * Copies a regex object so that it is suitable for use with searchOf and
     * searchLastOf methods.
     *
     * @private
     * @function
     * @param {RegExp} regex Regex to copy.
     * @returns {RegExp}
     */
    function $toSearchRegExp(regex) {
        return $copyRegExp(regex, {
            add: 'g',
            remove: 'y'
        });
    }

    /**
     * Returns true if the operand inputArg is a member of one of the types
     * Undefined, Null, Boolean, Number, Symbol, or String.
     *
     * @private
     * @function
     * @param {*} inputArg
     * @returns {boolean}
     * @see https://goo.gl/W68ywJ
     * @see https://goo.gl/ev7881
     */
    function $isPrimitive(inputArg) {
        var type = typeof inputArg;

        return type === 'undefined' ||
                inputArg === null ||
                type === 'boolean' ||
                type === 'string' ||
                type === 'number' ||
                type === 'symbol';
    }

    /**
     * The abstract operation converts its argument to a value of type Object
     * but fixes some environment bugs.
     *
     * @private
     * @function
     * @param {*} inputArg The argument to be converted to an object.
     * @throws {TypeError} If inputArg is not coercible to an object.
     * @returns {Object} Value of inputArg as type Object.
     * @see http://www.ecma-international.org/ecma-262/5.1/#sec-9.9
     */
    function $toObject(inputArg) {
        var object;

        if ($isPrimitive($requireObjectCoercible(inputArg))) {
            object = Object(inputArg);
        } else {
            object = inputArg;
        }

        return object;
    }

    /**
     * Converts a single argument that is an array-like object or list (eg.
     * arguments, NodeList, DOMTokenList (used by classList), NamedNodeMap
     * (used by attributes property)) into a new Array() and returns it.
     * This is a partial implementation of the ES6 Array.from
     *
     * @private
     * @function
     * @param {Object} arrayLike
     * @returns {Array}
     */
    function $toArray(arrayLike) {
        var object = $toObject(arrayLike),
            length = $toLength(object.length),
            array = [],
            index = 0;

        array.length = length;
        while (index < length) {
            array[index] = object[index];
            index += 1;
        }

        return array;
    }

    if (!String.prototype.searchOf) {
        /**
         * This method returns the index within the calling String object of
         * the first occurrence of the specified value, starting the search at
         * fromIndex. Returns -1 if the value is not found.
         *
         * @function
         * @this {string}
         * @param {RegExp|string} regex A regular expression object or a String.
         *                              Anything else is implicitly converted to
         *                              a String.
         * @param {Number} [fromIndex] The location within the calling string
         *                             to start the search from. It can be any
         *                             integer. The default value is 0. If
         *                             fromIndex < 0 the entire string is
         *                             searched (same as passing 0). If
         *                             fromIndex >= str.length, the method will
         *                             return -1 unless searchValue is an empty
         *                             string in which case str.length is
         *                             returned.
         * @returns {Number} If successful, returns the index of the first
         *                   match of the regular expression inside the
         *                   string. Otherwise, it returns -1.
         */
        $defineProperty(String.prototype, 'searchOf', {
            enumerable: false,
            configurable: true,
            writable: true,
            value: function (regex) {
                var str = $onlyCoercibleToString(this),
                    args = $toArray(arguments),
                    result = -1,
                    fromIndex,
                    match,
                    rx;

                if (!$isRegExp(regex)) {
                    return String.prototype.indexOf.apply(str, args);
                }

                if ($toLength(args.length) > 1) {
                    fromIndex = +args[1];
                    if (fromIndex < 0) {
                        fromIndex = 0;
                    }
                } else {
                    fromIndex = 0;
                }

                if (fromIndex >= $toLength(str.length)) {
                    return result;
                }

                rx = $toSearchRegExp(regex);
                rx.lastIndex = fromIndex;
                match = rx.exec(str);
                if (match) {
                    result = +match.index;
                }

                return result;
            }
        });
    }

    if (!String.prototype.searchLastOf) {
        /**
         * This method returns the index within the calling String object of
         * the last occurrence of the specified value, or -1 if not found.
         * The calling string is searched backward, starting at fromIndex.
         *
         * @function
         * @this {string}
         * @param {RegExp|string} regex A regular expression object or a String.
         *                              Anything else is implicitly converted to
         *                              a String.
         * @param {Number} [fromIndex] Optional. The location within the
         *                             calling string to start the search at,
         *                             indexed from left to right. It can be
         *                             any integer. The default value is
         *                             str.length. If it is negative, it is
         *                             treated as 0. If fromIndex > str.length,
         *                             fromIndex is treated as str.length.
         * @returns {Number} If successful, returns the index of the last
         *                   match of the regular expression that starts at or
         *                   before fromIndex. Otherwise, it returns -1.
         */
        $defineProperty(String.prototype, 'searchLastOf', {
            enumerable: false,
            configurable: true,
            writable: true,
            value: function (regex) {
                var str = $onlyCoercibleToString(this),
                    args = $toArray(arguments),
                    result = -1,
                    fromIndex,
                    length,
                    match,
                    pos,
                    rx;

                if (!$isRegExp(regex)) {
                    return String.prototype.lastIndexOf.apply(str, args);
                }

                length = $toLength(str.length);
                if (!$strictEqual(args[1], args[1])) {
                    fromIndex = length;
                } else {
                    if ($toLength(args.length) > 1) {
                        fromIndex = $toInteger(args[1]);
                    } else {
                        fromIndex = length - 1;
                    }
                }

                if (fromIndex >= 0) {
                    fromIndex = Math.min(fromIndex, length - 1);
                } else {
                    fromIndex = length - Math.abs(fromIndex);
                }

                pos = 0;
                rx = $toSearchRegExp(regex);
                while (pos <= fromIndex) {
                    rx.lastIndex = pos;
                    match = rx.exec(str);
                    if (!match) {
                        break;
                    }

                    pos = +match.index;
                    if (pos <= fromIndex) {
                        result = pos;
                    }

                    pos += 1;
                }

                return result;
            }
        });
    }
}());

(function () {
    'use strict';

    /*
     * Test as follows to make sure that, at least for a one-character regexp,
     * the result is the same as with indexOf and lastIndexOf.
     */

    var pre = document.getElementById('out');

    function log(result) {
        pre.appendChild(document.createTextNode(result + '\n'));
    }

    function test(str) {
        var i = str.length + 2,
            r,
            a,
            b;

        while (i) {
            a = str.indexOf('a', i);
            b = str.searchOf(/a/, i);
            r = ['Failed', 'searchOf', str, i, a, b];
            if (a === b) {
                r[0] = 'Passed';
            }

            log(r);
            a = str.lastIndexOf('a', i);
            b = str.searchLastOf(/a/, i);
            r = ['Failed', 'searchLastOf', str, i, a, b];
            if (a === b) {
                r[0] = 'Passed';
            }

            log(r);
            i -= 1;
        }
    }

    /*
     * Look for the a among the xes
     */

    test('xxx');
    test('axx');
    test('xax');
    test('xxa');
    test('axa');
    test('xaa');
    test('aax');
    test('aaa');
}());
<pre id="out"></pre>
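Stripped of the compatibility helpers, the core idea of searchOf above is to copy the regex with the g flag set and position lastIndex before calling exec. A minimal sketch of that idea, assuming an ES2015+ engine (it relies on RegExp.prototype.flags); the standalone name regexIndexOf is hypothetical and not part of the code above:

```javascript
// Hypothetical helper: first index >= fromIndex at which `regex` matches, else -1.
function regexIndexOf(str, regex, fromIndex = 0) {
  // Copy the regex with `g` added and `y` removed, so exec() honours
  // lastIndex and the caller's regex object is left untouched.
  const flags = regex.flags.replace(/[gy]/g, '') + 'g';
  const rx = new RegExp(regex.source, flags);

  // Negative or non-numeric start positions are treated as 0.
  rx.lastIndex = Math.max(0, Math.trunc(fromIndex) || 0);

  const match = rx.exec(str);
  return match ? match.index : -1;
}

console.log(regexIndexOf('hello world', /\s/, 0));
console.log(regexIndexOf('axa', /a/, 1));
```

Unlike indexOf, this sketch does not special-case an empty pattern at the end of the string.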
Xotic750
0

If you are looking for a very simple last-index lookup with a RegExp and don't care whether it mimics lastIndexOf to the last detail, this may interest you.

I simply reverse the string and subtract the index of the first occurrence from length - 1. It passes my tests, but because the pattern is run against the reversed text, it is only reliable for patterns that match a single character (or that read the same in both directions), and reversing long strings could become a performance issue.

interface String {
  reverse(): string;
  lastIndex(regex: RegExp): number;
}

String.prototype.reverse = function(this: string) {
  return this.split("")
    .reverse()
    .join("");
};

String.prototype.lastIndex = function(this: string, regex: RegExp) {
  const exec = regex.exec(this.reverse());
  return exec === null ? -1 : this.length - 1 - exec.index;
};
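For patterns longer than one character, scanning forward and remembering the last hit avoids the reversal altogether. A rough sketch in plain JavaScript, assuming an ES2015+ engine; regexLastIndexOf and its fromIndex parameter are hypothetical names used only for illustration:

```javascript
// Hypothetical helper: start index of the last match that begins at or
// before fromIndex, or -1 if there is none.
function regexLastIndexOf(str, regex, fromIndex = str.length) {
  // Work on a global copy so exec() advances through the string.
  const flags = regex.flags.replace(/[gy]/g, '') + 'g';
  const rx = new RegExp(regex.source, flags);

  let result = -1;
  let match;
  while ((match = rx.exec(str)) !== null) {
    if (match.index > fromIndex) {
      break;
    }
    result = match.index;
    // Resume one character after this match's start, so overlapping
    // matches are seen and empty matches cannot loop forever.
    rx.lastIndex = match.index + 1;
  }
  return result;
}

console.log(regexLastIndexOf('abcabc', /bc/));
```

The trade-off is one exec call per match, so a string with very many matches is scanned more slowly than with a true backward search.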
Armin Bu
0

I used String.prototype.match(regex), which returns an array of all matched substrings when the regex has the g flag:

function getLastIndex(text, regex, limit = text.length) {
  const matches = text.match(regex);

  // no matches found
  if (!matches) {
    return -1;
  }

  // first match already starts beyond limit
  if (text.indexOf(matches[0]) > limit) {
    return -1;
  }

  // walk back through the matched substrings until one occurs at or before limit
  let i = matches.length - 1;
  let index = text.lastIndexOf(matches[i], limit);
  while (index === -1 && i > 0) {
    i--;
    index = text.lastIndexOf(matches[i], limit);
  }
  return index;
}

// expect -1 as first index === 14
console.log(getLastIndex('First Sentence. Last Sentence. Unfinished', /\. /g, 10));

// expect 29
console.log(getLastIndex('First Sentence. Last Sentence. Unfinished', /\. /g));
wfreude
0
var mystring = "abc ab a";
var re  = new RegExp("ab"); // any regex here

if ( re.exec(mystring) != null ){ 
   alert("matches"); // true in this case
}

Use standard regular expressions:

var re  = new RegExp("^ab");  // At front
var re  = new RegExp("ab$");  // At end
var re  = new RegExp("ab(c|d)");  // abc or abd
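exec() reports more than a yes/no answer: the match object it returns carries the start position in its index property. A small sketch along those lines, reusing the variable names above; the offset variable `from` is an assumption added for illustration, and the slice-then-search step is a common workaround rather than something this answer states:

```javascript
const mystring = 'abc ab a';
const re = new RegExp('ab'); // any regex without the g flag

// Position of the first match, or -1 when there is none.
const m = re.exec(mystring);
const firstIndex = m === null ? -1 : m.index;

// Hypothetical start offset: search only the tail, then shift the result back.
const from = 2;
const rel = mystring.slice(from).search(re);
const indexFrom = rel === -1 ? -1 : rel + from;

console.log(firstIndex, indexFrom); // 0 4
```

Note that slicing changes what `^` and lookbehind assertions see, since the tail is treated as a string of its own.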
user984003
0

let regExp; // your RegExp here
arr.map(x => !!x.toString().match(regExp)).indexOf(true)
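Note that this one-liner searches an array (index of the first element whose string form matches), not positions within a single string. The same lookup can be sketched with Array.prototype.findIndex, which stops at the first hit; the sample data below is made up for illustration:

```javascript
// Hypothetical sample data standing in for the answer's `arr` and `regExp`.
const arr = ['foo', 42, 'bar', 'baz'];
const regExp = /^ba/; // no g flag, so test() keeps no state between calls

// Index of the first element whose string form matches, or -1 if none does.
const first = arr.findIndex(x => regExp.test(String(x)));

// Scanning a reversed copy gives the index of the last matching element.
const fromEnd = [...arr].reverse().findIndex(x => regExp.test(String(x)));
const last = fromEnd === -1 ? -1 : arr.length - 1 - fromEnd;

console.log(first, last); // 2 3
```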
0

You can use String.prototype.matchAll(), together with the convenient Array.prototype.at():

const str = "foo a foo B";
const matches = [...str.matchAll(/[abc]/gi)];

if (matches.length) {
  const indexOfFirstMatch = matches.at(0).index;
  const indexOfLastMatch = matches.at(-1).index;
  console.log(indexOfFirstMatch, indexOfLastMatch)
}
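The same list of matches can also stand in for the second parameter of indexOf and lastIndexOf by filtering on each match's index. A sketch under that assumption; the position variable `from` is hypothetical, and the code needs an engine that supports matchAll, Array.prototype.at and the ?? operator:

```javascript
const str = 'foo a foo B';
const from = 5; // hypothetical start/limit position

// matchAll requires the g flag; each match object carries its start index.
const indices = Array.from(str.matchAll(/[abc]/gi), m => m.index);

// First match at or after `from`, like indexOf(x, from).
const firstAtOrAfter = indices.find(i => i >= from) ?? -1;

// Last match at or before `from`, like lastIndexOf(x, from).
const lastAtOrBefore = indices.filter(i => i <= from).at(-1) ?? -1;

console.log(firstAtOrAfter, lastAtOrBefore); // 10 4
```

This collects every match up front, so it trades some extra work on long strings for brevity.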
ruohola
-2

Well, as you are just looking to match the position of a character, a regex is possibly overkill.

I presume that, instead of "find the first occurrence of this character", all you want is "find the first occurrence of any of these characters".

This is of course the simple answer, but it does what your question sets out to do, albeit without the regex part (because you didn't clarify why specifically it had to be a regex).

function mIndexOf( str , chars, offset )
{
   var first  = -1; 
   for( var i = 0; i < chars.length;  i++ )
   {
      var p = str.indexOf( chars[i] , offset ); 
      // ignore characters that were not found; otherwise keep the smallest index
      if( p !== -1 && ( first === -1 || p < first ) )
      {
           first = p;
      }
   }
   return first; 
}
String.prototype.mIndexOf = function( chars, offset )
{
   return mIndexOf( this, chars, offset ); // I'm really averse to monkey patching.
};
mIndexOf( "hello world", ['a','o','w'], 0 );
>> 4 
mIndexOf( "hello world", ['a'], 0 );
>> -1 
mIndexOf( "hello world", ['a','o','w'], 4 );
>> 4
mIndexOf( "hello world", ['a','o','w'], 5 );
>> 6
mIndexOf( "hello world", ['a','o','w'], 7 );
>> 7
mIndexOf( "hello world", ['a','o','w','d'], 7 );
>> 7
mIndexOf( "hello world", ['a','o','w','d'], 10 );
>> 10
mIndexOf( "hello world", ['a','o','w','d'], 11 );
>> -1
Kent Fredric
  • Just a comment about monkey patching - while I'm aware of its problems - you think polluting the global namespace is better? It's not like symbol conflicts in BOTH cases can't happen, and are basically refactored/repaired in the same way should a problem arise. – Peter Bailey Nov 07 '08 at 23:56
  • Well I need to search for \s and in some cases \W and was hoping I didn't have to enumerate all possibilities. – Pat Nov 08 '08 at 00:00
  • BaileyP: you can get around this problem without global namespace pollution, ie: see jQuery for example. use that model. one object for project, your stuff goes inside it. Mootools left a bad taste in my mouth. – Kent Fredric Nov 08 '08 at 06:33
  • also to be noted i never code like i wrote there. the example was simplified for use-case reasons. – Kent Fredric Nov 08 '08 at 06:36