10

I would like to know how to match a string against an array of regular expressions.
I know how to do this by looping through the array.
I also know how to do this by building one long regular expression with the alternatives separated by |.
I was hoping for a more efficient way, something like:

if (string contains one of the values in array) {

For example:

string = "the word tree is in this sentence";  
array[0] = "dog";  
array[1] = "cat";  
array[2] = "bird";  
array[3] = "birds can fly";  

In the above example, the condition would be false.
However, string = "She told me birds can fly and I agreed" would return true.

ROMANIA_engineer
Xi Vix
  • The reason I would like to use an array instead of | is the array may get large with hundreds of entries – Xi Vix Jun 02 '11 at 20:27
  • What about the sentence “The caterpillar …” that contains “cat” as part of another word? – Gumbo Jun 02 '11 at 20:30
  • That's fine ... I can tweak the regular expressions with word boundaries if necessary – Xi Vix Jun 02 '11 at 20:31
  • @xivix: So you’re looking for strict matches, right? Because in that case there is a more efficient approach. – Gumbo Jun 02 '11 at 20:36
  • Not sure what you mean by strict matches. The example was simplified however I will be using the various regular expression functionality to match a variety of comparison strings with the input string. For example, if I wanted to isolate cat I would use \bcat\b – Xi Vix Jun 02 '11 at 20:41
  • @xivix: I thought you were trying to match against whole words. – Gumbo Jun 02 '11 at 20:45
  • I don't see how any solution would not involve looping through the whole array – Liviu T. Jun 02 '11 at 20:45
  • Is it more efficient to loop through the array or put the whole array in one long regular expression separated by bars? – Xi Vix Jun 02 '11 at 20:54

4 Answers

22

How about creating a regular expression on the fly when you need it (assuming the array changes over time)?

if( (new RegExp( '\\b' + array.join('\\b|\\b') + '\\b') ).test(string) ) {
  alert('match');
}

demo:

string = "the word tree is in this sentence"; 
var array = [];
array[0] = "dog";  
array[1] = "cat";  
array[2] = "bird";  
array[3] = "birds can fly";  

if( (new RegExp( '\\b' + array.join('\\b|\\b') + '\\b') ).test(string) ){
    alert('match');
}
else{
    alert('no match');
}

For browsers that support JavaScript 1.6, you can use the some() method:

if ( array.some(function(item){return (new RegExp('\\b'+item+'\\b')).test(string);}) ) {
 alert('match');
}

demo:

string = "the word tree is in this sentence"; 
var array = [];
array[0] = "dog";  
array[1] = "tree";  
array[2] = "bird";  
array[3] = "birds can fly";  

if ( array.some(function(i){return (new RegExp('\\b'+i+'\\b')).test(string);}) ) {
 alert('match');
}
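
Both snippets above interpolate the array entries into the pattern as-is, so an entry containing a character such as `.` or `(` would be read as regex syntax, and the pattern is rebuilt on every check. A minimal sketch of one way around both, assuming the entries are literal phrases rather than patterns: escape each entry, wrap the alternation in a single pair of word boundaries, and compile once for reuse. The names `escapeRegExp` and `buildMatcher` are hypothetical helpers, not part of the answer.

```javascript
// Hypothetical helpers for illustration; not part of the original answer.

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build one word-bounded alternation from an array of literal phrases.
function buildMatcher(phrases) {
  var alternatives = phrases.map(escapeRegExp).join('|');
  return new RegExp('\\b(?:' + alternatives + ')\\b');
}

var phrases = ['dog', 'cat', 'bird', 'birds can fly'];
var matcher = buildMatcher(phrases); // compile once, reuse for every input

console.log(matcher.test('the word tree is in this sentence'));      // false
console.log(matcher.test('She told me birds can fly and I agreed')); // true
console.log(matcher.test('The caterpillar ate a leaf'));             // false
```

Note that `\b` only behaves as expected when each phrase starts and ends with a word character; phrases that begin or end with punctuation would need a different boundary check.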
Erfan Bahramali
Gabriele Petrioli
  • Well, that's the bar method I was mentioning above. I know how to set up a long regular expression with the bars ... but it will end up being a VERY long regular expression and I was hoping for a more elegant / efficient solution. Was just wondering if Javascript had any built-in functionality which addressed this. – Xi Vix Jun 02 '11 at 20:50
  • If there is no elegant way to do this, then is it better to loop through the array and run a simple regular expression x times? Or to put all of the regular expressions in one long complex regular expression? – Xi Vix Jun 02 '11 at 20:53
  • This worked. Thanks :) It appears to be more efficient than the method I was using. – Xi Vix Jun 02 '11 at 21:15
  • 1
    Just wanted to add that my package, http://npm.im/regexr, might help make the regex manipulation easier (escaping is not needed). – trusktr Apr 04 '16 at 21:06
5

(Many years later)

My version of @Gaby's answer, as I needed a way to check CORS origin against regular expressions in an array:

var corsWhitelist = [/^(?:.+\.)?domain\.com/, /^(?:.+\.)?otherdomain\.com/];

var corsCheck = function(origin, callback) {
  // Each whitelist entry is already a RegExp, so it can be tested directly.
  var allowed = corsWhitelist.some(function(item) {
    return item.test(origin);
  });
  callback(null, allowed);
};

corsCheck('otherdomain.com', function(err, result) {
  console.log('CORS match for otherdomain.com: ' + result);  
});

corsCheck('forbiddendomain.com', function(err, result) {
  console.log('CORS match for forbiddendomain.com: ' + result);  
});
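
One thing to watch with the whitelist above: the patterns are anchored at the start but not at the end, so a host such as `domain.com.attacker.test` would also pass. A minimal sketch of an end-anchored variant follows. The names `anchoredWhitelist` and `isAllowedHost` and the sample hosts are hypothetical, and the sketch assumes the value being checked is a bare host name, as in the calls above, rather than a full origin with a scheme.

```javascript
// Hypothetical whitelist: same patterns as above, with a trailing $ added.
var anchoredWhitelist = [
  /^(?:.+\.)?domain\.com$/,
  /^(?:.+\.)?otherdomain\.com$/
];

// Returns true if any whitelist pattern matches the whole host name.
function isAllowedHost(host) {
  return anchoredWhitelist.some(function (pattern) {
    return pattern.test(host);
  });
}

console.log(isAllowedHost('otherdomain.com'));          // true
console.log(isAllowedHost('api.domain.com'));           // true
console.log(isAllowedHost('domain.com.attacker.test')); // false
console.log(isAllowedHost('forbiddendomain.com'));      // false
```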
Ville
0

Is this OK?

function checkForMatch(string, array) {
    for (var i = 0; i < array.length; i++) {
        // \b matches at word boundaries, including the start and end of the string.
        var patt = new RegExp("\\b" + array[i] + "\\b");
        if (patt.test(string)) {
            return true;  // stop at the first match
        }
    }
    return false;
}

string = "She told me birds can fly and I agreed"; 

var array = new Array();
array[0] = "dog";  
array[1] = "cat";  
array[2] = "bird";  
array[3] = "birds can fly";


alert(checkForMatch(string, array));
T1000
  • Well, thank you but I already knew how to loop through the array, which is what I am doing now. I guess the SOME function mentioned by Gaby above is the closest I'm going to get to what I'm looking for. – Xi Vix Jun 02 '11 at 21:08
0

If you have the literal strings you want to match in an array called strings, you can combine them into an alternation like this:

new RegExp(strings.map(
    function (x) {  // Escape regex metacharacters like '|' and '$'.
      return x.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }).join("|"))

If you don't have only literal strings and instead want to combine regular expressions, you can use combinePrefixPatterns from google-code-prettify: http://code.google.com/p/google-code-prettify/source/browse/trunk/js-modules/combinePrefixPatterns.js

/**
 * Given a group of {@link RegExp}s, returns a {@code RegExp} that globally
 * matches the union of the sets of strings matched by the input RegExp.
 * Since it matches globally, if the input strings have a start-of-input
 * anchor (/^.../), it is ignored for the purposes of unioning.
 * @param {Array.<RegExp>} regexs non multiline, non-global regexs.
 * @return {RegExp} a global regex.
 */
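
For simple cases, the same idea can be sketched by joining each pattern's `source` inside a non-capturing group. This is not the prettify implementation, only a much simpler stand-in: the helper name `unionOf` is hypothetical, and it assumes the input patterns use no flags and no backreferences (group numbers would shift once the patterns are joined).

```javascript
// Hypothetical helper; a simplified stand-in, not combinePrefixPatterns itself.
// Assumes flag-free input patterns without backreferences.
function unionOf(regexes) {
  var sources = regexes.map(function (re) {
    // Wrap each source so its own alternations stay self-contained.
    return '(?:' + re.source + ')';
  });
  return new RegExp(sources.join('|'));
}

var combined = unionOf([/\bcat\b/, /birds? can fly/, /\d{3}-\d{4}/]);

console.log(combined.test('The caterpillar ate a leaf')); // false
console.log(combined.test('She told me birds can fly'));   // true
console.log(combined.test('call 555-1234'));               // true
```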
Mike Samuel