130

I wanted to write a regex to count the number of spaces, tabs, and newlines in a chunk of text. So I naively wrote the following:

numSpaces : function(text) { 
    return text.match(/\s/).length; 
}

For some reason it always returns 1. What is the problem with the above statement? I have since solved the problem with the following:

numSpaces : function(text) { 
    return (text.split(/\s/).length -1); 
}
Max
wai

9 Answers

250

tl;dr: Generic Pattern Counter

// THIS IS WHAT YOU NEED
const count = (str) => {
  const re = /YOUR_PATTERN_HERE/g
  return ((str || '').match(re) || []).length
}

For those that arrived here looking for a generic way to count the number of occurrences of a regex pattern in a string, and don't want it to fail if there are zero occurrences, this code is what you need. Here's a demonstration:

/*
 *  Example
 */

const count = (str) => {
  const re = /[a-z]{3}/g
  return ((str || '').match(re) || []).length
}

const str1 = 'abc, def, ghi'
const str2 = 'ABC, DEF, GHI'

console.log(`'${str1}' has ${count(str1)} occurrences of pattern '/[a-z]{3}/g'`)
console.log(`'${str2}' has ${count(str2)} occurrences of pattern '/[a-z]{3}/g'`)

Original Answer

The problem with your initial code is that you are missing the global flag (g):

>>> 'hi there how are you'.match(/\s/g).length;
4

Without the g flag, match() returns only the first occurrence and stops there.
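
To see why the original version always returned 1: without the g flag, match() returns a single match result (an array holding the full match plus one entry per capture group), so its length says nothing about the number of occurrences. A minimal sketch, using a made-up sample string:

```javascript
const text = 'hi there how are you';

// Without g: an array describing the first match only.
const first = text.match(/\s/);
console.log(first.length); // 1

// Without g but with a capture group: full match + one group.
const withGroup = text.match(/(\s)/);
console.log(withGroup.length); // 2

// With g: one array entry per match.
const all = text.match(/\s/g);
console.log(all.length); // 4
```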

Also note that your regex counts each whitespace character separately, so two consecutive spaces count as two:

>>> 'hi  there'.match(/\s/g).length;
2

If that is not desirable, you could do this:

>>> 'hi  there'.match(/\s+/g).length;
1
Trevor
Paolo Bergantino
  • 5
    This works as long as you have at least one space in your input. Otherwise, match() annoyingly returns null. – sfink Apr 28 '11 at 23:46
  • 3
    sfink is right, you definitely want to check if match() returned null: `var result = text.match(/\s/g); return result ? result.length : 0;` – Gras Double Sep 11 '11 at 12:52
  • 39
    You can also protect against the null by using this construction: `( str.match(...) || [] ).length` – a'r Nov 03 '11 at 17:10
  • What's the problem with `('string'.match(/\s/g) || []).length` ? – João Pimentel Ferreira Jan 07 '21 at 14:04
  • @JoãoPimentelFerreira The difference is: if `str` is null, then `str.match()` will fail but `(str || '').match()` will not. – jkdev Jan 05 '22 at 23:37
11

As mentioned in my earlier answer, you can use RegExp.exec() to iterate over all matches and count each occurrence. The only advantage is lower memory use, since no array of matches is built; on the whole it's about 20% slower than using String.match().

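function countWhitespace(text) {
    // The g flag is required: without it exec() restarts at index 0 every time
    var re = /\s/g,
        count = 0;

    while (re.exec(text) !== null) {
        ++count;
    }

    return count;
}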
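
One trap with the exec() loop is a pattern that can match the empty string: exec() does not advance lastIndex after a zero-length match, so the loop would never end. The sketch below is one way to guard against that; the name countWithExec and the choice to copy the regex are my own assumptions, not part of the answer above.

```javascript
// Hypothetical helper: counts matches with exec(), tolerating zero-length matches.
function countWithExec(text, pattern) {
  // Work on a copy so the caller's lastIndex is untouched, and force the g flag.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  let count = 0;
  let m;
  while ((m = re.exec(text)) !== null) {
    ++count;
    // A zero-length match leaves lastIndex where it was; step forward manually.
    if (m[0].length === 0) {
      re.lastIndex++;
    }
  }
  return count;
}

console.log(countWithExec('a b\tc\n', /\s/g)); // 3
console.log(countWithExec('abc', /x*/g));      // 4 (an empty match at each position)
```

The second result agrees with 'abc'.match(/x*/g).length, which is also 4.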
Community
Ja͢ck
11
(('a a a').match(/b/g) || []).length; // 0
(('a a a').match(/a/g) || []).length; // 3

Based on https://stackoverflow.com/a/48195124/16777 but fixed to actually work in zero-results case.

Kev
5

Here is a similar solution to @Paolo Bergantino's answer, but with modern operators. I'll explain below.

    const matchCount = (str, re) => {
      return str?.match(re)?.length ?? 0;
    };

    // usage
    
    let numSpaces = matchCount(undefined, /\s/g);
    console.log(numSpaces); // 0
    numSpaces = matchCount("foobarbaz", /\s/g);
    console.log(numSpaces); // 0
    numSpaces = matchCount("foo bar baz", /\s/g);
    console.log(numSpaces); // 2

?. is the optional chaining operator. It allows you to chain calls as deep as you want without having to worry about whether there is an undefined/null along the way. Think of str?.match(re) as

if (str !== undefined && str !== null) {
    return str.match(re);
} else {
    return undefined;
}

This is slightly different from @Paolo Bergantino's. Theirs is written like this: (str || ''). That means if str is falsy, return ''. 0 is falsy. document.all is falsy. In my opinion, if someone were to pass those into this function as a string, it would probably be because of programmer error. Therefore, I'd rather be informed I'm doing something non-sensible than troubleshoot why I keep on getting a length of 0.

?? is the nullish coalescing operator. Think of it as || but more specific. || evaluates its right-hand side whenever the left-hand side is falsy, whereas ?? does so only when the left-hand side is undefined or null.

Keep in mind that ?.length ?? 0 returns the same thing as ?.length || 0 here. The only difference is that when length is 0, ?? does not evaluate the right-hand side, but the result is 0 either way.

Honestly, in this situation I would probably change it to || because more JavaScript developers are familiar with that operator. Maybe someone could enlighten me on benefits of ?? vs || in this situation, if any exist.
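
As far as I can tell, the two operators cannot differ in this particular function: match() returns null rather than an empty array when nothing matches, so the left-hand side is either undefined or a positive number. The difference only appears when a falsy value such as 0 or '' is meaningful on the left. A small sketch (the names below are illustrative only):

```javascript
const matchCountNullish = (str, re) => str?.match(re)?.length ?? 0;
const matchCountOr = (str, re) => str?.match(re)?.length || 0;

// Identical results for counting, with or without matches.
console.log(matchCountNullish('foo bar baz', /\s/g), matchCountOr('foo bar baz', /\s/g)); // 2 2
console.log(matchCountNullish('foobarbaz', /\s/g), matchCountOr('foobarbaz', /\s/g));     // 0 0
console.log(matchCountNullish(undefined, /\s/g), matchCountOr(undefined, /\s/g));         // 0 0

// Where they do differ: a falsy but valid left-hand side.
const keptZero = 0 ?? 10;     // 0, because 0 is not null or undefined
const replacedZero = 0 || 10; // 10, because 0 is falsy
console.log(keptZero, replacedZero); // 0 10
```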

Lastly, I changed the signature so the function can be used for any regex.

Oh, and here is a TypeScript version:

    const matchCount = (str: string | null | undefined, re: RegExp) => {
      return str?.match(re)?.length ?? 0;
    };
Daniel Kaplan
3

('my string'.match(/\s/g) || []).length;

Weston Ganger
  • 1
    I think you put the `|| []` in the wrong place, it should be `('my string'.match(/\s/g) || []).length` – woojoo666 Apr 09 '19 at 07:55
2

This is certainly something that has a lot of traps. While working with Paolo Bergantino's answer, I realised that even it has some limitations. I found that string representations of dates are a good place to quickly find some of the main problems. Start with an input string like this: '12-2-2019 5:1:48.670'

and set up Paolo's function like this:

function count(re, str) {
    if (typeof re !== "string") {
        return 0;
    }
    re = (re === '.') ? ('\\' + re) : re;
    var cre = new RegExp(re, 'g');
    return ((str || '').match(cre) || []).length;
}

I wanted the regular expression to be passed in so that the function is more reusable. I also wanted the parameter to be a string, so that the client doesn't have to build the regex but can simply match on a string, like a standard string utility method.

Now, here you can see that I'm dealing with issues with the input. With the following:

if (typeof re !== "string") {
    return 0;
}

I am ensuring that the input isn't anything like the literal 0, false, undefined, or null, none of which are strings. Since these literals are not in the input string, there should be no matches, but it should match '0', which is a string.

With the following:

re = (re === '.') ? ('\\' + re) : re;

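I am dealing with the fact that the RegExp constructor will (I think, wrongly) interpret the string '.' as the match-any-character metacharacter rather than a literal dot, so I escape it as '\\.'. Note that this check only covers a pattern that is exactly '.'; any other metacharacter, or a dot inside a longer string, is passed through unescaped.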

Finally, because I am using the RegExp constructor, I need to give it the global 'g' flag so that it counts all matches, not just the first one, similar to the suggestions in other posts.
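
The '.' special case above could be generalised by escaping every regex metacharacter before building the RegExp. The following is only a sketch under that assumption: escapeRegExp and countLiteral are hypothetical names, and returning 0 for an empty search string is my own design choice (an empty pattern would otherwise match at every position).

```javascript
// Hypothetical helper: backslash-escape all regex metacharacters in a literal string.
function escapeRegExp(literal) {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical variant of count(): treats `needle` as plain text, never as a pattern.
function countLiteral(needle, str) {
  if (typeof needle !== 'string' || needle === '') {
    return 0;
  }
  const re = new RegExp(escapeRegExp(needle), 'g');
  return ((str || '').match(re) || []).length;
}

const stamp = '12-2-2019 5:1:48.670';
console.log(countLiteral('.', stamp)); // 1
console.log(countLiteral('-', stamp)); // 2
console.log(countLiteral(':', stamp)); // 2
```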

I realise that this is an extremely late answer, but it might be helpful to someone stumbling along here. BTW here's the TypeScript version:

function count(re: string, str: string): number {
    if (typeof re !== 'string') {
        return 0;
    }
    re = (re === '.') ? ('\\' + re) : re;
    const cre = new RegExp(re, 'g');    
    return ((str || '').match(cre) || []).length;
}
Michael Coxon
2

This seems well solved, but I hadn't seen this version, which might be a bit more readable and in keeping with the style of some codebases:

const numberOfResults = [...str.matchAll(/YOUR_REGEX/g)].length;
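
Two things are worth knowing about this version: matchAll() throws a TypeError if the regex lacks the g flag, and it returns an empty iterator (not null) when nothing matches, so no `|| []` guard is needed. A small sketch, where countAll is a hypothetical wrapper that adds the flag when it is missing:

```javascript
// Hypothetical wrapper: counts matches via matchAll(), adding the g flag if needed.
function countAll(str, pattern) {
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  return [...str.matchAll(re)].length;
}

console.log(countAll('foo bar baz', /\s/)); // 2
console.log(countAll('foobarbaz', /\s/g));  // 0

// Calling matchAll() directly with a non-global regex throws.
let threwTypeError = false;
try {
  'foo bar'.matchAll(/\s/);
} catch (e) {
  threwTypeError = e instanceof TypeError;
}
console.log(threwTypeError); // true
```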
Will
0

Using modern syntax avoids the need to create a dummy array to count length 0

const countMatches = (exp, str) => str.match(exp)?.length ?? 0;

exp must be a RegExp with the g flag (without it, match() returns only the first match plus any capture groups), and str must be a string.

Blindman67
-2

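How about this?

function isint(str) {
    // match() returns null when there are no digits, so fall back to an empty array
    var digits = str.match(/\d/g) || [];
    // every character must be a digit, and the string must not be empty
    return str.length > 0 && digits.length === str.length;
}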
Robert
anders