13

I am trying to count the number of words in a given string using the following code:

var t = document.getElementById('MSO_ContentTable').textContent;

if (t == undefined) {
  var total = document.getElementById('MSO_ContentTable').innerText;                
} else {
  var total = document.getElementById('MSO_ContentTable').textContent;        
}
countTotal = cword(total);   

function cword(w) {
  var count = 0;
  var words = w.split(" ");
  for (i = 0; i < words.length; i++) {
    // inner loop -- do the count
    if (words[i] != "") {
      count += 1;
    }
  }

  return (count);
}

In that code I am getting data from a div tag and sending it to the cword() function for counting. However, the return value is different in IE and Firefox. Is there any change required in the regular expression? One thing I noticed is that both browsers send the same string, so the problem must be inside the cword() function.
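One possible cause, sketched under the assumption that innerText and textContent return the same words but separated by different whitespace (newlines or tabs instead of single spaces): split(" ") only breaks on a literal space, so words separated by anything else are glued together. The function names below are hypothetical and only illustrate the difference.

```javascript
// Same approach as cword(): split on a single space character only
function cwordSingleSpace(w) {
  return w.split(" ").filter(function (word) {
    return word !== "";
  }).length;
}

// Variant that splits on any run of whitespace (spaces, tabs, newlines)
function cwordAnyWhitespace(w) {
  return w.split(/\s+/).filter(function (word) {
    return word !== "";
  }).length;
}

var spaced = "one two three";
var mixed = "one\ntwo\tthree"; // same words, different separators

console.log(cwordSingleSpace(spaced));  // 3
console.log(cwordSingleSpace(mixed));   // 1, the newline and tab are not split points
console.log(cwordAnyWhitespace(mixed)); // 3
```

If the two browsers really do hand over strings that differ only in whitespace, the second variant should give the same count for both.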

Sebastian Zartner
V_B
  • I'm not sure what your question is, but what is the check against an empty string for? – kinakuta Jul 01 '11 at 05:33
  • In short, I call the function cword() with a paragraph of text as an argument, but the return value is different in FF and IE – V_B Jul 01 '11 at 05:36
  • What's an example string that's giving you different results between browsers? – kinakuta Jul 01 '11 at 05:38
  • When I use only that much of the string it gives me the correct result, but adding some line breaks and spaces changes the result. Below is my string: Welcome to your wiki library! You can get started and add content to this page by clicking Edit at the top of this page, or you can learn more about wiki libraries by clicking [[How To Use This Library]]. What is a wiki library? Wikiwiki means quick in Hawaiian. A wiki library is a document library in which users can easily edit any page. – V_B Jul 01 '11 at 05:52

5 Answers

22

[edit 2022, based on comment] Nowadays, one would not extend the native prototype this way. A way to extend the native prototype without the danger of naming conflicts is to use the ES2015 Symbol type. Here is an example of a wordcounter using that.
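A minimal sketch of that idea (not the author's original example), assuming a hypothetical symbol named `countWords` and using a `\S+` match rather than the split shown in the old answer:

```javascript
// Hypothetical symbol key: it cannot collide with any string-named property
// that another script or a future language version adds to String.prototype
const countWords = Symbol('countWords');

Object.defineProperty(String.prototype, countWords, {
  value: function () {
    // match() returns null when there is no run of non-whitespace characters
    const words = this.match(/\S+/g);
    return words ? words.length : 0;
  },
});

console.log('this string has five words'[countWords]()); // 5
console.log(''[countWords]());                           // 0
```

Only code that holds a reference to the symbol can call the method, which is what avoids the naming conflict.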

Old answer: you can use split and add a wordcounter to the String prototype:

if (!String.prototype.countWords) {
  String.prototype.countWords = function() {
    return this.length && this.split(/\s+\b/).length || 0;
  };
}

console.log(`'this string has five words'.countWords() => ${
  'this string has five words'.countWords()}`);
console.log(`'this string has five words ... and counting'.countWords() => ${
  'this string has five words ... and counting'.countWords()}`);
console.log(`''.countWords() => ${''.countWords()}`);
KooiInc
14

I would prefer a RegEx only solution:

var str = "your long string with many words.";
// match() returns null when nothing matches, so fall back to an empty array
var wordCount = (str.match(/(\w+)/g) || []).length;
alert(wordCount); //6

The regex is

\w+    between one and unlimited word characters
/g     global - don't stop after the first match

The brackets create a capture group, but with the g flag match() returns an array of whole matches only, so the group is optional here. The length of that array is the word count.
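To illustrate what match() hands back when the g flag is set, here is a small sketch; the helper name `countWordTokens` is hypothetical. It shows that the capture group makes no difference to the result and that a string without word characters yields null rather than an empty array.

```javascript
const sample = 'your long string with many words.';

// With the g flag, match() returns the full matches; capture groups are ignored
const withGroup = sample.match(/(\w+)/g);
const withoutGroup = sample.match(/\w+/g);
console.log(withGroup.join('|') === withoutGroup.join('|')); // true

// No word characters at all: match() returns null, not []
console.log('...'.match(/\w+/g)); // null

// Hypothetical helper that guards against the null case
function countWordTokens(str) {
  const matches = str.match(/\w+/g);
  return matches ? matches.length : 0;
}

console.log(countWordTokens(sample)); // 6
console.log(countWordTokens('...'));  // 0
```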

DanielH
10

This is the best solution I've found:

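function wordCount(str) {
  // Match every run of non-whitespace characters
  var m = str.match(/[^\s]+/g);
  // match() returns null when there is no match at all
  return m ? m.length : 0;
}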

This inverts whitespace selection, which is better than \w+ because \w only matches the Latin letters, digits and _ (see http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.2.6)

If you're not careful with whitespace matching you'll count empty strings, strings with leading and trailing whitespace, and all whitespace strings as matches while this solution handles strings like ' ', ' a\t\t!\r\n#$%() d ' correctly (if you define 'correct' as 0 and 4).
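The difference between the two patterns can be seen in a short comparison. This is only a sketch; the function names and the accented sample string are made up for illustration.

```javascript
// Counts runs of word characters: [A-Za-z0-9_] only
function countByWordChars(str) {
  const m = str.match(/\w+/g);
  return m ? m.length : 0;
}

// Counts runs of non-whitespace characters
function countByNonWhitespace(str) {
  const m = str.match(/\S+/g);
  return m ? m.length : 0;
}

// "naïve café", written with escapes so the file encoding does not matter
const accented = 'na\u00EFve caf\u00E9';

console.log(countByWordChars(accented));     // 3 ("na", "ve", "caf")
console.log(countByNonWhitespace(accented)); // 2

// The edge cases mentioned above
console.log(countByNonWhitespace(' '));                     // 0
console.log(countByNonWhitespace(' a\t\t!\r\n#$%() d '));   // 4
```

Note that the non-whitespace version counts punctuation-only tokens such as `!` as words, which matches the definition of "correct" used above.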

aaron
  • 1
    This answer is underrated IMO. – Dizzley Jan 05 '20 at 18:59
  • yeah, classic stackoverflow where the actual right answer is hidden by two incomplete answers with more votes and one clearly wrong answer that is accepted as correct – aaron Mar 02 '21 at 12:16
3

You can make clever use of the replace() method, although you are not replacing anything.

var str = "the very long text you have...";

var counter = 0;

// let's loop through the string and count the words
// (\S+ matches one run of non-whitespace characters; a quantified \b is not a valid pattern)
str.replace(/\S+/g, function (a) {
   // for each word found increase the counter value by 1
   counter++;
})

alert(counter);

The regex can be improved to exclude HTML tags, for example.
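One rough way to do that is to strip anything that looks like a tag before counting. The sketch below assumes simple, well-formed markup; a regex is not a real HTML parser, and the function name `countWordsIgnoringTags` is hypothetical.

```javascript
function countWordsIgnoringTags(html) {
  // Replace each <...> sequence with a space so adjacent words stay separated
  const text = html.replace(/<[^>]*>/g, ' ');
  // Count runs of non-whitespace characters in what is left
  const words = text.match(/\S+/g);
  return words ? words.length : 0;
}

console.log(countWordsIgnoringTags('<p>Hello <b>big</b> world</p>')); // 3
console.log(countWordsIgnoringTags('<br>'));                          // 0
```

In a browser, reading textContent from the element (as in the question) avoids the tags in the first place.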

Ibu
0
//Count words in a string or what appears as words :-)

        function countWordsString(string) {

            // Collapse each run of whitespace into one space and
            // drop leading/trailing whitespace so it is not counted
            string = string.replace(/\s+/g, ' ').trim();

            // An empty or all-whitespace string contains no words
            if (string === '') {
                return 0;
            }

            var counter = 1;

            // Each remaining space separates two words
            string.replace(/ /g, function (a) {
               counter++;
            });

            return counter;
        }


        var numberWords = countWordsString(string);
Roman