1

I'm trying to write a program where the user gives the system a word and a paragraph, and the system's job is to count how many times that word pops up in the paragraph.

How can I count how many times the word pops up in C#?

marc_s
Mrsoldier3201
  • possible duplicate of [How would you count occurrences of a string within a string?](http://stackoverflow.com/questions/541954/how-would-you-count-occurrences-of-a-string-within-a-string) – mbdavis Feb 14 '15 at 20:13
  • It's not a duplicate of that question - this is for counting a word in an [English] text paragraph (and not a *single character*) which has different implications and solutions. – user2864740 Feb 14 '15 at 21:06

3 Answers

4

Use a regular expression with word-boundary anchors (\b) so that only whole words are counted:

int wordCount = Regex.Matches(text, "\\b" + Regex.Escape(searchTerm) + "\\b", RegexOptions.IgnoreCase).Count;
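
For context, here is a minimal self-contained sketch of the same one-liner. The class name RegexWordCounter, the method name Count, and the sample text are made up for illustration; only Regex.Matches, Regex.Escape and RegexOptions.IgnoreCase come from the line above.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexWordCounter
{
    // Counts whole-word, case-insensitive occurrences of searchTerm in text.
    public static int Count(string text, string searchTerm)
    {
        // Regex.Escape keeps characters such as '.' or '+' in the term literal.
        string pattern = @"\b" + Regex.Escape(searchTerm) + @"\b";
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    static void Main()
    {
        // Hypothetical sample text: "database" must not be counted.
        string text = "Data types differ. The data (raw data) feeds the database.";
        Console.WriteLine(Count(text, "data")); // prints 3
    }
}
```

One caveat: \b only matches between a word character and a non-word character, so a search term that starts or ends with a symbol (for example "C#") will probably need a different boundary check.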
Stipo
1

https://msdn.microsoft.com/en-us/library/bb546166.aspx

As the article says: "There is a performance cost to the Split method. If the only operation on the string is to count the words, you should consider using the Matches or IndexOf methods instead."

So if performance is an issue, you could use a while loop with IndexOf and a counter. For reference, here is the Split-based example:

using System;
using System.Linq;

class CountWords
{
    static void Main()
    {
        string text = @"Historically, the world of data and the world of objects" +
          @" have not been well integrated. Programmers work in C# or Visual Basic" +
          @" and also in SQL or XQuery. On the one side are concepts such as classes," +
          @" objects, fields, inheritance, and .NET Framework APIs. On the other side" +
          @" are tables, columns, rows, nodes, and separate languages for dealing with" +
          @" them. Data types often require translation between the two worlds; there are" +
          @" different standard functions. Because the object world has no notion of query, a" +
          @" query can only be represented as a string without compile-time type checking or" +
          @" IntelliSense support in the IDE. Transferring data from SQL tables or XML trees to" +
          @" objects in memory is often tedious and error-prone.";

        string searchTerm = "data";

        //Convert the string into an array of words 
        string[] source = text.Split(new char[] { '.', '?', '!', ' ', ';', ':', ',' }, StringSplitOptions.RemoveEmptyEntries);

        // Create the query.  Use ToLowerInvariant to match "data" and "Data"  
        var matchQuery = from word in source
                         where word.ToLowerInvariant() == searchTerm.ToLowerInvariant()
                         select word;

        // Count the matches, which executes the query. 
        int wordCount = matchQuery.Count();
        Console.WriteLine("{0} occurrence(s) of the search term \"{1}\" were found.", wordCount, searchTerm);

        // Keep console window open in debug mode
        Console.WriteLine("Press any key to exit");
        Console.ReadKey();
    }
}
/* Output:
   3 occurrence(s) of the search term "data" were found.
*/
*/
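
The while loop with IndexOf mentioned before the example is not spelled out in the answer, so the following is only a sketch of one way it could look. The class name IndexOfWordCounter, the method name Count, and the choice of char.IsLetterOrDigit as the word-break test are assumptions; the boundary check is there so that a hit inside a longer word (such as "database") is not counted.

```csharp
using System;

static class IndexOfWordCounter
{
    // Counts non-overlapping, case-insensitive occurrences of searchTerm
    // that are not directly preceded or followed by a letter or digit.
    public static int Count(string text, string searchTerm)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(searchTerm))
            return 0;

        int count = 0;
        int index = 0;
        while ((index = text.IndexOf(searchTerm, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            int end = index + searchTerm.Length;
            // Assumed definition of a word break: start/end of text or a non-alphanumeric neighbour.
            bool startsWord = index == 0 || !char.IsLetterOrDigit(text[index - 1]);
            bool endsWord = end == text.Length || !char.IsLetterOrDigit(text[end]);
            if (startsWord && endsWord)
                count++;
            index = end; // continue searching after this hit
        }
        return count;
    }

    static void Main()
    {
        // Hypothetical sample text: "database" and "notadata" must not be counted.
        string text = "Data types differ. The data (raw data) feeds the database; notadata too.";
        Console.WriteLine(Count(text, "data")); // prints 3
    }
}
```

Unlike the regex \b anchor, this check does not treat an underscore as part of a word, so the two approaches can disagree on text such as "data_set".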
prospector
  • The example you give uses the Split() method, even though the article says it is not the better option and points to the `Regex` `Matches()` method instead: https://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.matches.aspx – rducom Feb 14 '15 at 20:23
  • The Split method is just wrong, as it fails to match words in contexts such as `(word)`. IndexOf is wrong because it will match `notaword`. – user2864740 Feb 14 '15 at 21:05
  • @user2864740 I don't know how you came to that conclusion about indexof. – prospector Feb 14 '15 at 23:18
  • @prospector Using a simple indexOf, as shown in a deleted answer (and hinted in this answer without even an implementation), does not correctly account for word breaks. So my "conclusion" is based around an equally inadequate implied usage of indexOf. – user2864740 Feb 15 '15 at 18:19
0
String test = "the full full :full? text !!! ";
String search = "full";
int count = String.Concat(test.Select(i => Char.IsPunctuation(i) ? ' ' : i))
                  .Split(' ').Where(i => i == search).Count();

This will:

  • Replace every punctuation character with a space by checking each character (test.Select), then join the characters back into a single string (String.Concat)
  • Split the string into substrings delimited by spaces (.Split)
  • Keep only the substrings that equal the search string (.Where); the comparison is case-sensitive
  • Count them (.Count())
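
The steps above can also be wrapped in a reusable method. This is only a sketch: the class name SplitWordCounter and the method name Count are made up, and the case-insensitive comparison and RemoveEmptyEntries option are additions that the original snippet does not have.

```csharp
using System;
using System.Linq;

static class SplitWordCounter
{
    public static int Count(string text, string searchTerm)
    {
        // Step 1: turn punctuation into spaces so "(full)" becomes " full ".
        string cleaned = string.Concat(text.Select(c => char.IsPunctuation(c) ? ' ' : c));

        // Steps 2-4: split on spaces, keep the matching words, count them.
        return cleaned
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            .Count(word => string.Equals(word, searchTerm, StringComparison.OrdinalIgnoreCase));
    }

    static void Main()
    {
        Console.WriteLine(Count("the full full :full? (Full) text !!! ", "full")); // prints 4
    }
}
```

Note that this sketch splits on spaces only, so words separated by tabs or line breaks would need extra separators.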
Alpha
rducom