14

I want to count the words and spaces in a string. The string looks like this:

Command do something ptuf(123) and bo(1).ctq[5] v:0,

This is what I have so far:

int count = 0;
string mystring = "Command do something ptuf(123) and bo(1).ctq[5] v:0,";
foreach (char c in mystring)
{
    if (char.IsLetter(c))
    {
        count++;
    }
}

What should I do to count the spaces as well?

Michal_LFC

10 Answers

38
int countSpaces = mystring.Count(Char.IsWhiteSpace); // 6
int countWords = mystring.Split().Length; // 7

Note that both use Char.IsWhiteSpace, which treats characters other than " " as white-space too (such as newline or tab). Have a look at the remarks section of its documentation to see exactly which ones.

Tim Schmelter
  • Lol... if only I had understood LINQ at the time I wrote those methods of mine. By the way, does it drop the empty words when there are two consecutive spaces? – Daniel Möller Jul 23 '13 at 14:15
  • @Daniel: No, that would include "ghost" words if two consecutive spaces existed. – Brad Christie Jul 23 '13 at 14:15
  • But doesn't that split need a " " as parameter? – Daniel Möller Jul 23 '13 at 14:19
  • @Daniel: No, if you don't provide splitting characters, white-spaces are assumed. – Tim Schmelter Jul 23 '13 at 14:25
  • The only issue with this solution is when there are multiple white-spaces in a row (not sure whether this is really a problem in your situation). In that case, I believe it would affect countWords. – user2366842 Jul 23 '13 at 15:11
  • @user2366842: you could use `mystring.Split().Count(word => !String.IsNullOrEmpty(word));` or the `Split` overload with `StringSplitOptions.RemoveEmptyEntries`. The problem with the latter is that you cannot use the implicit `Char.IsWhiteSpace` test anymore, which also splits by other characters like `Environment.NewLine` or tab. That's why I would prefer the first approach even if it's not as efficient. – Tim Schmelter Jul 23 '13 at 15:20
  • You can still get the `char.IsWhiteSpace` behavior when you use the overload that takes `StringSplitOptions` if you explicitly pass either `null` or an empty array, like `mystring.Split(new char[0], StringSplitOptions.RemoveEmptyEntries).Length;` – Quartermeister Jul 23 '13 at 19:24
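The comments above can be sketched as follows. This is a minimal illustration, not code from the answer: the class name, method names and sample string are hypothetical. It passes a `null` separator array so that `Split` falls back to white-space splitting, and uses `StringSplitOptions.RemoveEmptyEntries` to drop the "ghost" entries.

```csharp
using System;
using System.Linq;

static class WhitespaceSplitDemo
{
    // Counts every character that char.IsWhiteSpace accepts (space, tab, newline, ...).
    public static int CountWhiteSpace(string text)
    {
        return text.Count(char.IsWhiteSpace);
    }

    // A null separator array makes Split use white-space as the delimiter;
    // RemoveEmptyEntries drops the empty strings produced by consecutive,
    // leading or trailing white-space.
    public static int CountWords(string text)
    {
        return text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    static void Main()
    {
        string sample = "Command  do\tsomething ";    // hypothetical input
        Console.WriteLine(sample.Split().Length);     // 5: includes two empty "ghost" entries
        Console.WriteLine(CountWords(sample));        // 3
        Console.WriteLine(CountWhiteSpace(sample));   // 4
    }
}
```

The cast to `char[]` is there because a bare `null` would be ambiguous between the `char[]` and `string[]` overloads of `Split`.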
2

You can use string.Split with a space: http://msdn.microsoft.com/en-us/library/system.string.split.aspx

Split returns a string array. The number of elements is the number of words, and the number of spaces is the number of words minus 1 (assuming the words are separated by single spaces and the string has no leading or trailing space).

asafrob
2

If you want to count spaces, you can use LINQ:

int count = mystring.Count(s => s == ' ');
Jonesopolis
1

Here's a method using regex. Just something else to consider. It is better if you have long strings with lots of different types of whitespace. Similar to Microsoft Word's WordCount.

var str = "Command do something ptuf(123) and bo(1).ctq[5] v:0,";
int count = Regex.Matches(str, @"[\S]+").Count; // count is 7

For comparison,

var str = "Command     do    something     ptuf(123) and bo(1).ctq[5] v:0,";

str.Count(char.IsWhiteSpace) is 17, while the regex count is still 7.
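For illustration, the same regex approach can cover both counts from the question. This is only a sketch: the class and method names are made up here, and `\s` is used as the regex counterpart of the white-space test.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexCountDemo
{
    // \S+ matches each run of non-white-space characters, i.e. one "word".
    public static int CountWords(string text)
    {
        return Regex.Matches(text, @"\S+").Count;
    }

    // \s matches one white-space character at a time.
    public static int CountWhiteSpace(string text)
    {
        return Regex.Matches(text, @"\s").Count;
    }

    static void Main()
    {
        string str = "Command     do    something     ptuf(123) and bo(1).ctq[5] v:0,";
        Console.WriteLine(CountWords(str));       // 7
        Console.WriteLine(CountWhiteSpace(str));  // 17
    }
}
```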

Gray
  • _Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems._ – Brad Christie Jul 23 '13 at 14:18
  • @BradChristie I feel like you are misusing that quote. I am not writing some kind of parser here. It just matches non-whitespace text, and then you get the count. It doesn't count consecutive spaces and you don't need to clean up the collection when you are done. The expression itself is about as basic as you can get. It isn't difficult to understand or maintain. I don't like feeling like I am doing something the wrong way, so could you explain why you think that quote applies to this solution? – Gray Jul 23 '13 at 14:40
  • Regex is useful in certain scenarios; as long as it's kept simple (this one is) and is easily maintainable (seems to be), it should be fine. It's when you start overdoing it and writing a 15-line regex to filter out your email addresses that you start having issues. – user2366842 Jul 23 '13 at 15:04
  • I guess I was just implying that regex is overkill for "simple" string manipulation. There's no real pattern processing here, complex interrogation or elaborate algorithm. The string libraries already have implementations that take care of this. With that said, does it make sense to use `" foo ".Trim()` or `Regex.Replace(" foo ", @"^\s+|\s+$", String.Empty)`? By your definition, the latter is simple and easy to maintain, but it is (imho) complete overkill given `Trim` exists. – Brad Christie Jul 23 '13 at 15:05
  • @BradChristie I understand that trim() is clearly easier to understand/faster/etc there, but consider the fact that the selected answer doesn't even account for spaces back to back, and your answer doesn't account for any whitespace except spaces. It may be that this application only requires what you posted or what the selected answer does, but the regex solution seems more robust to me - not perfect, but closer than either. – Gray Jul 23 '13 at 15:22
  • Can you point out where the OP explicitly listed the requirement as whitespace? Because I can't. Everyone else took the leap of faith that the OP implied it, but I never see him/her use any other term than "space" in the entire question. Also, I can easily modify the `char[]` parameter to accept the multitude of other acceptable whitespace characters, but I don't see the need given the question. – Brad Christie Jul 23 '13 at 15:27
  • If I asked how to make "the quick brown fox" uppercase, are you going to just provide `ToUpper` or would you give me an elaborate library that handled unicode characters as well (on the implied premise I needed it)? – Brad Christie Jul 23 '13 at 15:29
  • I'll concede that the op did not ask for that. You know better than me that sometimes people ask for the wrong thing, and you sometimes have to guide them in a different direction. I know this isn't meant for long discussions. I said what I needed to say, and it was that I do not think my use of regex was as egregious as you implied. Thanks for your opinion - I did not mean to imply your answer was incorrect or anything like that, just an explanation for the (seemingly) snarky criticism of my answer. I understand your point, and I think you know what I am saying. – Gray Jul 23 '13 at 15:32
1

This will take into account:

  • Strings starting or ending with a space.
  • Double/triple/... spaces.

This assumes that the only word separators are spaces and that your string is not null.

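private static int CountWords(string S)
{
    // Trim first so that a string made only of spaces counts as zero words.
    S = S.Trim();
    if (S.Length == 0)
        return 0;

    // Collapse runs of spaces into a single space.
    while (S.Contains("  "))
        S = S.Replace("  ", " ");
    return S.Split(' ').Length;
}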

Note: the while loop can also be done with a regex; see "How do I replace multiple spaces with a single space in C#?"
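A minimal sketch of that regex variant, under the same assumptions (spaces are the only separators, the string is not null). The class name is hypothetical and the pattern is just one way to write it. It trims before the empty check so that a string made only of spaces yields 0.

```csharp
using System;
using System.Text.RegularExpressions;

static class CollapseSpacesDemo
{
    public static int CountWords(string s)
    {
        // Remove leading and trailing spaces, then bail out on empty input.
        s = s.Trim(' ');
        if (s.Length == 0)
            return 0;

        // " {2,}" matches any run of two or more spaces; replace each run with one space.
        s = Regex.Replace(s, " {2,}", " ");
        return s.Split(' ').Length;
    }

    static void Main()
    {
        Console.WriteLine(CountWords("  Command   do  something "));   // 3
        Console.WriteLine(CountWords("    "));                         // 0
    }
}
```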

TTT
  • You know `.Split` has an overload which accepts `StringSplitOptions`, which can specify `RemoveEmptyEntries`; the `.Trim()` & `.Replace` are superfluous. – Brad Christie Jul 23 '13 at 15:10
0

I've got some ready-made code to get a list of words in a string (these are extension methods, so they must be in a static class):

    /// <summary>
    /// Gets a list of words in the text. A word is any string sequence between two separators.
    /// No word is added if separators are consecutive (would mean zero length words).
    /// </summary>
    public static List<string> GetWords(this string Text, char WordSeparator)
    {
        List<int> SeparatorIndices = Text.IndicesOf(WordSeparator.ToString(), true);

        int LastIndexNext = 0;


        List<string> Result = new List<string>();
        foreach (int index in SeparatorIndices)
        {
            int WordLen = index - LastIndexNext;
            if (WordLen > 0)
            {
                Result.Add(Text.Substring(LastIndexNext, WordLen));
            }
            LastIndexNext = index + 1;
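        }

        // The loop only adds words that are followed by a separator,
        // so add the trailing word (if any) here.
        if (LastIndexNext < Text.Length)
        {
            Result.Add(Text.Substring(LastIndexNext));
        }

        return Result;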
    }

    /// <summary>
    /// returns all indices of the occurrences of a passed string in this string.
    /// </summary>
    public static List<int> IndicesOf(this string Text, string ToFind, bool IgnoreCase)
    {
        int Index = -1;
        List<int> Result = new List<int>();

        string T, F;

        if (IgnoreCase)
        {
            T = Text.ToUpperInvariant();
            F = ToFind.ToUpperInvariant();
        }
        else
        {
            T = Text;
            F = ToFind;
        }


        do
        {
            Index = T.IndexOf(F, Index + 1);
            Result.Add(Index);
        }
        while (Index != -1);

        Result.RemoveAt(Result.Count - 1);

        return Result;
    }


    /// <summary>
    /// Returns a copy of the array with every string converted to uppercase (invariant culture).
    /// </summary>
    public static string[] ToUpperAll(this string[] Strings)
    {
        string[] Result = new string[Strings.Length];
        // A plain loop is used here because ForEachIndex is not a framework method.
        for (int i = 0; i < Strings.Length; i++)
        {
            Result[i] = Strings[i].ToUpperInvariant();
        }
        return Result;
    }
Daniel Möller
0

In addition to Tim's entry, in case you have padding on either side, or multiple spaces beside each other:

Int32 words = somestring.Split(           // your string
    new[]{ ' ' },                         // break apart by spaces
    StringSplitOptions.RemoveEmptyEntries // remove empties (double spaces)
).Length;                                 // number of "words" remaining
Brad Christie
0
using System;

namespace Application
{
    class classname
    {
        static void Main(string[] args)
        {
            string name = "I am the student";
            // Splitting on single spaces yields one array element per word.
            int count = name.Split(' ').Length;
            Console.WriteLine("The count is " + count);
            Console.ReadLine();
        }
    }
}
David
0

If you only need the number of spaces, try this:

string myString = "I Love Programming";
var strArray = myString.Split(new char[] { ' ' });
int countSpace = strArray.Length - 1;
0

How about indirectly?

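// str is the input string.
int countn = 0, countt = 0, count = 0;

foreach (char c in str)
{
    countt++;                      // total characters
    if (!char.IsWhiteSpace(c))
    {
        countn++;                  // non-white-space characters
    }
}
// Subtracting only letters (char.IsLetter) would also count digits and
// punctuation as spaces, so subtract every non-white-space character instead.
count = countt - countn;
Console.WriteLine("No. of spaces are: " + count);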
Josef