I have come across a lot of word-counting examples (like the one in the link below):

Counting words in a string - c programming

if(str[i]==' ')
{
    i++;
}

and the check for a digit is:

if(str[i]>='0' && str[i]<='9')
{
    i++;
}

But what if the input is 'I have 12 apples.' and I want the output to show only "word count = 3"?

gsamaras
BEX
  • You'll need to [tokenise](https://stackoverflow.com/questions/266357/tokenizing-strings-in-c) the input, then count how many tokens are "words" (or at least consist entirely of alphabetical characters, or whatever else your classification strategy is). – hnefatl Sep 12 '17 at 09:16
  • You could have a look at `strtok` – TDk Sep 12 '17 at 09:16
  • What if the word starts with a number, like `123hello`, or contains a number (`he123llo`)? Should it be counted then or not? – vgru Sep 12 '17 at 09:16
  • Then you need to think about it. And BTW, neither of the two conditions above is good for what they claim to do. Consider double spaces and whatnot. – StoryTeller - Unslander Monica Sep 12 '17 at 09:19
  • You don't need to explicitly tokenise the input. A char-by-char state machine would also be fine. – Martin James Sep 12 '17 at 09:36
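
For illustration, here is a rough sketch of the char-by-char state machine suggested in the last comment. It is not taken from any of the answers. It assumes that every non-alphanumeric character separates tokens and that a token counts as a word only if it consists entirely of letters (so `123hello` and `he123llo` are not counted); the name `count_words_fsm` and the state names are made up for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts tokens made up only of letters.
   Assumption: any non-alphanumeric character ends the current token. */
size_t count_words_fsm(const char *s)
{
    enum { OUTSIDE, IN_WORD, IN_MIXED } state = OUTSIDE;
    size_t count = 0;

    for (;; ++s) {
        unsigned char c = (unsigned char)*s;

        if (isalpha(c)) {
            /* A letter starts a word, or continues whatever token we are in. */
            if (state == OUTSIDE)
                state = IN_WORD;
        } else if (isdigit(c)) {
            /* A digit anywhere in the token disqualifies it. */
            state = IN_MIXED;
        } else {
            /* Separator or terminator: the current token (if any) has ended. */
            if (state == IN_WORD)
                ++count;
            state = OUTSIDE;
            if (c == '\0')
                break;
        }
    }
    return count;
}
```

With these assumptions, `count_words_fsm("I have 12 apples.")` returns 3. The string is only read, so it also works on string literals.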

3 Answers


Assuming that you don't have words that mix letters and digits, like "foo12", you could combine your code snippets like this:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char str[] = "Bex 67 rep";
    size_t len = strlen(str);
    size_t i = 0;
    int count = 0;
    while(str[i] != '\0')
    {
        if(str[i] == ' ')
        {
            // Count this space only if a non-digit, non-space character follows it
            if(i + 1 < len && ! (str[i + 1] >= '0' && str[i + 1] <= '9') && str[i + 1] != ' ')
                count++;
        }
        i++;
    }
    // + 1 accounts for the first word, which has no space in front of it
    printf("Word count = %d\n", count + 1); // Word count = 2
    return 0;
}

where you loop over every character of the string and, whenever you find a space, you check (provided you are not at the last character of the string) that the next character is neither a digit nor a space. If that's the case, you can assume that the space you encountered is followed by a word, and thus increase count.

Notice, however, that sentences usually do not start with a space. This answer makes the extra assumption that the string begins with a word, so the number of words is one more than count.


In real life, use strtok() and check every token for its validity, since this approach is just for demonstration and should be considered a bad one.
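
A minimal sketch of that strtok() suggestion, assuming that a "valid" token is one made up only of letters. The helper names `is_word` and `count_words_strtok` and the delimiter set are choices made for this example, not something from the question.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical validity check: a token is a word if it is non-empty
   and every character is a letter. */
static int is_word(const char *tok)
{
    if (*tok == '\0')
        return 0;
    for (; *tok != '\0'; ++tok)
        if (!isalpha((unsigned char)*tok))
            return 0;
    return 1;
}

/* Counts the word tokens in buf.
   strtok() writes '\0' into buf, so buf must be a modifiable array,
   not a string literal. */
size_t count_words_strtok(char *buf)
{
    static const char delims[] = " \t\n,.;:!?";  /* assumed separators */
    size_t count = 0;

    for (char *tok = strtok(buf, delims); tok != NULL; tok = strtok(NULL, delims))
        if (is_word(tok))
            ++count;

    return count;
}
```

Called on `char s[] = "I have 12 apples.";` this returns 3, and on `char s[] = "Bex 67 rep";` it returns 2. Unlike the loop above, it does not depend on the string starting with a word.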

gsamaras
#include <stdio.h>
#include <stdlib.h>   /* strtoul */
#include <string.h>

int main(void)
{
    char str[] = "I have 12 apples";
    char *pch;
    char *end;
    unsigned long ul;
    int cnt = 0;

    pch = strtok(str, " ,.-");
    while (pch != NULL)
    {
        /* end == pch means strtoul() consumed no digits,
           i.e. the token does not start with a number */
        ul = strtoul(pch, &end, 0);
        printf("%lu\n", ul);
        if (end == pch)
            cnt++;
        pch = strtok(NULL, " ,.-");
    }
    printf("count is %d\n", cnt);
    return 0;
}

The string is split into tokens with strtok(), and each token that strtoul() does not parse as a number is counted as a word.

Rajeshkumar

My five cents.:)

#include <stdio.h>
#include <ctype.h>

size_t count_words( const char *s )
{
    size_t n = 0;

    const char *p = s;

    while ( 1 )
    {
        int pos = 0;

        sscanf( p, "%*[ \t]%n", &pos );
        p += pos;

        if ( sscanf( p, "%*s%n", &pos ) == EOF ) break;

        if ( isalpha( ( unsigned char )*p ) ) ++n;

        p += pos;
    }


    return n;
}

int main(void) 
{
    char s[] = "I have 12 apples";

    printf( "The number of words is %zu\n", count_words( s ) );

    return 0;
}

The program output is

The number of words is 3

And my advice is: do not use the standard function strtok for such a task. First of all, it cannot be applied to string literals. And it has the side effect of changing the original string. :)
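
For comparison, here is a sketch of a scan that leaves the string untouched, using the standard `strspn` and `strcspn` functions instead of `strtok`. It assumes a token is a word only if all its characters are letters; the name `count_words_span` and the delimiter set are hypothetical choices for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts all-letter tokens without modifying s,
   so it can be called on string literals. */
size_t count_words_span(const char *s)
{
    static const char delims[] = " \t\n,.;:!?";  /* assumed separators */
    size_t count = 0;

    for (;;) {
        s += strspn(s, delims);            /* skip leading delimiters */
        if (*s == '\0')
            break;

        size_t len = strcspn(s, delims);   /* length of the current token */

        /* The token counts as a word only if every character is a letter. */
        size_t i = 0;
        while (i < len && isalpha((unsigned char)s[i]))
            ++i;
        if (i == len)
            ++count;

        s += len;                          /* move past the token */
    }
    return count;
}
```

Here `count_words_span("I have 12 apples.")` returns 3, and a token such as `he123llo` is not counted.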

Vlad from Moscow