I'm a beginner at C++ (I took a couple of classes, then did no C++ for a while, and started back up several months later), and I'm trying to count the number of words in a simple sentence and then count the number of numbers in that same sentence. To count the words, I use:

int countWord(char *word)
{
    int counter = 0;
    char *words = strtok(word, " 0123456789-");

    while (words != NULL)
    {
        counter++;
        words = strtok(NULL, " 0123456789-");
    }
    return counter;
}

The number counter is basically the same, except that the delimiters are the letters of the alphabet instead of the digits:

char *num = strtok(number, " abcdefghijklmnopqrstuvwxyz");

My main is:

int main()
{
    char str[] = "what time is 88 it 99today";

    cout << "words = " << countWord(str) << " " << "numbers = " <<
    countNum(str) << endl;          

    system("pause");

    return 0;
}

When I run this it outputs: words = 3 numbers = 2.

When I rearrange main to:

    char str[] = "what time is 88 it 99today";

    cout << "words = " << countWord(str) << " ";
    cout << "numbers = " << countNum(str) << endl;

output is: words = 5 numbers = 0

Can anyone explain why this is incorrect? Also, if anyone can refer me to a text that covers this, I'd appreciate that. The text I learned from is "C++ Programming: Program Design Including Data Structures" by D.S. Malik. I didn't see any techniques in this book to count "words". Thank you.

Justin M.
  • Seems like this would be a lot easier if you used C++ strings instead of C strings. – MrEricSir Jan 25 '16 at 23:00
  • It looks more C than C++ to me. – Ely Jan 25 '16 at 23:00
  • just the literal meaning, that is, "word" (in this case) being "what", "time", "is", "it", "today". – Justin M. Jan 25 '16 at 23:02
  • I've yet to look at C. In the two classes I took, neither went over how to count words or numbers, just characters. When I looked for how to do this, I only found strtok(), and the explanation was limited (unless that limited explanation is all it needs). – Justin M. Jan 25 '16 at 23:06
  • @JustinMangawang You're not even programming in C++. This is more like C compiled with a C++ compiler. If you're going to do this in C++, use C++ libraries. – CinchBlue Jan 25 '16 at 23:11
  • @JustinMangawang See here: http://ideone.com/Gyjp7v – PaulMcKenzie Jan 25 '16 at 23:20
  • Thank you Paul, however, when I run your code, I'm getting the word count as 4 and the numbers as 1. I can't seem to get: words = 5 numbers = 2. – Justin M. Jan 25 '16 at 23:26
  • @JustinMangawang 99today is a word, isn't it? – erip Jan 25 '16 at 23:30
  • with 99today, I am trying to have 99 counted as one number and today as the word. With the countWord() function by itself, it counts 5 words, but when I add countNum(), it doesn't count all 5 words and never counts all of the numbers. – Justin M. Jan 25 '16 at 23:33
  • @JustinM. And the fix is to use `any_of` instead of `all_of`. http://ideone.com/uOwSlm All of this stuff with `strtok`, `strdup`, etc. is not necessary. And as to your description -- what if the "word" is `"1abc23xyz"`? How many words and numbers are there in that string? – PaulMcKenzie Jan 26 '16 at 09:51

2 Answers

The issue is that strtok marks the end of tokens in the original string by a null character. Citing from cppreference:

If such character was found, it is replaced by the null character '\0' and the pointer to the following character is stored in a static location for subsequent invocations.

Notes: This function is destructive: it writes the '\0' characters in the elements of the string str. In particular, a string literal cannot be used as the first argument of strtok.

In your case the line

cout << "words = " << countWord(str) << " " << "numbers = " <<
countNum(str) << endl;

is a composition of operator<<, like

...operator<<(operator<<(cout, "words"), countWord(str))...

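Before C++17, the order in which the operands countWord(str) and countNum(str) are evaluated in such a chain is unspecified, and your output shows that your compiler evaluated countNum(str) first and countWord(str) second. By the time countWord runs, strtok has already overwritten the space after "88" with '\0', so countWord only sees "what time is 88" and counts 3 words. This is in contrast to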

cout << "words = " << countWord(str) << " ";
cout << "numbers = " << countNum(str) << endl;

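where countWord(str) is guaranteed to run first. It finds all 5 words but leaves str cut off after "what", so countNum(str) then finds 0 numbers.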

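One solution is to give strtok a copy of the original string each time, e.g. pass strdup(str) instead of str as its first argument. Note that strdup is a POSIX function rather than standard C++, and the copy should be released with free() once you are done with it. Better yet, use standard C++ library features such as std::string and std::count_if. I'm sure there are plenty of word counting solutions around using pure C++.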

vsoftco
  • Thank you, I had been wondering if my error comes from my use of str with strtok() (as stated, I'm just learning on my own about strtok). I will research more into how to accomplish this. – Justin M. Jan 25 '16 at 23:30
  • @JustinMangawang You're welcome. A quick and dirty solution is to use `strtok(strdup(str))` every time, as you will work with a copy of your string. But try to learn proper C++ library features and use `std::string` in combination with algorithms instead. – vsoftco Jan 25 '16 at 23:37
  • The strdup() works. I'll definitely be looking into those features more. So far, from what I read, by having "using namespace std;", I neglected the use of "std::string", but I'm much more inclined to take your advice and the other advice here. – Justin M. Jan 25 '16 at 23:47
  • Please, please, please do not use "using namespace std". http://stackoverflow.com/questions/1452721/why-is-using-namespace-std-considered-bad-practice – Mailerdaimon Jan 26 '16 at 06:45
  • @Mailerdaimon ? I am not using any, it is just an example with code pasted from OP question. – vsoftco Jan 26 '16 at 07:47
  • @vsoftco The comment was meant as a direct reply to Justin's comment. I forgot the `@Justin M.`, sorry for the confusion. – Mailerdaimon Jan 26 '16 at 08:03

Vlad has submitted a nice answer for your C-style code. My answer is demonstrating use of more C++ libraries to help move things along:

#include <iostream>
#include <string>
#include <vector>
#include <regex>

int main() {
    // The main string.
    std::string str = "what time is 88 it 99today"; 
    // Space is your delimiter
    std::string delimiter = " ";
    // Create a regex string for matching numbers, including floating point.
    std::regex number_chars(std::string("[0123456789.]+"));

    // A lambda function to help tokenize strings.
    // Returns a vector of non-empty substring tokens.
    auto tokenizer = [](const std::string& s, const std::string& delimiter) {
        std::vector<std::string> tokens;
        std::size_t start = 0;
        std::size_t pos;

        // Cut at each delimiter; skip empty tokens from leading or repeated delimiters.
        while ((pos = s.find(delimiter, start)) != std::string::npos) {
            if (pos > start)
                tokens.push_back(s.substr(start, pos - start));
            start = pos + delimiter.length();
        }
        // Whatever follows the last delimiter is the final token.
        if (start < s.length())
            tokens.push_back(s.substr(start));
        return tokens;
    };

    // Apply the lambda.
    auto tokens = tokenizer(str, delimiter);

    // Output your tokens.
    for (const auto& token : tokens) {
        std::cout << token << "\n";
    }
    std::cout << "\n";

    // Output tokens that are numbers.
    for (const auto& token : tokens) {
        if (std::regex_match(token, number_chars)) {
            std::cout << "String: " << token << " is a number.\n";
        }
    }
    return 0;
}

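Since C++11 the standard library includes regular expressions (the <regex> header), so it makes sense to use them here. Note that std::regex_match only succeeds when the whole token matches, so for this input only "88" is reported as a number; "99today" is not.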

Coliru: http://coliru.stacked-crooked.com/a/43cd6711e1243f4a

CinchBlue
  • Thank you for your explanations. I'm going to study a lot of what you have put in here (such as the regex). This has introduced a lot of new material to me. – Justin M. Jan 25 '16 at 23:45
  • @JustinM. A good C++ guide would at least direct you to use the `std::string` class for string manipulation. You shouldn't be using C strings for most things (unless maybe you're working with Unicode? But there's libraries for that...). – CinchBlue Jan 26 '16 at 06:38