
I am trying to read every word of a file into a map in C++, but the program freezes whenever a word starts with a special character. Words with a special character at the end work fine.

I haven't been able to find proper documentation for the >> operator in C++, nor have I managed to search for my problem effectively.

    //Map values and find max value
    //The code works for all words except the ones that start with special characters

    while(myFile >> cWord){

        //put the characters into a string

        //DEBUG: cout << "real word: " << cWord << " | ";

        cWord = stripWord(cWord);

        //delete common words before they're in the system

        if(cWord == "a" ||
           cWord == "an" ||
           cWord == "and" ||
           cWord == "in" ||
           cWord == "is" ||
           cWord == "it" ||
           cWord == "the"){
            continue;
        }

        if (wordMap.count(cWord) == 0){
            wordMap.insert({cWord, 1});
        }
        else{
            wordMap[cWord]++;
            if(wordMap[cWord] > maxWordRep){
                maxWordRep = wordMap[cWord];
            }
        }
        //DEBUG: cout << cWord << " | " << wordMap[cWord] << endl;

    }

I expect the debug output to print every word and the rest of the program to run afterwards, but execution stops and freezes at exactly this line:

while(myFile >> cWord)

My input files are long song lyrics. Here are the words the program froze on:

Counting Stars: Completed.

I can make your hands clap: Stuck at 'cause

One more night: Stuck at (yeah

Run on test (a file to test combined words): Completed

Safety Dance: Stuck at 'em

Shake it off: Stuck at "oh

Many other files follow the same pattern: the word it gets stuck on always has one or more special characters in front. You can try it yourself; cin >> string gets stuck when you input a string with a special character in front.

Final Edit: The bug was in the stripWord function, so this question is just a bad question.

Kayra
  • What kind of special characters are you talking about? Can you give an example? – eesiraed Apr 29 '19 at 04:39
  • Assuming that `wordMap` is something like `std::map`, then you don't need the if-else, just do `wordMap[cWord]++`. If the key doesn't exist in the map, it will be inserted with a value-initialized data value, which is zero. – Some programmer dude Apr 29 '19 at 04:40
  • And please read about [how to ask good questions](http://stackoverflow.com/help/how-to-ask), as well as [this question checklist](https://codeblog.jonskeet.uk/2012/11/24/stack-overflow-question-checklist/). – Some programmer dude Apr 29 '19 at 05:17
  • @FeiXiang ,./;'[]\<>?:"{}|!@#$%^&*()_+ and the like. – Kayra Apr 29 '19 at 05:21
  • Those are *normal* characters when reading into a string. Perhaps the problem isn't what (and where) you think it is? Have you tried to step through the program, line by line, in a debugger? Where does the "freezing" happen? And please edit your question to show us a small example of the input file (including input that causes your problem). And of course, if possible, please try to create a [mcve] to show us. – Some programmer dude Apr 29 '19 at 05:28
  • @Someprogrammerdude Well, you can see in my code where I had my debug lines. My input files are long song lyrics. Here are the words the program froze on: | Counting Stars: Completed. I can make your hands clap: Stuck at | 'cause | One more night: Stuck at | (yeah | Run on test (a file to test combined words): Completed Safety Dance: Stuck at | 'em | Shake it off: Stuck at | "oh | There are a bunch of others that follow the same pattern. Always 1 special character in front. You can try on your own, and cin >> string will get stuck when you input a string with a special character in front. – Kayra Apr 29 '19 at 06:37
  • Please edit your question to include this information. Crucial information (like file contents) should be inside the question itself and not in comments. It's also easier to format such input like it actually is. – Some programmer dude Apr 29 '19 at 06:39
  • @Someprogrammerdude I added it, thanks – Kayra Apr 29 '19 at 07:07
  • There's just nothing in the code you show, or with the words you mention, that should cause the behavior you claim. Without *actual* input and a proper [mcve] (which both replicates the behavior) there's really nothing more we can do but guess wildly. – Some programmer dude Apr 29 '19 at 07:11

1 Answer


This code:

while(myFile >> cWord)

The >> operator returns a std::istream&, and so the operator called here is: http://www.cplusplus.com/reference/ios/ios/operator_bool/

Notice it says it is looking for a failbit to be set on the istream? Reading to the end of a file is NOT an error, so really you should be checking to see if you've hit the end of a file, e.g.

while(!myFile.eof())
{
  myFile >> cWord;

  /* snip */
}

If the file has a bunch of pointless whitespace at the end of the file, you might end up reading an empty string at the end of the file, which should also be taken care of, e.g.

while(!myFile.eof())
{
  myFile >> cWord;

  if(cWord.empty()) break;

  /* snip */
}

The rest of the code (assuming it's bug-free) should be fine.

robthebloke
  • [Why is iostream::eof inside a loop condition considered wrong?](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong) – Some programmer dude Apr 29 '19 at 04:41
  • Yes. That's exactly why he has a problem. When there are characters after the last text read (space, tab, etc), then the final read will cause the failbit to be set (because those characters failed to be read into a std::string). The call to operator bool only reports the failbit or badbit, and never eofbit. HOWEVER, when the last read ends on eof (i.e. no whitespace and end of file), the eofbit will be set (and the next read will simply hang whilst waiting for the file to be appended to). To be safe, yes you do need to check for eof in the loop. This is explained in the docs for istream. – robthebloke Apr 29 '19 at 04:57
  • This didn't fix the problem. The code works completely on files that have no symbols, including files with whitespaces or tabs. The link @Someprogrammerdude provided suggests the opposite of what you're recommending. – Kayra Apr 29 '19 at 05:24
  • Please read [Why is iostream::eof inside a loop condition considered wrong?](https://stackoverflow.com/questions/5605125/why-is-iostreameof-inside-a-loop-condition-considered-wrong). `while (std::cin >> word)` is the correct idiom. What you suggest in your answer is considered wrong and is prone to failing or ending in infinite loops. – Scheff's Cat Apr 29 '19 at 05:44
  • As shown in the linked question, the OP is reading the file correctly and you're not. Reading to the end of a file will set the failbit if no characters are extracted as a result of reaching the EOF. – eesiraed Apr 30 '19 at 01:43