3

This small custom getline function was given as an answer to a question about handling different line endings.

The function worked great until it was edited 2 days ago to make it not skip leading white spaces for each line. However, after the edit, the program now goes into an infinite loop. The only change done to the code was this following line which was changed from this:

std::istream::sentry se(is);  // When this line is enabled, the program executes
                              // correctly (no infinite loop) but it does skip
                              // leading white spaces

to this:

std::istream::sentry se(is, true); // With this line enabled, the program goes 
                                   // into infinite loop inside the while loop  
                                   // of the main function.

Can someone please help me explain why the program loops infinitely if we specify to not skip white spaces?

Here is the full program...

std::istream& safeGetline(std::istream& is, std::string& t)
{
    t.clear();

    // The characters in the stream are read one-by-one using a std::streambuf.
    // That is faster than reading them one-by-one using the std::istream.
    // Code that uses streambuf this way must be guarded by a sentry object.
    // The sentry object performs various tasks,
    // such as thread synchronization and updating the stream state.

    std::istream::sentry se(is, true);
    std::streambuf* sb = is.rdbuf();

    for(;;) {
        int c = sb->sbumpc();
        switch (c) {
        case '\r':
            c = sb->sgetc();
            if(c == '\n')
                sb->sbumpc();
            return is;
        case '\n':
        case EOF:
            return is;
        default:
            t += (char)c;
        }
    }
}

And here is a test program:

int main()
{
    std::string path = "end_of_line_test.txt"

    std::ifstream ifs(path.c_str());
    if(!ifs) {
        std::cout << "Failed to open the file." << std::endl;
        return EXIT_FAILURE;
    }

    int n = 0;
    std::string t;
    while(safeGetline(ifs, t))   //<---- INFINITE LOOP happens here. <----
        std::cout << "\nLine " << ++n << ":" << t << std::endl;

    std::cout << "\nThe file contains " << n << " lines." << std::endl;
    return EXIT_SUCCESS;
}

I also tried to add this line at the very beginning of the function but it made no difference... the program still looped infinitely in the while loop of the main function.

is.setf(0, std::ios::skipws);

The file end_of_line_test.txt is a text file which contains only the following two lines:

   "1234" // A line with leading white spaces
"5678"    // A line without leading white spaces
Community
  • 1
  • 1
Tech_Guy
  • 53
  • 4
  • @templatetypedef I am using the latest version of the command line compiler for visual c++ running on windows xp 32-bit. If I change the sentry line to "std::istream::sentry se(is);" the program works, but if I change it back to "std::istream::sentry se(is, true);" the program crashes. In your test file, do you have leading spaces in any of the lines? – Tech_Guy Feb 08 '12 at 05:28
  • What specific crash are you getting? Can we get a stack trace? – templatetypedef Feb 08 '12 at 05:29
  • I changed the line inside the while loop in order to demonstrate the crash. I get an endless print outs showing the line number and the digit zero for the contents of t. I do not know how to do a stack trace (beginner, sorry). – Tech_Guy Feb 08 '12 at 05:34
  • @templatetypedef I'm very sorry, I'm getting an infinite loop. I will edit my question to make this correction. – Tech_Guy Feb 08 '12 at 05:38
  • No need to apologize! But thanks for making that clarification... it's really going to help people zero in on the bug. – templatetypedef Feb 08 '12 at 05:40
  • @templatetypedef do you not get an infinite loop if the sentry line with the "true" parameter is passed? If not, does your input file contain any lines with leading white spaces? – Tech_Guy Feb 08 '12 at 05:45
  • Can there be a case where the `safeGetline` never returns? Your question might probably be solved by using a debugger to step through the code. – Some programmer dude Feb 08 '12 at 07:17
  • 1
    I have updated the code in my original answer http://stackoverflow.com/questions/6089231/getting-std-ifstream-to-handle-lf-cr-and-crlf/6089413#6089413 Now everything works. – Johan Råde Dec 27 '12 at 09:12

1 Answers1

6

The problem is that safeGetLine never sets the eof() state for the stream.

When you use std::istream::sentry se(is);, it will try to read whitespace and discover that you are at end-of-file. When you ask it not to look for whitespace, this never happens.

I believe you should add is.setstate(ios_base::eofbit) to the EOF condition for the function.

Bo Persson
  • 90,663
  • 31
  • 146
  • 203
  • Thanks so much for your answer. I have added `is.setstate(ios_base::eofbit)` between the line `case EOF:` and `return is;` in the getline function. However, now the program stops after printing one line. Would I need to clear the stream state bits in the beginning of the function so that it executes everytime? Or is that dangerous? – Tech_Guy Feb 08 '12 at 19:47
  • You would have to separate the EOF case from the `'\n'` case, because they are (now) different. – Bo Persson Feb 08 '12 at 19:51
  • @Bo... THANK YOU SO MUCH, you answers solved the problem PERFECTLY!!!!!! I am now able to get the behavior I was looking for. – Tech_Guy Feb 08 '12 at 23:46
  • Excellent topic. I've been following the same post. @Bo, could you give an example of "separate the cases"? What should I put after case '/n':? I was thinking "return is;" funny though. It seems to be working properly without making any modification there. Thanks – Miek Dec 26 '12 at 04:29
  • @Miek - I think it is system dependent whether it works or not. On Windows you would use the `\r` branch for line breaks, but on Linux you end up in the `\n` branch. By "separate the cases" I just meant that `\n` and `EOF` are separate, and therefore need separate code. – Bo Persson Dec 26 '12 at 08:08
  • @Bo. Yeah I knew what you meant as far as that, I was wondering what the code might look like for the n/ case.(do nothing or return is; or sb->sgetc(); return is; etc) The Users of the code I am working on will always be on an intel mac, but where they get the files that read in could be from any platform. The do nothing option seems to be working. Is that concerning? Thanks for the help. – Miek Dec 26 '12 at 17:47
  • I have updated my code http://stackoverflow.com/questions/6089231/getting-std-ifstream-to-handle-lf-cr-and-crlf/6089413#6089413 following your suggestion. – Johan Råde Dec 27 '12 at 09:14