I have a text file containing binary numbers, one per line. I'm trying to read them and sum them up in a C++ program. I've written a function, decimal(), that converts each number to decimal before it is added, and I'm confident that function is correct. Here's my problem: the two ways of reading the file shown below give different results, and only one of those results is right:

ifstream file;
file.open("sample.txt");
int sum = 0;
string BinaryNumber;
while (!file.eof()){
    file >> BinaryNumber;
    sum+=decimal(BinaryNumber);
}

This way my sum comes out too large, though only by a small amount.

ifstream file;
file.open("sample.txt");
int sum = 0;
string BinaryNumber;
while (file >> BinaryNumber){
    sum+=decimal(BinaryNumber);
}

This way gives me the right sum. After some testing I concluded that the while loop with eof() makes one more iteration than the other while loop. So my question is: what is the difference between these two ways of reading from a text file? Why does the first while loop give me the wrong result, and what is the extra iteration it performs?

Captain Obvlious
qiubit
  • `file >> variable` checks the `file`'s state after input. Whenever you have wrong input, or you hit the EOF, the condition is `false`. – πάντα ῥεῖ Apr 19 '14 at 20:29
  • Your first example fails because `eof()` reports the status of the previous read operation and does not return true until _after_ you have tried to read past the end of the stream. – Captain Obvlious Apr 19 '14 at 20:31

2 Answers

7

The difference is that >> reads the data first and then tells you whether the read succeeded, while file.eof() performs its check before the read. That is why you get one extra read with the file.eof() approach: that last read fails, BinaryNumber keeps its previous value, and the value is added to the sum a second time.
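To see the extra iteration in isolation, here is a minimal sketch of both loops. It makes a few assumptions that are not in the question: the function names are made up, a std::istringstream stands in for the file, and decimal() is a hypothetical stand-in built on std::stoi, since the real decimal() is not shown.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the question's decimal(): parses a string of
// '0'/'1' characters as a base-2 integer.
int decimal(const std::string& bits) {
    return std::stoi(bits, nullptr, 2);
}

// Checks eof() before reading, as in the question's first loop.
int sumWithEofCheck(std::istream& in) {
    int sum = 0;
    std::string binaryNumber;
    while (!in.eof()) {
        in >> binaryNumber;            // the final attempt fails and leaves
        sum += decimal(binaryNumber);  // binaryNumber holding the old value
    }
    return sum;
}

// Checks the result of the extraction itself, as in the second loop.
int sumWithExtractionCheck(std::istream& in) {
    int sum = 0;
    std::string binaryNumber;
    while (in >> binaryNumber) {       // body runs only after a successful read
        sum += decimal(binaryNumber);
    }
    return sum;
}
```

For the input "101\n11\n" (5 and 3, each followed by a newline), sumWithEofCheck should return 11 because 3 is counted twice, while sumWithExtractionCheck returns 8.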

You can modify the file.eof() code to make it work by moving the check to a place after the read, like this:

// This code has a problem, too!
while (true) {            // We do not know if it's EOF until we try to read
    file >> BinaryNumber; // Try reading first
    if (file.eof()) {     // Now it's OK to check for EOF
        break;            // We're at the end of file - exit the loop
    }
    sum+=decimal(BinaryNumber);
}

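However, this code breaks if no delimiter follows the last data entry: the final read then succeeds but also hits the end of the file, so file.eof() is already true and the last value is skipped. So your second approach (i.e. checking the result of >>) is the correct one.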
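The failure mode of the check-after-read loop can be reproduced with a short sketch. The function name is hypothetical and a std::istringstream stands in for the file; instead of summing, it collects the tokens the loop actually processes so the missing one is easy to spot.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads tokens with the "read, then check eof()" pattern and returns the
// tokens that the loop body actually got to process.
std::vector<std::string> readWithEofAfterRead(std::istream& in) {
    std::vector<std::string> processed;
    std::string token;
    while (true) {
        in >> token;        // try reading first
        if (in.eof()) {     // true even when this read succeeded but ran
            break;          // into the end of the input while extracting
        }
        processed.push_back(token);
    }
    return processed;
}
```

With the input "101\n11\n" both tokens are processed. With "101\n11" (no trailing newline) only "101" is, because extracting "11" reaches the end of the input and sets the EOF flag on a read that otherwise succeeded.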

EDIT: This post was edited in response to this comment.

Sergey Kalinichenko
  • Do they both check `eof`, or does the `>>` version check `bad`? – user541686 Apr 19 '14 at 20:31
  • @Mehrdad The `>>` action depends on whether or not you have C++11: before C++11, it called `operator void*`; starting with C++11, it calls `operator bool()` ([link](http://stackoverflow.com/a/8117635/335858)). – Sergey Kalinichenko Apr 19 '14 at 20:34
  • Actually, checking for `eof()` _after_ reading the input is **also wrong**! This time the error is in the opposite direction: if the last value isn't followed by a space it isn't processed. The only correct approach is to check the status of `file`, i.e., either the conversion to a boolean value or `fail()`. – Dietmar Kühl Apr 19 '14 at 20:35
  • @dasblinkenlight: Ah okay, though that wasn't my point. It seems like the boolean conversion uses `fail` instead of `eof` so it's not quite the same semantics in the first place. – user541686 Apr 19 '14 at 20:35
  • @DietmarKühl Thanks for a great comment! I did not realize that the "fix" is broken (I use the conversion to `bool`/ `void*` in my code, but I didn't realize that the eof check in the middle was broken). I edited the answer to explain this. – Sergey Kalinichenko Apr 19 '14 at 22:43
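The point raised in these comments, that the boolean conversion follows `fail()` rather than `eof()`, can be checked with a small sketch. It assumes C++11 or later; the `StreamState` struct and the function name are made up for illustration, and a `std::istringstream` stands in for the file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Snapshot of the flags discussed in the comments above.
struct StreamState {
    bool eof;   // in.eof()
    bool fail;  // in.fail()
    bool ok;    // the stream converted to bool, as `while (in >> x)` does
};

// Reads one whitespace-delimited token and reports the stream's state.
StreamState stateAfterOneRead(std::istream& in) {
    std::string token;
    in >> token;
    return { in.eof(), in.fail(), static_cast<bool>(in) };
}
```

For the input "11" (no trailing delimiter) the read succeeds, so `fail` is false and `ok` is true, yet `eof` is already true. For "11\n" all of `eof` and `fail` are false. For "\n" there is nothing to extract, so `eof` and `fail` are both true and `ok` is false. In each case `ok` is the negation of `fail`, independent of `eof`.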
2

When you use file.eof() to test the input, the last read probably fails, so the value stays unchanged and is processed twice. When reading a string, the stream first skips leading whitespace and then reads characters until it finds whitespace. Assuming the last value is followed by a newline, the stream hasn't touched EOF yet after reading it, i.e., file.eof() isn't true, but the next attempt to read a string fails because there are no non-whitespace characters left.

When using file >> value, the operation is executed and then checked for success: always use this approach! The only use of eof() is to determine, after a read has failed, whether the failure was due to EOF being hit or to something else.
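As a rough sketch of that last point, the loop below checks eof() only after the extraction has failed. It reads plain ints rather than the question's binary strings, because an int extraction can fail on malformed text while a string extraction cannot; the enum and function names are assumptions made for illustration.

```cpp
#include <istream>
#include <sstream>

// Why the read loop stopped.
enum class ReadResult { ReachedEnd, StoppedEarly };

// Sums whitespace-separated ints; eof() is consulted only once the
// extraction has failed, to tell "no more input" from "bad input".
ReadResult sumInts(std::istream& in, int& sum) {
    sum = 0;
    int value;
    while (in >> value) {
        sum += value;
    }
    return in.eof() ? ReadResult::ReachedEnd : ReadResult::StoppedEarly;
}
```

For "1 2 3\n" this yields a sum of 6 and ReachedEnd. For "1 2 x 3" it yields a sum of 3 and StoppedEarly, because the extraction fails on "x" before the end of the input is reached.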

Dietmar Kühl