In my experience, when working with <stdio.h>, the precise semantics of the "eof" and "error" bits are very, very subtle, so much so that it's not usually worth it (it may not even be possible) to try to understand exactly how they work. (The first question I ever asked on SO was about this, although it involved C++, not C.)
I think you know this, but the first thing to understand is that the intent of feof() is very much not to predict whether the next attempt at input will reach the end of the file. The intent is not even to say that the input stream is "at" the end of the file. The right way to think about feof() (and the related ferror()) is that they're for error recovery, to tell you a bit more about why a previous input call failed.
And that's why writing a loop involving while(!feof(fp)) is always wrong.
But you're asking about precisely when fscanf hits end-of-file and sets the eof bit, versus getc/fgetc. With getc and fgetc, it's easy: they try to read one character, and they either get one or they don't (and if they don't, it's either because they hit end-of-file or encountered an i/o error).
But with fscanf it's trickier, because depending on the input specifier being parsed, characters are accepted only as long as they're appropriate for the input specifier. The %s specifier, for example, stops not only if it hits end-of-file or gets an error, but also when it hits a whitespace character. (And that's why people were asking in the comments whether your input file ended with a newline or not.)
I've experimented with the program
    #include <stdio.h>

    int main(void)
    {
        char buffer[100];
        FILE *stream = stdin;
        while(!feof(stream)) {
            fscanf(stream,"%s",buffer);
            printf("%s\n",buffer);
        }
    }
which is pretty close to what you posted. (I added a \n in the printf so that the output was easier to see, and better matched the input.) I then ran the program on the input
This
is
a
test.
and, specifically, where all four of those lines ended in a newline. And the output was, not surprisingly,
This
is
a
test.
test.
The last line is repeated because that's what (usually) happens when you write while(!feof(stream)).
But then I tried it on the input
This\n
is\n
a\n
test.
where the last line did not have a newline. This time, the output was
This
is
a
test.
Here the last line was not repeated. (The output was still not identical to the input, because the output contained four newlines while the input contained three.)
I think the difference between these two cases is that in the first case, when the input ends with a newline, fscanf reads the last line, sees the final \n, notices that it's whitespace, and returns (leaving that \n unread), but it has not hit EOF and so does not set the eof bit. In the second case, without the trailing newline, fscanf hits end-of-file while reading the last line, and so does set the eof bit, so feof() in the while() condition is satisfied, and the code does not make an extra trip through the loop, and the last line is not repeated.
We can see a bit more clearly what's going on if we look at fscanf's return value. I modified the loop like this:
    while(!feof(stream)) {
        int r = fscanf(stream,"%s",buffer);
        printf("fscanf returned %2d: %5s (eof: %d)\n", r, buffer, feof(stream));
    }
Now, when I run it on a file that ends with a newline, the output is:
    fscanf returned  1:  This (eof: 0)
    fscanf returned  1:    is (eof: 0)
    fscanf returned  1:     a (eof: 0)
    fscanf returned  1: test. (eof: 0)
    fscanf returned -1: test. (eof: 1)
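We can clearly see that after the fourth call, feof(stream) is not true yet, meaning that we'll make that last, extra, unnecessary, fifth trip through the loop. But we can see that during the fifth trip, fscanf returns -1 (the value of EOF here), indicating (a) that it did not read a string as expected and (b) that it reached end-of-file before reading anything. Because that call stored nothing, buffer still holds "test." from the fourth trip, which is why it gets printed again.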
If I run it on input not containing the trailing newline, on the other hand, the output is like this:
    fscanf returned  1:  This (eof: 0)
    fscanf returned  1:    is (eof: 0)
    fscanf returned  1:     a (eof: 0)
    fscanf returned  1: test. (eof: 1)
Now, feof is true immediately after the fourth call to fscanf, and the extra trip is not made.
Bottom line: the moral is (the morals are):
- Don't write while(!feof(stream)).
- Do use feof() and ferror() only to test why a previous input call failed.
- Do check the return value of scanf and fscanf.
And we might also note: Do beware of files not ending in newline! They can behave surprisingly differently.
Addendum: Here's a better way to write the loop:
    int r;
    while((r = fscanf(stream,"%s",buffer)) == 1) {
        printf("%s\n", buffer);
    }
When you run this, it always prints exactly the strings it sees in the input. It doesn't repeat anything; it doesn't do anything significantly differently depending on whether the last line does or doesn't end in a newline. And -- significantly -- it doesn't (need to) call feof() at all!
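Footnote: In all of this I've ignored the fact that %s with *scanf reads whitespace-delimited strings (words), not lines. I've also ignored the fact that %s with no field width overflows the buffer that's to receive it if it encounters a string that's larger than the buffer, which is undefined behavior. Giving a width, such as %99s for a 100-byte buffer, prevents that.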