The trouble is that fgetc() and its relatives return an int, not a char:
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF.
fgetc() has to be able to return every possible valid character value plus one distinct value, EOF (which is negative, and usually but not necessarily -1).
When you read the value into a char instead of an int, one of two undesirable things happens:
If plain char is unsigned, then you never get a value equal to EOF, so the loop never terminates.
If plain char is signed, then a legitimate character, 0xFF (often ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS), is treated the same as EOF, so you detect EOF prematurely.
Either way, it is not good.
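The two failure modes can be reproduced without any file at all. The sketch below uses two hypothetical helpers (the names are illustrative, not from the original code) that store a value in a char-sized variable and then compare it with EOF, as the buggy loop does. It assumes the common case of 8-bit char, two's complement conversion, and EOF equal to -1; other platforms may behave differently.

```c
#include <stdio.h>

/* Hypothetical helpers: mimic storing fgetc()'s int result in a char
   and then comparing that char with EOF, as the buggy loop does. */

/* Behaves as if plain char were signed. */
int matches_eof_if_signed(int value)
{
    signed char c = (signed char)value; /* 0xFF typically becomes -1 */
    int widened = c;                    /* promoted back to int for the comparison */
    return widened == EOF;
}

/* Behaves as if plain char were unsigned. */
int matches_eof_if_unsigned(int value)
{
    unsigned char c = (unsigned char)value; /* EOF (-1) typically becomes 255 */
    int widened = c;                        /* always in the range 0..UCHAR_MAX */
    return widened == EOF;
}
```

Under those assumptions, matches_eof_if_signed(0xFF) is true (a real byte looks like EOF), while matches_eof_if_unsigned(EOF) is false (the real EOF is never recognized).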
The Fix
The fix is to use int c; instead of char c;.
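As a minimal sketch of a read loop with the int variable, here is a hypothetical helper, count_bytes(), that counts the bytes remaining on an already-open stream. The function name and the counting task are illustrative only; the original loop body may do something else.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining on an open stream. */
long count_bytes(FILE *stream)
{
    long count = 0;
    int c; /* int, not char: must hold every unsigned char value plus EOF */

    while ((c = fgetc(stream)) != EOF)
        count++;

    return count;
}
```

Because c is an int, a 0xFF byte arrives as 255 and does not compare equal to EOF, so a stream holding 'a', 0xFF, 'b' is counted as three bytes. If you need to tell a read error from a genuine end of file, check ferror(stream) after the loop.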
Incidentally, the fopen() call should not compile:
FILE *f = fopen('/path/to/some/file', 'rb');
should be:
FILE *f = fopen("/path/to/some/file", "rb");
Always check the result of fopen(); of all the I/O functions, it is more prone to failure than almost any other (not through its own fault, but because the user or programmer makes a mistake with the file name).
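One possible shape for that check is sketched below. The wrapper name open_for_reading() is hypothetical, and the error message relies on fopen() setting errno on failure, which POSIX specifies but the C standard itself does not guarantee.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: opens a file for binary reading and reports
   a failure on stderr. Returns NULL if the file cannot be opened. */
FILE *open_for_reading(const char *path)
{
    FILE *f = fopen(path, "rb");

    if (f == NULL) {
        /* Assumes errno was set by fopen(), as on POSIX systems. */
        fprintf(stderr, "cannot open '%s': %s\n", path, strerror(errno));
    }
    return f;
}
```

The caller still has to test the returned pointer before using it, for example by returning early from main() when it is NULL.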