Why does EOF coincide with a valid char value?

Question

As said in the comments to the answer to the question "Why gcc does not produce type mismatch warning for int and char?":

both -1 and 255 are 0xFF as 8 bit HEX number on any current CPU.

But EOF is equal to -1. This is a contradiction, because the value of EOF must not coincide with any valid 8-bit character. This example demonstrates it:

#include <stdio.h>
int main(void)
{
  char c = 255;
  if (c == EOF) printf("oops\n");
  return 0;
}

On my machine it prints oops.

How can this contradiction be explained?

`EOF` isn't a character but a state. Comparing it to a `char` does not make sense. — alk, Sep 27 '16 at 08:46
"*the value of EOF must not coincide with any valid 8-bit character.*" why not? Think twice. — alk, Sep 27 '16 at 08:48
`EOF` is an `int`, and all the functions that can return `EOF` return `int`, not `char`. Also, your program only prints `oops` if the implementation's `char` is signed. — molbdnilo, Sep 27 '16 at 08:48
`int` can represent 255 and -1 differently. There is no collision. — doug65536, Sep 27 '16 at 08:49
@alk because if we read characters from file, checking for EOF, then a valid character may indicate false end of file — Igor Liferenko, Sep 27 '16 at 09:19
@IgorLiferenko: ***All of C's `char` reader functions which might indicate `EOF` do not return a `char` but an `int`***, to which the function's result should be assigned (if you don't, you lose info, by losing certain bits). Then the result should be tested against `EOF`, and only if that test fails may the result be used as/assigned to a `char` variable. – alk, Sep 27 '16 at 09:35
Have a look at the bit pattern of a signed integer (with more bits than a `char`) that carries `-1`. You will then see where the end-of-file info is "hidden", is stored, is returned (and that it does not affect the possibility to still store 2^`char`-bit-width different character values in its lower `char`-bit-width bits). – alk, Sep 27 '16 at 09:42
Enable warnings, including `-Wconversion` and make sure you use `unsigned char`. — too honest for this site, Sep 27 '16 at 09:52
@alk: because `signed char` _can_ represent `-1` (but most likely not `255`). OP complained about not getting a diagnostic message. That should change it. — too honest for this site, Sep 27 '16 at 10:05
@Olaf: Ah, for the snippet shown! Yes, sure. I thought for something like `char c = getchar();`. For the latter a type conversion warning is expected for signed as well as for unsigned `char`s. — alk, Sep 27 '16 at 10:23
@alk: Is it just me or is today "what is this conversion?" day? I really wonder what teachers actually teach their students. Apparently not the fundamentals — too honest for this site, Sep 27 '16 at 10:27

score 2 · Accepted Answer · answered Sep 27 '16 at 08:48

When you compare an int value to a char value, the char value is promoted to an int value. This promotion is automatic and part of the C language specification (see e.g. this "Usual arithmetic conversions" reference, especially point 4). Sure, the compiler could give a warning about it, but why should it if it's a valid language construct?

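There's also the problem of the signedness of char, which is implementation-defined. If char is unsigned, c holds 255, which is promoted to the int value 255, and your condition is false. If char is signed, the initializer 255 does not fit and is converted in an implementation-defined way, typically to -1, which is why the comparison with EOF succeeds on your machine.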

Also if you read just about any reference for functions reading characters from files (for example this one for fgetc and getc) you will see that they return an int and not a char, precisely for the reasons mentioned above.

Some programmer dude

If it's a valid construct, then it turns out that EOF belongs to the valid character range, which it can't. Char and int *cannot have the same value*, even if it is the result of hidden promotion. If, for example, EOF was made to be 65535, the compiler would give a warning. For what reason is EOF `-1`? – Igor Liferenko Sep 27 '16 at 09:13
@IgorLiferenko *Why* can't an `int` and a `char` have the same value? Take the value `97` for example. No matter if it's stored in a `char`, a `short`, an `int` or even a `long long` it is stored as `97`, with the bits `01100001`. Multiple types, same value, stored exactly the same way. That the value `97` also happens to be the ASCII code for `'a'` doesn't matter, it's still stored as `97`. – Some programmer dude Sep 27 '16 at 09:27
Instead of "cannot" I wanted to say "must not". Anyway, if `EOF` was made `INT_MAX`, everything would be perfect. Obviously, it was a stupidity to make it `-1`. – Igor Liferenko Apr 03 '17 at 02:51
