
In section 7.19.7.1 of C99, we have:

If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int and advances the associated file position indicator for the stream (if defined).

As I understand it, type int can have the same width as unsigned char. In such a case, can we conclude that fgetc would only function correctly if the width of int is greater than CHAR_BIT?

(with reference to the comment by blagovest), does C99 specify when the standard library is to be expected, or whether a conforming implementation can implement part but not all of the standard library?

tyty
  • "int type can have the same width as an unsigned char" - I really, really doubt that and I would be really shocked, if there's a platform, where this is true. Where did you read this? Or you understand this from that quote? Because it doesn't say such thing. – Kiril Kirov Nov 15 '11 at 09:26
  • Please don't place two questions in one. Open another one for your question about `int32_t`. – Jens Gustedt Nov 15 '11 at 09:29
  • @KirilKirov see the 24 bit machine described in http://stackoverflow.com/q/8007825/1016492 – tyty Nov 15 '11 at 09:30
  • The issue is not about object "widths", but with ranges. If `int` ranges are -32767 to 32767 and `unsigned char` ranges are from 0 to 255 (or 32767) there will be no problem with `fgetc`. **I mean range of characters in the file system, at the Operating System level** – pmg Nov 15 '11 at 09:37
  • @pmg, barring one special case, knowing the width tells one the range, and vice versa – tyty Nov 15 '11 at 09:40
  • @tyty: in the case of `sizeof(char) == sizeof(int) == sizeof(void *) == 1`, I highly doubt there will be files, an underlying operating system, much less a standard library, so this shouldn't be a concern for non-embedded code. – Blagovest Buyukliev Nov 15 '11 at 09:40
  • @tyty: right, I edited my comment to better reflect what I meant. Anyway, the width and range of objects are not necessarily (nearly) equivalent: there can be padding bits in the width. – pmg Nov 15 '11 at 09:44
  • @pmg C99 defines the precision of a signed integer to be the number of value bits and the width (of signed integer types) to be the precision + 1. So width does not count padding bits. See 6.2.6.2/6 – tyty Nov 15 '11 at 09:49
  • Ok; I didn't mean "width" as defined by the Standard (6.2.6.2): I meant it as size in bits as reported by `CHAR_BIT * sizeof (int)`. – pmg Nov 15 '11 at 10:15

2 Answers


fgetc returns EOF on an end-of-file or error condition.

Otherwise, it returns the character that was read, as an unsigned char, converted to int.

Suppose CHAR_BIT == 16 and sizeof (int) == 1, and suppose the next character read has the value 0xFFFF. Then fgetc() will return 0xFFFF converted to int.

Here it gets a little tricky. Since 0xFFFF can't be represented in type int, the result of the conversion is implementation-defined (or an implementation-defined signal is raised). In practice the result will usually be -1, which is also the usual value (in fact, the only value I've ever heard of) for EOF.

So on such a system, fgetc() can return EOF even if it successfully reads a character.
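To make the conversion step concrete, here is a small sketch that models it on an ordinary machine. It assumes `uint16_t` and `int16_t` are available and uses them as stand-ins for a 16-bit `unsigned char` and a 16-bit `int`; the function name `as_narrow_int` is made up for illustration.

```c
#include <stdint.h>

/* Models fgetc's "unsigned char converted to int" step on a hypothetical
   platform where both types are 16 bits wide. uint16_t and int16_t are
   stand-ins chosen for illustration, not the real types involved. */
static int16_t as_narrow_int(uint16_t ch)
{
    /* When ch does not fit in int16_t, the result of this conversion is
       implementation-defined (C99 6.3.1.3). Two's complement
       implementations typically wrap around. */
    return (int16_t)ch;
}
```

On the usual two's complement implementations, `as_narrow_int(0xFFFF)` compares equal to -1, while an in-range value such as `0x0041` comes through unchanged.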

There is no contradiction here. The standard states that fgetc() returns EOF at end-of-file or on an error. It doesn't say the reverse; returning EOF doesn't necessarily imply that there was an error or end-of-file condition.

You can still determine whether fgetc() read an actual character or not by calling feof() and ferror().

So such a system would break the typical input loop:

while ((c = fgetc(stream)) != EOF) {
    ...
}

but it wouldn't (necessarily) fail to conform to the standard.
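A loop that stays correct even on such a system could consult feof() and ferror() whenever fgetc() returns EOF. This is only a minimal sketch; the helper names `read_char` and `count_chars` are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reads one character from stream.
   Returns 1 and stores the character in *out if one was read,
   or 0 on a genuine end-of-file or read error. */
static int read_char(FILE *stream, int *out)
{
    int c = fgetc(stream);
    if (c == EOF && (feof(stream) || ferror(stream))) {
        return 0;   /* end-of-file or error, not a character */
    }
    *out = c;       /* on an exotic platform this may compare equal to EOF */
    return 1;
}

/* Hypothetical helper: counts the characters remaining in stream. */
static long count_chars(FILE *stream)
{
    long n = 0;
    int c;
    while (read_char(stream, &c)) {
        n++;
    }
    return n;
}
```

On a system with CHAR_BIT == 8 this behaves exactly like the usual loop; the extra check only matters where a valid character can convert to the same value as EOF.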

(with reference to the comment by blagovest), does C99 specify when the standard library is to be expected, or whether a conforming implementation can implement part but not all of the standard library?

A "hosted implementation" must support the entire standard library, including <stdio.h>.

A "freestanding implementation" needn't support <stdio.h>; it is required to provide only the standard headers that don't declare any functions (<limits.h>, <stddef.h>, etc.). But a freestanding implementation may provide <stdio.h> if it chooses.

Typically freestanding implementations are for embedded systems, often with no operating system.
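Code can test which kind of implementation it is being compiled for, because C99 predefines `__STDC_HOSTED__` as 1 for a hosted implementation and 0 otherwise. A rough sketch follows; the macro `HAVE_FULL_LIBRARY` and the function `have_full_library` are names invented for this example.

```c
/* __STDC_HOSTED__ is predefined by C99:
   1 for a hosted implementation, 0 for a freestanding one. */
#if __STDC_HOSTED__
#include <stdio.h>              /* guaranteed to exist when hosted */
#define HAVE_FULL_LIBRARY 1     /* hypothetical macro name */
#else
#define HAVE_FULL_LIBRARY 0     /* <stdio.h> may or may not be provided */
#endif

/* Hypothetical helper: reports whether the full library is guaranteed. */
static int have_full_library(void)
{
    return HAVE_FULL_LIBRARY;
}
```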

In practice, every current hosted implementation I'm aware of has CHAR_BIT==8. The implication is that in practice you can probably count on an EOF result from fgetc() actually indicating either end-of-file or an error -- but the standard doesn't guarantee it.
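If you'd rather verify than assume, the condition can be checked at compile time by comparing UCHAR_MAX with INT_MAX from <limits.h>. This is just a sketch; `EOF_IS_AMBIGUOUS` and `eof_is_ambiguous` are made-up names.

```c
#include <limits.h>

/* If every unsigned char value fits in int, fgetc() can never return a
   converted character that collides with the negative value EOF. */
#if UCHAR_MAX > INT_MAX
#define EOF_IS_AMBIGUOUS 1      /* hypothetical macro name */
#else
#define EOF_IS_AMBIGUOUS 0
#endif

/* Hypothetical helper: nonzero if an EOF result needs feof()/ferror(). */
static int eof_is_ambiguous(void)
{
    return EOF_IS_AMBIGUOUS;
}
```

A stricter variant could put `#error` in the first branch, so that code relying on the usual input loop refuses to build on such a platform.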

Keith Thompson

Yes, on such a platform there would be one unsigned char value that would not be distinguishable from EOF.

unsigned char is not allowed to have padding bits, so it would have at least as many distinct values as int, leaving no spare int value for EOF.

The only hope one could have on such a platform is that plain char would be signed, so that EOF wouldn't clash with the positive char values.

This would probably not be the only problem that such a platform would have.

Jens Gustedt