3

Why are islower() and friends required to handle EOF, whereas putchar() and friends don't have to?

Why doesn't islower() convert its int argument to unsigned char, the way putchar() does? That would make total sense, because we have to check for EOF first anyway. See also "Why the argument type of putchar(), fputc(), and putc() is not char?"

Toby Speight
Igor Liferenko
  • `unsigned char` is a thing of ancient times. With its length of 8 bits, it is not capable of working with non-Western languages, accented letters, emojis etc. – Codo Nov 21 '16 at 09:44
  • @Codo: and `islower()` does, in the standard library? – Jongware Nov 21 '16 at 15:34
  • @RadLexus: I'm not really sure. I think it does if the locale is set correctly. – Codo Nov 21 '16 at 15:56

2 Answers

4

because we have to check for EOF first anyway.

We absolutely don't.

int c;
while (isspace(c = fgetc(fp)))
    ;
if (c == EOF) ...

This is totally legitimate code to skip whitespace. Checking each character for EOF separately is a waste of time.

The ctype functions are made to handle EOF specifically to enable code like this.
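For reference, the same loop can be wrapped in a function. This is only a minimal sketch; the name `skip_whitespace` is made up for illustration and is not from any library.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: skips leading whitespace on fp and returns the
   first non-whitespace character, or EOF if the input ends (or a read
   error occurs) before one is found. */
int skip_whitespace(FILE *fp)
{
    int c;

    /* fgetc() returns either an unsigned char value converted to int,
       or EOF. isspace() is defined for both and returns 0 for EOF, so
       the loop ends at end of input without a per-character EOF test. */
    while (isspace(c = fgetc(fp)))
        ;
    return c;
}
```

The caller tests the result against EOF once, after the loop, exactly as in the snippet above.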

See also this question.

n. m. could be an AI
  • Looking at [a random implementation](http://research.microsoft.com/en-us/um/redmond/projects/invisible/include/ctype.h.htm) of `isspace` & friends suggests that `-1` is *explicitly* taken care of: `#define isspace(c) ((__ctype+1)[(unsigned int)c]&_S)`. Here, `__ctype` is a simple array, but note the rather sneaky `+1`! – Jongware Nov 21 '16 at 15:38
  • "The ctype functions are made to handle EOF specifically to enable code like this." - where is this written in the standard? This is just your opinion. Things are more rational than that. – Igor Liferenko Nov 23 '16 at 01:40
  • 1
    @IgorLiferenko The standard rarely mentions rationales for its decisions, you need to go elsewhere to find them. Folk wisdom is one source. – n. m. could be an AI Nov 23 '16 at 05:35
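
The `+1` trick from the first comment above can be sketched roughly as follows. This is an illustration, not the code of any real libc: the names `my_ctype`, `MY_SPACE`, `MY_DIGIT`, `my_isspace` and `my_isdigit` are invented, and the sketch assumes 8-bit bytes and `EOF == -1`. It also indexes with a plain `int` rather than the `(unsigned int)` cast used in the quoted macro.

```c
#include <stdio.h>   /* for EOF */

/* Assumption of this sketch: EOF is -1, as on most implementations. */
_Static_assert(EOF == -1, "this sketch assumes EOF == -1");

/* Hypothetical class bits, standing in for flags such as _S. */
enum { MY_SPACE = 1, MY_DIGIT = 2 };

/* 257 entries: slot 0 serves EOF, slots 1..256 serve bytes 0..255.
   Entries that are not listed are zero, so EOF belongs to no class. */
static const unsigned char my_ctype[257] = {
    [1 + ' ']  = MY_SPACE, [1 + '\t'] = MY_SPACE, [1 + '\n'] = MY_SPACE,
    [1 + '\v'] = MY_SPACE, [1 + '\f'] = MY_SPACE, [1 + '\r'] = MY_SPACE,
    [1 + '0'] = MY_DIGIT, [1 + '1'] = MY_DIGIT, [1 + '2'] = MY_DIGIT,
    [1 + '3'] = MY_DIGIT, [1 + '4'] = MY_DIGIT, [1 + '5'] = MY_DIGIT,
    [1 + '6'] = MY_DIGIT, [1 + '7'] = MY_DIGIT, [1 + '8'] = MY_DIGIT,
    [1 + '9'] = MY_DIGIT
};

/* c must be EOF or in 0..255; (my_ctype + 1)[-1] is my_ctype[0]. */
int my_isspace(int c) { return (my_ctype + 1)[c] & MY_SPACE; }
int my_isdigit(int c) { return (my_ctype + 1)[c] & MY_DIGIT; }
```

With this layout, handling EOF costs one extra table entry and no extra branch, which fits the idea that accepting EOF was cheap to support.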
1

None of the character type functions is required to handle EOF, other than ignoring it (i.e. returning false). In fact, the EOF marker is not even mentioned in the <ctype.h> header documentation.

The most likely reason for character classification function signatures to use int in place of char, signed or unsigned, is to avoid implementation-defined behavior in loops like this:

int c;
while ((c = getchar()) != EOF) {
    if (islower(c)) {
        ...
    } else if (isdigit(c)) {
        ...
    }
}

This would compile and run with islower(char) instead of islower(int), but the result would be implementation-defined, which is not desirable under such basic circumstances. Essentially, int in the signature of getchar became "contagious," getting into signatures of functions only marginally related to it.
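
The other side of this shows up when the characters come from a `char` buffer instead of `getchar()`. Plain `char` may be signed, and C11 7.4 only defines the ctype functions for arguments that are representable as `unsigned char` or equal to EOF, so the usual idiom is to cast first. A minimal sketch, with `count_lower` being a made-up helper name:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the lowercase letters in a
   NUL-terminated string according to the current locale. */
size_t count_lower(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; ++s) {
        /* A byte such as 0xE9 is negative where char is signed; the
           cast maps it into 0..UCHAR_MAX, the range islower() accepts. */
        if (islower((unsigned char)*s))
            ++n;
    }
    return n;
}
```

Values returned by `getchar()` need no cast, because they are already either EOF or an `unsigned char` value converted to `int`.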

Sergey Kalinichenko
  • 2
    What would "ignore" mean in this context? The functions are required to accept EOF and return `false` for it. – n. m. could be an AI Nov 21 '16 at 10:24
  • It means returning that EOF is not a space. – Nick Nov 21 '16 at 10:47
  • 2
    Also, you have answered a slightly different question, "why the argument type of isalpha is int". The actual question is "why isalpha must be defined for EOF". For any other int value which does not fit unsigned char the result is undefined. – n. m. could be an AI Nov 21 '16 at 10:55
  • 1
    The EOF marker [*is* mentioned in the POSIX standard](http://pubs.opengroup.org/onlinepubs/009695399/functions/islower.html) and the [C language standard](http://port70.net/~nsz/c/c11/n1570.html#7.4) as well. The ctype functions *are* required to handle EOF without causing undefined behavior, so you're not answering the question. – nwellnhof Nov 21 '16 at 11:12
  • @nwellnhof You have no clarity on what's undefined behavior, and how it is different from implementation-defined behavior. – Sergey Kalinichenko Nov 21 '16 at 11:45
  • I think that the "contagious" behavior is not the primary reason. I think the things are more rational. See for example http://stackoverflow.com/a/40732303/1487773/ – Igor Liferenko Nov 22 '16 at 05:14