24

The strchr function in the C standard library looks for a char in a string, but its signature takes an int for the search character. Both implementations I found cast this int to a char:

char *strchr(const char *s, int c) {
    while (*s != (char)c) 
        if (!*s++)
            return 0; 
    return (char *)s; 
}

char *strchr(const char *s, int c) {
    while (*s && *s != (char)c)
        s++;
    if (*s == (char)c)
        return (char *)s;
    return NULL;
}

Does anyone know why? Why not just take a char as a parameter?

templatetypedef
Steve D
  • same reason for why [memset take an int instead of a char](https://stackoverflow.com/q/5919735/995714) – phuclv May 27 '17 at 16:15

4 Answers

43

The reasons for that are purely historical. Note that in the old days of the C language (K&R C) there was no such thing as a function prototype. A strchr function in those times would be declared as

char *strchr();

and defined in K&R style as

char *strchr(s, c)
  char *s;
  char c;
{
  /* whatever */
}

However, in the C language (in K&R C and in modern C as well), if a function is declared without a prototype (as shown above), the arguments passed in each function call are subjected to the so-called default argument promotions. Under the default argument promotions, any integer type narrower than int is converted to int (or to unsigned int if int cannot represent all of its values). That is, when the parameters are undeclared, whenever you pass a char value as an argument, this value is implicitly converted to int and is physically passed as an int. The same is true for short. (By the way, float is converted to double by the default argument promotions.) If inside the function the parameter is actually declared as a char (as in the K&R-style definition above), it is implicitly converted back to char and used as a char inside the function. This is how it worked in K&R times, and this is how it works to this day in modern C when a function has no prototype or when variadic arguments are used.
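
The promotion described above can still be observed through a variadic function. The following is only a minimal sketch: sum_chars is a hypothetical helper, not a library function. Each char argument arrives as an int and has to be fetched as an int before being converted back.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical helper: adds up `count` character arguments.
   Variadic arguments undergo the default argument promotions,
   so every char passed here physically arrives as an int. */
int sum_chars(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* va_arg(ap, char) would be undefined behavior: the argument
           was promoted, so it must be read back as an int. */
        char ch = (char)va_arg(ap, int); /* convert back, as a K&R body would */
        total += ch;
    }
    va_end(ap);
    return total;
}
```

With `char a = 'a', b = 'b';`, the call `sum_chars(2, a, b)` returns the same value as `'a' + 'b'`.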

Now enter modern C, which has function prototypes and uses modern-style function definition syntax. In order to preserve and reproduce the "traditional" functionality of strchr, as described above, we have no other choice but to declare the parameter of strchr as an int and explicitly convert it to char inside the function. This is exactly what you observe in the code you quoted, and it is exactly how the standard describes strchr: it locates the first occurrence of c (converted to a char) in the string.
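
As a rough illustration of that pattern, here is a sketch of a strchr-like function with an int parameter that is converted to char once, up front. The name find_char is an assumption, chosen only to avoid clashing with <string.h>; it is not how any particular library spells its implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for strchr. */
char *find_char(const char *s, int c)
{
    const char ch = (char)c; /* the "converted to a char" step */

    for (;; s++) {
        if (*s == ch)
            return (char *)s; /* also matches the terminator when ch is '\0' */
        if (*s == '\0')
            return NULL;      /* reached the end without a match */
    }
}
```

A caller can pass either a character constant (already an int) or a char object (converted to int by the prototype); both end up compared as char.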

Moreover, if you have an already-compiled legacy library where strchr is defined in K&R style as shown above, and you decide to provide modern prototypes for that library, the proper declaration for strchr would be

char *strchr(const char *s, int c);

because int is what the above legacy implementation expects to physically receive as c. Declaring it with a char parameter would be incorrect.

For this reason, you will never see "traditional" standard library functions expecting parameters of type char, short or float. All these functions will be declared with parameters of type int or double instead.

The very same rationale is behind the standard guarantee that char pointers and void * pointers share the same representation and alignment requirements. Relying on this guarantee, you can declare malloc as a void *-returning function and then use this declaration with a pre-compiled legacy version of the standard library in which malloc actually returned char *.


Reference: the C99 rationale, version 5.10

7.1.4 Use of library functions
/--/
All library prototypes are specified in terms of the “widened” types: an argument formerly declared as char is now written as int. This ensures that most library functions can be called with or without a prototype in scope, thus maintaining backwards compatibility with pre-C89 code

AnT stands with Russia
  • Ok. many thanks for the detailed reply. Basically, it seems to boil down to legacy issue / historical precedent. I am curious no more... – Steve D Mar 06 '10 at 23:20
  • it's not a historical reason -- it has to do with efficiency of implementation of how parameter passing is implemented for function calls, and how the default rules for type promotion work for expressions. It's all still _very_ relevant today. The answer above misleads considerably. The abstract machine model for C makes `int` the most optimal size for a reason, and smaller objects are best handled by promoting them to `int`. Care should always be taken to declare function parameters using their natural promoted types, _not_ any narrower type one might often call them with. – Greg A. Woods Nov 27 '12 at 01:11
  • 1
    @Greg A. Woods: That completely misses the point. The prototype-less declarations of standard functions in K&R C and their behavior with regard to smaller types are indisputable historical fact. The reasons it was done that way in K&R (efficiency or something else) are interesting, but irrelevant within the context of the question. The answer to the original question is, again, that when the concept of "prototype" was added to the language, the prototypes for standard functions were *forced* to avoid smaller types in order to preserve the already established legacy behavior. That's all. – AnT stands with Russia Nov 27 '12 at 02:05
  • 1
    @Greg A. Woods: The assertion that "care should always be taken to declare function parameters using their natural promoted types" makes no sense whatsoever. On the contrary, this happens to be one of the efficiency matters that is fully and easily handled by the implementation (by the compiler). Function declarations should use types that describe the author's intent as closely as possible. If the intent calls for a smaller type, the smaller type has to be used. A modern compiler will easily perform the promotion (and un-promotion) under the hood, if needed for better efficiency. – AnT stands with Russia Nov 27 '12 at 02:09
  • All of the compilers I work with these days (not so many as it once was, sadly) all give warnings for any attempt to pass a `char` or `_Bool` to a function even when a prototype is in scope, for the reasons I note in my comment on John Knoeller's answer and I also note there why not using the naturally promoted types is very error prone. Not paying heed to how the C abstract machine prefers `int` naturally has caused many a programmer a nasty headache, myself included. – Greg A. Woods Nov 27 '12 at 02:24
  • @Greg A. Woods: That must be very lazy compilers, if they did not bother to implement this straightforward and simplistic optimization "under the hood" (assuming this optimization makes sense on given hardware). I would understand compilers that issue such warnings when no prototype is in sight, but issuing this warning for an explicitly prototyped function is simply unforgivable. I'd see it as a sign of poor compiler quality. Most compilers I work with do pass `char` and `short` parameters as `int` values internally, but they don't see a reason to make an issue out of it. Neither do I. – AnT stands with Russia Nov 27 '12 at 04:41
  • In any case, this matter has no relation to the original question and the correctness of the answer. There's simply no way around what I stated in my answer, regardless of the compilers stance on the "small" parameter passing conventions. – AnT stands with Russia Nov 27 '12 at 04:43
  • as I said, the true explanation for the reason for the likes of declaring a parameter as an int instead of anything smaller are not due to any form of historical accident -- the reasons are all still current and relevant and tied to the very definition of the underlying abstract C machine and the rules for how a function call behaves when there's no prototype in scope. In the case of strchr() the reason is even simpler: `'a'` is indeed of type `int` as @John Knoeller clearly and simply states. – Greg A. Woods Nov 28 '12 at 00:13
  • 2
    @Greg A. Woods: I never said it was an accident. The reasons the old version (K&R) of the language was designed that way certainly *do* exist, but they are irrelevant within the context of this question. This question is: why does the *new* version of the language (the one with prototypes and typed parameter lists) do it that way as well? The answer: it has no choice, since it has to preserve compatibility with the old version of the standard library. Again: modern implementations have *no choice*. In "no choice" situations all other reasons are irrelevant. – AnT stands with Russia Nov 28 '12 at 00:58
  • As for the type of character constant in C being `int`... It is true and it is an interesting *side remark*. But it has absolutely no value as an answer to the above question. It is not an answer at all and referring to it as "a reason for `strchr` prototype", as you do, is... bizarre, to put it mildly. – AnT stands with Russia Nov 28 '12 at 01:01
  • 1
    you said it was "purely historical", as if to imply it would not be the case today for some reason, but it is not merely historical as it would still be the case today. Every external function should remain compatible with the way a call may be made without a prototype in scope since prototypes are still not absolutely required. Maybe if the predicted obsolescence of allowing function calls without prototypes in scope comes to be, then, and only then, will the ongoing declaration of strchr() taking an `int` as its second parameter be "purely historical". Also: find the a: `strchr(p, 'a')` – Greg A. Woods Nov 28 '12 at 01:27
  • Correction: parameters passed to such a function are subject to the _default argument promotions_ (C11 6.5.2.2), that are somewhat different from the usual arithmetic conversions. "If the expression that denotes the called function has a type that does not include a prototype, the integer promotions are performed on each argument, and arguments that have type float are promoted to double. These are called the default argument promotions." – Lundin Feb 03 '17 at 08:59
  • @AnT This question has popped up again [here](http://stackoverflow.com/questions/42018949/creating-a-simplified-version-of-strchr). Do you per chance have any reference that states that the reason `strchr` takes `int` was compatibility with non-prototype implementations? I haven't found any evidence for this by reading through the C89 rationale, so I assumed the reason was compatibility with `EOF` (which is definitely the reason ctype.h functions use `int`). Also I found some pre-ANSI correspondence by Dennis Ritchie where the function was already using `int`, before prototypes even existed. – Lundin Feb 03 '17 at 09:47
  • @Lundin The rationale for ISO C99 gives some reasons in 6.5.2.2, though `strchr` is not mentioned there because it doesn't differ from all the other affected functions. – Roland Illig Feb 03 '17 at 22:43
  • Okay so there is a valid reference in 7.1.4 of the C99 rationale. I took the liberty to add this reference to this answer through an edit. – Lundin Feb 06 '17 at 07:55
3

In C, the type of a character constant is int. For example, 'a' is of type int.
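
One way to check this is with a C11 generic selection. This is a small sketch: the function names are made up for illustration, and it assumes a compiler that supports _Generic.

```c
#include <assert.h>

/* Returns 1 if the character constant 'a' has type int (true in C,
   though not in C++, where it has type char). */
int char_constant_is_int(void)
{
    return _Generic('a', int: 1, default: 0);
}

/* Returns 1 if an object declared as char has type char. */
int char_object_is_char(void)
{
    char ch = 'a';
    return _Generic(ch, char: 1, default: 0);
}
```

Both functions return 1 when compiled as C, which is why `sizeof 'a'` equals `sizeof(int)` rather than 1.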

John Knoeller
  • 1
    If the prototype explicitly asked for a char, one would need to explicitly cast 'a' to a char: `(char)'a'`. That is of course, not optimal. – Spidey Sep 14 '12 at 14:07
  • furthermore, even if you write `(char)'a'`, you may still end up with the result of that expression being promoted to `int`, and then only possibly being re-converted back to the type given in the prototype (implementations are free to perform the usual integer promotions on function parameters even when a prototype is in scope in order to optimize the calling sequence). For variadic functions the default argument promotions are performed on all trailing arguments. It is also _not_ an _error_ to have no prototype in scope, and the default argument promotions are again performed. – Greg A. Woods Nov 27 '12 at 01:38
3

I think this can be attributed to nothing more than an accident of history. You're exactly right that char seems the obvious data type to use for the character being searched for.

In some situations in the C library, such as the getc() function, an int value is returned for the character read from input. This is not a char because an extra non-character value (EOF, usually -1) can be returned to indicate the end of the character stream.
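
A minimal sketch of why getc() needs the wider type (count_bytes is a hypothetical helper, not a library function): the loop variable must be an int so that EOF can be told apart from every valid byte value, including 0xFF.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes remaining in a stream. */
long count_bytes(FILE *fp)
{
    long n = 0;
    int c; /* int, not char: holds every unsigned char value plus EOF */

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}
```

If c were declared as a char, a 0xFF byte could compare equal to EOF on some platforms and end the loop early, or EOF might never be detected at all where char is unsigned.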

The EOF case doesn't apply to the strchr() function, but they can't really go back and change the declaration of the function in the C library now.

Greg Hewgill
-1

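int c is the character that you want to search for. The character is passed as an int, but the function converts it to char before comparing, so on a typical platform with 8-bit char only the low 8 bits take part in the search. That is why it is converted to a char inside the function.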

The strchr function looks like this:

char *strchr(const char *s, int c){
    while (*s != (char)c)
        if (!*s++)
            return 0;
    return (char *)s;
}

As you can see there is a cast of int c to (char)c.

To answer your question: when you pass a char ch, it is converted to int for the parameter c, and its value is the numeric code of the character.

So the following program should be OK:

#include<stdio.h>
#include<string.h>

int main(void){
    const char *name = "Michi";
    int c = 99; /* 99 is the ASCII code of 'c' */

    char *ret = strchr(name, c);

    printf("String after %s\n", ret);
    return 0;
}

But the following is not:

#include<stdio.h>
#include<string.h>

int main(void){
    char *name = "Michi";
    char c = '99'; /* '99' is a multi-character constant, not the character with code 99 */

    char *ret = strchr(name, c);

    printf("String after %s\n", ret);
    return 0;
}

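This is because '99' is a multi-character character constant. Its value is an implementation-defined int that does not fit in a char, so a compiler such as GCC warns about both the multi-character constant and an overflow in the implicit constant conversion.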

Michi
  • I would like to know why I get down voted again and again without an Explanation. – Michi Feb 03 '17 at 23:54
  • This doesn't address the question, which is: *why* the function has an `int` parameter in the first place instead of a `char`. Also there is no such thing as "ANSCI". – M.M Apr 07 '18 at 01:20
  • @M.M if you read the whole Question the OP needs to know and I will quote it “Why not just take a char as a parameter?” – Michi Apr 07 '18 at 11:09
  • Yes, and your answer does not address that question. You only talk about how the function which takes `int` parameter works internally – M.M Apr 07 '18 at 14:03