14

The other day I created a post over at CodeReview. One person who answered my question suggested that I refrain from using strcasecmp() because the "function is non-standard [and] this makes [my] code non-portable." This is how I used it:

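#include <stdio.h>
#include <strings.h>    /* strcasecmp() is declared here on POSIX systems */

void startGame(void);   /* defined elsewhere in the program */

int playGame()
{
    char scanned[16];   /* large enough for "yes" plus the terminating null */

    printf("Do you wish to play tick-tack-toe?\n");
    if (scanf("%15s", scanned) != 1)   /* width limit prevents buffer overflow */
        return 1;

    if (strcasecmp(scanned, "yes") == 0)
        startGame();
    else
    {
        if (strcasecmp(scanned, "no") == 0 || strcasecmp(scanned, "nah") == 0 || strcasecmp(scanned, "naw") == 0)
        {
            printf("That's too bad!\nThis program will now end.");
            return 1;
        }
        printf("Not valid input!\nThis program will now end.");
        return 1;
    }
    return 0;
}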

Can someone explain in more depth why strcasecmp() has these limitations?

SuperGoA

4 Answers

7

strcasecmp is not in the C or C++ standard. It's defined by POSIX.1-2001 and 4.4BSD.

If your system is POSIX or BSD compliant, you'll have no problems. Otherwise, the function may be unavailable.

rost0031
  • Further, there isn't a more portable alternative to `strcasecmp()`; you might find `stricmp()` is available on some platforms, but that too is not very standard. If you need the code to be portable, write your own function to do the case-insensitive comparison. Note that names starting `str` followed by a lower-case letter are reserved by the C standard. – Jonathan Leffler Jun 29 '15 at 23:42
  • Does this mean that strcmp() is also non-standard? "Note that names starting str followed by a lower-case letter are reserved by the C standard." Don't you mean aren't? Because strcasecmp() starts with "str" and is followed by a lower-case letter. – SuperGoA Jun 29 '15 at 23:59
  • 2
    That's a complete misinterpretation. Read carefully: "Names starting with str followed by a lower-case letter are reserved by the C Standard". That means if _you_ write a function named strcasewhatever, then _you_ have just left the C Standard. strcmp is _part_ of the C Standard. – gnasher729 Jun 30 '15 at 00:04
  • 2
    @SuperGoA: I meant what I said. The C standard defines `strcmp()` as one of its functions, and also reserves other names that start with `str` and a lower-case letter. POSIX steps on that reserved name space with `strcasecmp()`, but gets away with it. If you create a function `strcasecmp()`, you are not guaranteed to get away with it, but `strCaseCmp()` or `str_casecmp()` or `str9cmp()` is OK (`str` is not followed by a lower-case letter for any of those three). Of course, you still might run into problems if the system isn't as disciplined as it is supposed to be. – Jonathan Leffler Jun 30 '15 at 00:06
  • 3
    Note that [`strcasecmp()`](http://pubs.opengroup.org/onlinepubs/7990989775/xsh/strcasecmp.html) was defined by the Single Unix Specification v2 in 1997. I'm not certain it was in POSIX at that time, but `strcasecmp()` has an even longer history than implied by the 2001 reference. – Jonathan Leffler Jun 30 '15 at 00:12
  • 2
    It's less about the platform than it is about the C compiler and its libraries. You can install MinGW on Windows and call strcasecmp just like on Unix or OS X, even though it won't work if you're using Visual Studio. Or you can just `#define strcasecmp stricmp`, or vice-versa, as needed for the toolset. – Dan Korn Jun 30 '15 at 00:13
  • This makes a lot more sense, thanks! To make sure I'm fully getting it this means that if publishers/developers of the C Standard want to claim the right to create a strcasecmp() function then POSIX would no longer claim its ownership? – SuperGoA Jun 30 '15 at 00:15
  • 1
    Well, the folks on the C standards committee could decide to take any function from another library, even one as old as POSIX or BSD, and add it to the C standard. But the "ownership" of the function is not something that developers generally need to worry too much about. – Dan Korn Jun 30 '15 at 00:18
  • 1
    If the function works with whatever toolset you're compiling it on, and on whatever platform the code is running on, then you really don't need to worry too much. If you port to another platform, this is going to be the least of your worries. – Dan Korn Jun 30 '15 at 00:19
  • 1
    @Dan Korn, this made it click in my brain! Thank you everyone :) – SuperGoA Jun 30 '15 at 00:20
  • 2
    The thing that you *might* need to worry about is, how does the runtime implementation of the function figure out which uppercase and lowercase characters are equivalent for its case-insensitive comparison? You might find that things all work fine for English strings, but if you give it strings with accents or diacriticals from other languages, such as German, French, Spanish, then whether it works may depend on the implementation, or on the "C" locale. Then there are wide-character languages like Japanese and Chinese; that's a whole 'nother story. Welcome to the Tower of Babel! – Dan Korn Jun 30 '15 at 00:25
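The `#define` idea from the comments above could be sketched roughly as follows. This is only an illustration under stated assumptions: `PLATFORM_CASECMP` and `str_casecmp` are hypothetical names (the latter chosen because `str` is followed by an underscore, so it stays out of the reserved name space), the `_MSC_VER` branch assumes Microsoft's C runtime, where the function is spelled `_stricmp`, and the other branch assumes a POSIX-like toolchain.

```c
#include <string.h>

#if defined(_MSC_VER)
/* Assumption: Microsoft's C runtime, which provides _stricmp in <string.h>. */
#define PLATFORM_CASECMP _stricmp
#else
/* Assumption: a POSIX-like toolchain that declares strcasecmp in <strings.h>. */
#include <strings.h>
#define PLATFORM_CASECMP strcasecmp
#endif

/* Hypothetical wrapper so the rest of the program uses one name everywhere. */
int str_casecmp(const char *a, const char *b)
{
    return PLATFORM_CASECMP(a, b);
}
```

Calling code would then use `str_casecmp(scanned, "yes") == 0` and never mention the platform-specific name directly.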
6

Short answer: As strcasecmp() is not in the C standard library, that makes it non-standard.

strcasecmp() is defined in popular standards such as 4.4BSD, POSIX.1-2001.

The definition of case-less functions opens the door to the nit-picky details. These often involve the positive or negative result of case-less compares, not just the 0 or non-0 as used by OP. In particular:

In the POSIX locale, strcasecmp() and strncasecmp() shall behave as if the strings had been converted to lowercase and then a byte comparison performed. The results are unspecified in other locales.

The trouble lies with upper- and lower-case letters that do not have a 1-to-1 mapping. Consider a locale that has E, e and é but no É, yet toupper('é') --> 'E'. Then with "as if the strings had been converted to lowercase", 'E' has 2 choices for its lowercase form (e or é).

As a candidate portable solution consider one that round trips the letter (to upper then to lower) to cope with non 1-to-1 mappings:

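#include <ctype.h>

int SGA_stricmp(const char *a, const char *b) {
  int ca, cb;
  do {
    /* Read as unsigned char so the values are valid for toupper()/tolower(). */
    ca = *(const unsigned char *)a;
    cb = *(const unsigned char *)b;
    /* Round trip (upper, then lower) to cope with non 1-to-1 mappings. */
    ca = tolower(toupper(ca));
    cb = tolower(toupper(cb));
    a++;
    b++;
  } while (ca == cb && ca != '\0');
  return ca - cb;
}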

If you do not want to round-trip the values use:

     ca = tolower(ca);
     cb = tolower(cb);

Detail: toupper() and tolower() are only defined for int values in the range of unsigned char, plus EOF. * (unsigned char *)a is used because *a may have a negative value when char is signed.
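As a rough sketch of how such a comparison might replace the chain of strcasecmp() calls in the question, the following uses the tolower()-only variant. The names `ci_compare` and `ci_matches_any` are hypothetical and not from the answer, and the behavior described assumes the default "C" locale.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical name; same loop idea as above, using tolower() only. */
int ci_compare(const char *a, const char *b)
{
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;
    int ca, cb;
    do {
        ca = tolower(*pa);
        cb = tolower(*pb);
        pa++;
        pb++;
    } while (ca == cb && ca != '\0');
    return ca - cb;   /* negative, zero, or positive, like strcmp() */
}

/* Returns 1 if input equals any of the n words, ignoring case; else 0. */
int ci_matches_any(const char *input, const char *const words[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (ci_compare(input, words[i]) == 0)
            return 1;
    }
    return 0;
}
```

With a table such as `{"no", "nah", "naw"}`, the question's long condition could become a single `ci_matches_any(scanned, no_words, 3)` call.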

chux - Reinstate Monica
1

"The function is non-standard" means that the function's declaration and contract aren't specified in the International C Standard.

"This makes code non-portable" means that implementations aren't required to provide strcasecmp(), so your code is not fully standard-compliant and is not guaranteed to compile with strictly standard-conforming compilers.

strcasecmp() is itself a part of the POSIX.1-2001 and 4.4BSD specifications (link).

набиячлэвэли
  • I'm sorry, but I don't understand this: "This makes code non-portable means, that implementations aren't required to implement strcasecmp()..." Why would implementations NOT be required if strcasecmp() is non-standard? It sounds, based on the first answer, that something 'special' (POSIX.1-2001 or 4.4BSD) is needed for it to work. – SuperGoA Jun 30 '15 at 00:09
  • 1
    He means that, if the function is not part of the C standard, then a C compiler that's conformant to the standard isn't required to implement it. So your code may not compile on every "standard" C compiler in the world. – Dan Korn Jun 30 '15 at 00:21
  • 1
    @SuperGoA The International C Standard comes as a SINGLE document and describes literally EVERYTHING, that a standard-conformant C compiler MUST/MUST NOT/CAN/CAN NOT/SHOULD/SHOULD NOT do. A strictly-conforming C compiler will implement ONLY things from the standard. `POSIX` is, well, not a part of the standard, but a widely-accepted extension(specification). On most Windowses it's simply not implemented. – набиячлэвэли Jun 30 '15 at 01:09
-1

An alternative would be to canonicalize the input to lower case using tolower(), which is standard. Then you could compare it against lower-case literals with the standard strcmp().
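A minimal sketch of that approach, using only standard C, might look like this. The helper names `lowercase_in_place` and `matches_lowercase` are hypothetical, and the sketch assumes the input buffer is writable and the word being compared against is already lower case.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: lower-cases s in place and returns it. */
char *lowercase_in_place(char *s)
{
    for (char *p = s; *p != '\0'; p++) {
        /* Cast to unsigned char first: tolower() is undefined for negative values. */
        *p = (char)tolower((unsigned char)*p);
    }
    return s;
}

/* Returns 1 if input, once lower-cased, equals the already lower-case word. */
int matches_lowercase(char *input, const char *word)
{
    return strcmp(lowercase_in_place(input), word) == 0;
}
```

In the question's code this would be used as `matches_lowercase(scanned, "yes")`. Note that it modifies `scanned`, which is harmless there but may not be what you want elsewhere.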

  • 1
    Note that this can be functionally different than `strcasecmp()` when upper/lower case letters do not have 1-to-1 mappings. It depends on `strcasecmp()` specifications (it is not C standard) . Most `strcasecmp()` do use this `tolower()` approach for the default locale. – chux - Reinstate Monica Sep 21 '17 at 20:40