I am perfectly aware of the mechanism behind the switch statement and why an integer constant is required. What I don't understand is why the following case label is not considered an integer constant. What is it then? A non-existent variable? Can anyone categorize it? Does the C compiler really need to be so dumb?

struct my_struct {
    const int my_int;
};

switch (4) {

case ((struct my_struct) { 4 }).my_int:

    printf("Hey there!\n");
    break;

}

And of course…

error: case label does not reduce to an integer constant
  case ((struct my_struct) { 4 }).my_int:

EDIT in reply to Eugene's comment:

What's your real use case? If it is an integer constant, why make it so complicated?

I was trying to find a clever hack to switch between two-character strings, using a union instead of a struct, as in the following example:

#include <stdio.h>

union my_union {
    char my_char[sizeof(int)];
    int my_int;
};

void clever_switch (const char * const my_input) {

    switch (((union my_union *) my_input)->my_int) {

    case ((union my_union) { "hi" }).my_int:

        printf("You said hi!\n");
        break;

    case ((union my_union) { "no" }).my_int:

        printf("Why not?\n");
        break;

    }

}

int main (int argc, char *argv[]) {

    char my_string[sizeof(int)] = "hi";

    clever_switch(my_string);

    return 0;

}

…Which of course doesn't compile.

On my machine ((union my_union) { "hi" }).my_int is 26984, while ((union my_union) { "no" }).my_int is 28526. However, I cannot write these numbers myself, because they depend on the endianness of the machine (so apparently my machine is little-endian). But the compiler knows the target's endianness, and so knows exactly at compile time what number ((union my_union) { "no" }).my_int is going to be.
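The byte-order dependence described above can be observed at run time. The following is only a minimal sketch and not part of the original code: the helper name two_char_key is hypothetical, and it copies bytes with memcpy instead of casting the pointer to a union, which avoids alignment concerns.

```c
#include <string.h>

/* Hypothetical helper: packs up to sizeof(int) - 1 characters of `s`
   into an int by copying their bytes. Unused bytes stay zero, which
   mirrors the zero padding of ((union my_union) { "hi" }). */
static int two_char_key(const char *s)
{
    char bytes[sizeof(int)] = { 0 };
    int key;

    for (size_t i = 0; i + 1 < sizeof bytes && s[i] != '\0'; i++)
        bytes[i] = s[i];

    /* The resulting value depends on the byte order of the machine. */
    memcpy(&key, bytes, sizeof key);
    return key;
}
```

On a little-endian machine with a 4-byte int, two_char_key("hi") gives 26984, the value quoted above. Because the result is computed at run time, it can be compared with if, but it still cannot be used as a case label.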

The annoying thing is that I can already do it, but only using a very obscure (and slightly less efficient) syntax. The following example compiles just fine:

#include <stdio.h>

void clever_switch (const char * const my_input) {

    #define TWO_LETTERS_UINT(FIRST_LETTER, SECOND_LETTER) ((unsigned int) ((FIRST_LETTER) << 8) | (SECOND_LETTER))

    switch (TWO_LETTERS_UINT(my_input[0], my_input[1])) {

    case TWO_LETTERS_UINT('h', 'i'):

        printf("You said hi!\n");
        break;

    case TWO_LETTERS_UINT('n', 'o'):

        printf("Why not?\n");
        break;

    }

    #undef TWO_LETTERS_UINT

}

int main (int argc, char *argv[]) {

    clever_switch("hi"); /* "You said hi!" */
    clever_switch("no"); /* "Why not?" */

    return 0;

}

So the question remains: does the C compiler (or the C Standard in this case) really need to be so dumb?

madmurphy
  • What's your *real* use case? If it is an integer constant, why make it so complicated? – Eugene Sh. Apr 08 '19 at 19:10
  • Possible duplicate of [Why in C a const object is not a compile-time constant expression?](https://stackoverflow.com/questions/40062767/why-in-c-a-const-object-is-not-a-compile-time-constant-expression) – Mark Benningfield Apr 08 '19 at 19:22
  • For the most part, the C standard does not make compilers work with objects during compilation. Any compile-time evaluation just requires using simple values. In contrast to simple values, an object is, by definition, a region of storage in which the bytes represent a value. C is, or at least originally was, designed to be portable and small. One feature is cross-compilation: A compiler can be designed to run on one architecture and compile programs for execution on another architecture. The C standard has various provisions for that, such as separate character sets… – Eric Postpischil Apr 08 '19 at 19:34
  • … for compilation and execution. So consider that a C compiler that has to allow you to put things in a struct and then extract them has to emulate how they would go into a struct on the target system, not on the native system. That may seem simple for simply putting an `int` in and taking it out. But C expressions can get much more complicated. You might put an `int` in and then take one byte out, using a conversion to `char *`. We are not going to require C compilers to emulate that. Sure, it could fudge things to make it work for simple `int` operations like yours, but the C standard… – Eric Postpischil Apr 08 '19 at 19:36
  • … would have some trouble explaining what particular things had to be supported in this way. It is simpler not to support that sort of thing. And it suffices: C has been enormously successful without this feature. And even if you do not fiddle with bytes, consider what happens if you define a huge array, initialized with some values, and pluck one of them out to use as a “constant”. Now you are forcing the compiler to manipulate large amounts of data at compile time. – Eric Postpischil Apr 08 '19 at 19:37
  • @EugeneSh. I have edited my question to answer to your comment. – madmurphy Apr 08 '19 at 19:43

5 Answers


While the expression ((struct my_struct) { 4 }).my_int does indeed evaluate to 4 at run time, it is not constant. A case label needs a constant expression.

I see that you have declared my_int as const. But that only means that it can't be modified later. It doesn't mean that the expression ((struct my_struct) { 4 }).my_int is constant.

If you use an if statement instead of a switch, you will be fine.

if (((struct my_struct) { 4 }).my_int == 4) {
    printf("Hey there!\n");
}
VHS
  • I guess the OP is claiming that this expression *can* be evaluated at the compile time. – Eugene Sh. Apr 08 '19 at 19:20
  • In C99 and later, I believe `((struct my_struct) { 4 }).my_int` _does_ qualify as a _constant expression_ in the sense that the standard uses that term -- but not as an _integer constant expression_, which is what is required for `switch` cases. – zwol Apr 08 '19 at 19:21
  • @zwol, I just tried OP's code with C99 compilation option. I get the same error. – VHS Apr 08 '19 at 19:28
  • @VHS I am only nitpicking your wording. OP's code is invalid in all versions of the C standard, but not because `((struct my_struct) { 4 }).my_int` isn't a constant. It _is_ a constant [expression], but it is not an _integer constant expression_, which is a restricted subcategory of constant expressions. Case labels are required not just to be constant, but to be integer constant expressions. – zwol Apr 08 '19 at 19:48

A case label in a switch statement requires an integer constant expression, which is defined as:

An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, _Alignof expressions, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof or _Alignof operator.

The expression ((struct my_struct) { 4 }).my_int does not qualify as an integer constant expression by this definition, even though it is an integer-valued expression whose value could be determined at compile time.
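For contrast, here is a sketch of case labels that do fit the quoted definition, one for each kind of operand it lists. The names LABEL_ENUM and label_kind are hypothetical and exist only for illustration.

```c
enum { LABEL_ENUM = 100 };

/* Hypothetical helper: returns a small code telling which case label
   matched `x`, or 0 if none did. Every label below is an integer
   constant expression under the definition quoted above. */
static int label_kind(int x)
{
    switch (x) {
    case LABEL_ENUM:      return 1; /* enumeration constant */
    case 'a':             return 2; /* character constant */
    case sizeof(char[3]): return 3; /* sizeof expression, always 3 */
    case (int) 7.9:       return 4; /* floating constant as the immediate operand of a cast, value 7 */
    case 10 + 1:          return 5; /* integer constants combined by an operator */
    default:              return 0;
    }
}
```

None of these labels reads a value out of an object, which is the step that ((struct my_struct) { 4 }).my_int requires.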

Tom Karzes
  • Yours is a somewhat tautological answer. It's a bit like saying “It can't be done because the standard says so, even if theoretically it *could* be done”. My question though was “Is it really necessary that it can't be done?”… – madmurphy Apr 08 '19 at 20:24
  • You asked why it isn't considered an integer constant (expression), which I have answered. If you instead asked "why couldn't the definition of an integer constant expression in the C standard be extended to include this case", then my answer would be, it could, but it would complicate the standard and compiler implementations, and presumably this situation isn't considered important enough to justify the added complexity. – Tom Karzes Apr 08 '19 at 20:32
  • Ok, fair enough! – madmurphy Apr 08 '19 at 20:34

It's a lowest common denominator thing.

The C standard says ((struct my_struct) { 4 }).my_int doesn't satisfy the constraints imposed on case labels (namely that they be integer constant expressions), so compliant C compilers aren't required to be smart enough to be able to optimize it.

The standard doesn't prohibit a compiler from optimizing it. Indeed, that is what clang does.

Your program:

#include <stdio.h>
struct my_struct {
    const int my_int;
};

int main()
{
    switch (4) {

        case ((struct my_struct) { 4 }).my_int:

            printf("Hey there!\n");
            break;

    }
}

just works on clang, though you will get a warning if you compile it with -pedantic.

The distinction between integer constant expressions and other integer expressions matters elsewhere too. It determines whether an array is a VLA or a regular array, and switch or goto-based jumps are prohibited if they jump into the scope of a VLA. Again, a compiler can fold the expression and allow such jumps as long as at least one diagnostic is issued (clang warns about the folding, not the jump).

If you do use these constructs and your compiler won't stop you, your program won't be portable.

Finally the compile-time constness of integers can also affect types in a certain case.

The Linux kernel, I believe, uses something similar to

#define IS_CEXPR(X) _Generic((1? (void *) ((!!(X))*0ll) : (int *) 0), int*: 1, void*: 0)

to detect integer constant expressions (which, I hear, was part of a mission to weed out VLAs).

This is based on the C-standard rule that an integer constant expression equal to 0 cast to (void*) is a null pointer constant, whereas a regular integer expression cast to (void*) is just a void pointer, even if the value of the expression is known to be 0. The rules for determining the type of a ternary then distinguish between (void*) expressions and the null pointer constant, resulting in (1? (void *) ((!!(X))*0ll) : (int *) 0) being typed int * if X is an integer constant expression and void * otherwise.
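A small sketch of how that macro behaves, assuming a compiler with C11 _Generic support. The macro is copied from above; the two helper functions are hypothetical names added only for illustration.

```c
/* Macro as given in the answer above. */
#define IS_CEXPR(X) _Generic((1? (void *) ((!!(X))*0ll) : (int *) 0), int*: 1, void*: 0)

/* Hypothetical helper: 4 is an integer constant, so the cast yields a
   null pointer constant and the conditional has type int *. */
static int literal_is_cexpr(void)
{
    return IS_CEXPR(4);
}

/* Hypothetical helper: `x` is a run-time value, so the cast yields a
   plain void pointer and the conditional has type void *. */
static int param_is_cexpr(int x)
{
    return IS_CEXPR(x);
}
```

The controlling expression of _Generic is not evaluated, so the macro only inspects the type of the conditional and has no run-time cost.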

Most compilers probably won't let you get around type system violations (especially inside _Generic) so easily.

Petr Skocik
  • @madmurphy You're welcome. zwol sheds more light on the origins. I wrote more about how integer constant expressions interact with the language as it is now. It is kind of interesting how the originally semi-arbitrary decision on what needs or doesn't need to be evaluatable at compile time became a concept so interwoven into the language that none of today's major compilers even implements it fully correctly: https://twitter.com/pskocik/status/1076768533869146112 . You might be making a good point about the need for reinvention, I think. – Petr Skocik Apr 08 '19 at 21:02

Regarding the question of why the C standard doesn't allow a compiler to accept

case ((struct my_struct) { 4 }).my_int:

... we can't answer this with certainty, because nobody here is on the C committee (as far as I know, anyway) and this design decision was made something like 30 years ago, so there's a decent chance nobody who was there remembers the rationale.

But we can say these things:

  1. The original 1989 C standard intentionally left out any number of features that could have been implemented but only at significant cost in implementation complexity, compile-time memory requirements, etc. For instance, the original rationale for the distinction between "constant expression" and "integer constant expression" in the standard was that the compiler shouldn't ever need to do floating-point arithmetic at compile time.

    The feature you're asking for is roughly as difficult to implement as

    static const int CONSTANT = 123;
    ...
    switch (x) { case CONSTANT: ... }
    

    which is also not required to work in C (although it is in C++).

  2. Additions to the C standard since 1989 have been relatively small and only in response to substantial demand. In particular, "it is no longer expensive to implement this feature" is not considered enough of a reason, as far as I can tell.
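As a side note to point 1, a common way to get a named constant that does work in a case label in C is an enumeration constant (or a macro). A minimal sketch, in which matches_constant is a hypothetical helper:

```c
/* In C, `static const int CONSTANT = 123;` declares an object, not an
   integer constant expression, so it cannot portably appear in a case
   label. An enumeration constant with the same value can. */
enum { CONSTANT = 123 };

/* Hypothetical helper: returns 1 when `x` equals CONSTANT, 0 otherwise. */
static int matches_constant(int x)
{
    switch (x) {
    case CONSTANT:
        return 1;
    default:
        return 0;
    }
}
```

The enumeration constant has type int and occupies no storage, so the compiler never has to read an object to evaluate the label.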

That's the best answer I can give you.

zwol
  • It looks like the most likely answer, thanks. This convinces me more and more that the C language is so beautiful that it *deserves* to be reinvented from scratch. – madmurphy Apr 08 '19 at 20:51

Well. Why it's not possible looks to have been thoroughly explained... ...I'll just leave this here, then.

case ((int)((struct my_struct) { 4 }).my_int):

clang 9, arch-linux, x86_64

l.k