
The following code is rejected by GCC and Clang (godbolt link):

struct thing;

typedef enum {
  THING_TYPE_A,
  THING_TYPE_B,
} thing_type_t;

typedef struct thing_a {
  int i;
} thing_a_t;

typedef struct thing_b {
  struct thing const *t;
} thing_b_t;

typedef struct thing {
  thing_type_t type;
  union {
    thing_a_t a;
    thing_b_t b;
  } t;
} thing_t;

thing_t const *get_thing(void) {
  static const thing_t s_thing = {
    .type = THING_TYPE_B, 
    .t = {
      .b = { 
        .t = &(thing_t) { .type = THING_TYPE_A, .t = { .a = { .i = 234 } } } 
      }
    },   
  };

  return &s_thing;
}

cppreference's page on Compound Literals says:

The unnamed object to which the compound literal evaluates has static storage duration if the compound literal occurs at file scope and automatic storage duration if the compound literal occurs at block scope (in which case the object's lifetime ends at the end of the enclosing block).

I believe that explains the compile error: the anonymous thing_t whose address is used to initialize s_thing.t.b.t has automatic storage duration, so its address is not a constant expression and cannot appear in the initializer of a static object. If s_thing is moved to file scope, both Clang and GCC accept it. (There is more discussion at this SO question)

It looks like C23 will expand this by allowing constexpr to be specified inside the compound literal parentheses, which is a welcome improvement!

In the meantime, is there any way to achieve a declaration like s_thing (that is, the initialization of a static const struct that contains a pointer to another constant variable) in pre-C23 at block scope, without having to explicitly declare the anonymous thing_t as its own separate variable?

Charles Nicholson
  • Using the cast `(static const thing_t)` lets at least GCC of your Godbolt compile without error. Clang does not like the `static`. – the busybee Mar 10 '23 at 07:19
  • In general this sounds like a solid case for the KISS principle, [as recommended by Brian Kernighan](https://www.azquotes.com/picture-quotes/quote-debugging-is-twice-as-hard-as-writing-the-code-in-the-first-place-therefore-if-you-write-brian-kernighan-66-91-06.jpg). I'd rethink this whole design and look for ways to make it less contrived. – Lundin Mar 10 '23 at 09:50
  • @Lundin I don't see it as "contrived" to initialize an immutable data structure with values known at compile-time, so that the compiler can emit a simple blob of bytes into the .rodata section. That's pretty common, and explicitly and trivially supported by the C language for non-recursive data structures. Also this feature works perfectly fine at file-scope, so I reject it being a contrivance when I wish to "tuck it away" inside of a function. – Charles Nicholson Mar 10 '23 at 12:22
  • Union type punning where one of the struct members is a self-referencing pointer is quite exotic. If not for that, it would seem that the whole thing could be replaced with a plain integer array. As for storing this in the .rodata section, that is exactly what will happen if you use the example provided by dbush with a separate named variable. But for reasons unknown, you don't want that. Also... "so that the compiler can emit a simple blob of bytes into the .rodata section" is blocked by your union, because there will be padding bytes in case a pointer and an int have different sizes. – Lundin Mar 10 '23 at 12:39
  • "Type punning" is a very specific term that you are mis-using here, and it refers to intentionally violating strict aliasing rules to extract or exploit bit representations of objects. You can read about it here: https://en.wikipedia.org/wiki/Type_punning – Charles Nicholson Mar 10 '23 at 12:43
  • What I'm doing is a simple discriminated union, and your claim that using such is blocked by a compiler is also false; see this for my exact example here, simply moved up to file scope: https://gcc.godbolt.org/z/zocnhrjPb – Charles Nicholson Mar 10 '23 at 12:44
  • Additionally, this example has the semantics stripped out; you can imagine that these discriminated union sub-types have meaningful data. It's a little silly to attack an intentionally stripped-down example for being too simple. – Charles Nicholson Mar 10 '23 at 12:45
  • @CharlesNicholson No, what you are using here is known as union type punning and it does _not_ violate strict aliasing in C. As for "blocking" I meant that on a 64 bit system with 64 bit pointers and 32 bit int, the int version will get stored as 32 bits data 32 bit padding. So you will not get consecutive data allocation in .rodata, in fact you'll only utilize 50% of it while filling the other 50% with garbage. – Lundin Mar 10 '23 at 12:49
  • @Lundin please read these two articles to learn the difference between union type punning and tagged (or discriminated) unions: https://en.wikipedia.org/wiki/Type_punning and https://en.wikipedia.org/wiki/Tagged_union. You are mis-using important terminology! Union-based type punning in C and C++ is UB type-system subversion that arises from reading a different field of a union that was previously written to. Simply storing multiple orthogonal fields inside a union and using an enum to identify which one was most recently written to is a tagged (or discriminated) union. – Charles Nicholson Mar 10 '23 at 12:54
  • I also never made any claim about optimal packing; you used the word "blocked" which I interpreted to mean "made impossible". In this case I'm happy to consume the extra unused storage space consumed by the largest-possible union field, but thanks for flagging it; it's important. – Charles Nicholson Mar 10 '23 at 12:55
  • @CharlesNicholson Sigh. Please read the actual C standard ISO 9899:2018 6.5.2.3 footnote 97: _"If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”). This might be a trap representation."_ If you don't like it, you may continue to argue about terminology with the ISO 9899 working group. – Lundin Mar 10 '23 at 12:57
  • Also as per the quoted part above as well as 6.5.2.3 and 6.2.6, union type punning is _not_ UB in C but it is in C++. – Lundin Mar 10 '23 at 13:00
  • Great, we agree! Now please point to where in my example I am using type punning... – Charles Nicholson Mar 10 '23 at 13:03

2 Answers


In the meantime, is there any way to achieve a declaration like s_thing (that is, the initialization of a static const struct that contains a pointer to another constant variable) in pre-C23 at block scope, without having to explicitly declare the anonymous thing_t as its own separate variable?

No, you have effectively ruled out all the possibilities.

  • An initializer for an object with static storage duration may contain only constant expressions.

  • For an object or sub-object of pointer type, the corresponding initializer element, if any, must be specifically an address constant, which is either a null pointer constant, or an integer constant expression cast to pointer type, or a pointer to an object having static storage duration, or a pointer to a function.

  • The only objects with static storage duration but no associated identifier are the arrays to which string literals correspond and compound literals appearing at file scope.
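To make these rules concrete, here is a minimal pre-C23 sketch of initializers that do qualify as address constants. All names in it (`address_constants_t`, `get_address_constants`, `get_literal_ptr`, `add_one`, `file_scope_int`) are hypothetical and exist only for illustration, and the fixed address `0x1000` is an arbitrary value that is never dereferenced.

```c
#include <stddef.h>

/* Hypothetical objects and functions used only as pointer targets. */
static int file_scope_int = 7;
static int add_one(int x) { return x + 1; }

typedef struct {
  const void *null_ptr;     /* null pointer constant */
  const void *fixed_addr;   /* integer constant cast to pointer type */
  const int *object_ptr;    /* object with static storage duration */
  int (*function_ptr)(int); /* function designator */
  const char *string_ptr;   /* array behind a string literal */
} address_constants_t;

/* A compound literal at FILE scope has static storage duration,
   so its address is an address constant. */
static const int *const s_literal_ptr = &(const int){ 42 };

const address_constants_t *get_address_constants(void) {
  /* Every initializer element below is a constant expression, so this
     block-scope static object is accepted by a C99 or later compiler. */
  static const address_constants_t s_constants = {
    .null_ptr = NULL,
    .fixed_addr = (const void *)0x1000, /* arbitrary; never dereferenced */
    .object_ptr = &file_scope_int,
    .function_ptr = add_one,
    .string_ptr = "hello",
  };
  return &s_constants;
}

const int *get_literal_ptr(void) {
  return s_literal_ptr;
}
```

Replacing any of these elements with `&(const int){ 42 }` written inside the function body would reintroduce the error from the question, because that literal would have automatic storage duration.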

And I don't think C23's constexpr helps here. Yes, you can use constexpr in the declaration of a compound literal to get a "compound literal constant" of structure type, but as far as I can tell, that does not confer static storage duration on said object. And if the object doesn't have static storage duration, then its address is not an address constant.

However, C23 does allow you to specify storage class static for a compound literal appearing at block scope, and that has the effect one would expect: the compound literal has static storage duration. In that case, its address is an address constant, and can be used in the initializer of another object with static storage duration.
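A sketch of what that could look like with the types from the question follows. The `__STDC_VERSION__` guard and the named fallback object `s_inner` are assumptions added for illustration. Compiler support for this C23 feature varies, and a compiler may report C23 without implementing it, in which case the fallback branch is the one to use.

```c
struct thing;

typedef enum {
  THING_TYPE_A,
  THING_TYPE_B,
} thing_type_t;

typedef struct thing_a {
  int i;
} thing_a_t;

typedef struct thing_b {
  struct thing const *t;
} thing_b_t;

typedef struct thing {
  thing_type_t type;
  union {
    thing_a_t a;
    thing_b_t b;
  } t;
} thing_t;

thing_t const *get_thing(void) {
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
  /* C23: `static` inside the parentheses gives the compound literal
     static storage duration, so its address is an address constant. */
  static const thing_t s_thing = {
    .type = THING_TYPE_B,
    .t = {
      .b = {
        .t = &(static const thing_t) { .type = THING_TYPE_A, .t = { .a = { .i = 234 } } }
      }
    },
  };
#else
  /* Pre-C23 fallback: a separate named static object (hypothetical name). */
  static const thing_t s_inner = { .type = THING_TYPE_A, .t = { .a = { .i = 234 } } };
  static const thing_t s_thing = {
    .type = THING_TYPE_B,
    .t = { .b = { .t = &s_inner } },
  };
#endif
  return &s_thing;
}
```

Either branch yields the same observable result: `get_thing()` returns a `THING_TYPE_B` object whose `t.b.t` points at a `THING_TYPE_A` object holding 234.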

John Bollinger
  • Oh no, I was really hoping that the point of `constexpr` in the C23 spec was to fix this! I don't think I understand the point of an automatic `constexpr` object... – Charles Nicholson Mar 10 '23 at 12:26
  • @CharlesNicholson C23 will also add storage class specifiers to compound literals. Try this out: https://godbolt.org/z/vzff7hY1G – Lundin Mar 10 '23 at 12:46
  • Yes, thanks, if you re-read my question you'll see where I mention that explicitly, before asking if there's a way to do this in C99 :) Additionally, @John Bollinger's response, to which you're replying, casts doubt on the storage lifetime of such compound literals... – Charles Nicholson Mar 10 '23 at 12:50
  • @CharlesNicholson, I have revised my comments about C23. I still think that `constexpr` does not have the effect you supposed it would, but in C23 you have the option of using the `static` storage class specifier in a block-scope compound literal declaration to give the literal static storage duration. – John Bollinger Mar 10 '23 at 15:31
  • Thanks for the clarification, adding `static` seems to fully accomplish the goal! Now to wait patiently until C23 has widespread adoption in the embedded systems space... – Charles Nicholson Mar 10 '23 at 15:39

You're correct as to the reason for the error: before C23, compound literals at block scope always have automatic storage duration, even if you attempt to use them to initialize a static object.

If the main motivation is to not allow s_thing to be visible by that name as a global object, you can define it as a static file scope variable in a source file by itself along with get_thing to return its address.

thing.h:

struct thing;

typedef enum {
  THING_TYPE_A,
  THING_TYPE_B,
} thing_type_t;

typedef struct thing_a {
  int i;
} thing_a_t;

typedef struct thing_b {
  struct thing const *t;
} thing_b_t;

typedef struct thing {
  thing_type_t type;
  union {
    thing_a_t a;
    thing_b_t b;
  } t;
} thing_t;

thing_t const *get_thing(void);

thing.c:

static const thing_t s_thing = {
    .type = THING_TYPE_B, 
    .t = {
      .b = { 
        .t = &(thing_t) { .type = THING_TYPE_A, .t = { .a = { .i = 234 } } } 
      }
    },   
};

thing_t const *get_thing(void)
{
  return &s_thing;
}

If you really want to have it at block scope, your only option is, as you said, to use a separate named static object instead of a compound literal:

thing_t const *get_thing(void) {
  static const thing_t tmp_thing = 
          { .type = THING_TYPE_A, .t = { .a = { .i = 234 } } };
  static const thing_t s_thing = {
    .type = THING_TYPE_B,
    .t = {
      .b = {
        .t = &tmp_thing
      }
    },
  };

  return &s_thing;
}
dbush
  • I'm specifically asking if there's a way to do this at block scope- I tried to address this in my question by describing that the file-scope approach does indeed work! – Charles Nicholson Mar 10 '23 at 04:37
  • @CharlesNicholson This is the only sensible way (for now). To invent a struct with static storage duration which points at a compound literal with automatic storage duration is plain dangerous and bad design. So it is a good thing that C blocks that from compiling. By declaring a local `static` variable, you solve this problem. – Lundin Mar 10 '23 at 09:26
  • @Lundin I am not asking how to force the address of an automatic-storage-duration variable into the static const pointer. I am asking if it is possible to, inside of a function, somehow make the compound literal have static lifetime. I tried to explain that clearly in my original question, sorry if it wasn't explicit enough. – Charles Nicholson Mar 10 '23 at 12:19
  • @CharlesNicholson Not until C23, no. In C23 compound literals will be able to have storage class specifiers and also the new `constexpr` keyword. – Lundin Mar 10 '23 at 12:44