
I was trying to write a macro to detect whether a struct member is a flexible array or just a regular one.

Turns out clang treats flexible array types as incomplete (yet-to-be-sized) array types.

An incomplete array type can be detected because, across separate compatibility tests, it is compatible with arrays of several different specific sizes, whereas a complete array type is compatible with only one:

#define ISCOMPWITHARRAYOFN(LVAL,N) _Generic((typeof((LVAL)[0])(*)[N])0, default:0,typeof(&(LVAL)):1)
#define ISINCOMPLETE_ARRAY(LVAL) ( ISCOMPWITHARRAYOFN(LVAL,1) && ISCOMPWITHARRAYOFN(LVAL,2) )

extern char incomplete[];
extern char complete[1];

//accepted by both gcc and clang
_Static_assert(ISINCOMPLETE_ARRAY(incomplete),"");
_Static_assert(!ISINCOMPLETE_ARRAY(complete),"");

This means that on clang, I can have:

#define ISFLEXIBLE(type,member) ISINCOMPLETE_ARRAY((type){0}.member)

struct flexed{ int a; char m[]; };
struct unflexed0{ int a; char m[1]; };
struct unflexed1{ int a; char m[1]; int b; };

//both GCC and clang accept these:
_Static_assert(!ISFLEXIBLE(struct unflexed0,m),"");
_Static_assert(!ISFLEXIBLE(struct unflexed1,m),"");

//only clang accepts these
_Static_assert(ISFLEXIBLE(struct flexed,m),"");

_Static_assert(ISCOMPWITHARRAYOFN((struct flexed){0}.m,1),"");
_Static_assert(ISCOMPWITHARRAYOFN((struct flexed){0}.m,2),"");

but GCC does not accept this.

My question: which of gcc/clang behaves incorrectly here, and is it possible to write a (perhaps nonstandard) ISFLEXIBLE(type,array_typed_member) macro that works on both compilers?

https://godbolt.org/z/e8jb19bbK

Petr Skocik
  • Weakly related: https://stackoverflow.com/questions/72887333/detect-that-a-struct-contains-a-flexible-array-member – Petr Skocik May 20 '23 at 10:31
  • For a minimal reproducible example, you should reduce this to a single `_Generic` that GCC and Clang process differently. – Eric Postpischil May 20 '23 at 11:18
  • @EricPostpischil: Done. Split off into a `#define ISCOMPWITHARRAYOFN(LVAL,N) _Generic((typeof((LVAL)[0])(*)[N])0, default:0,typeof(&(LVAL)):1)` macro. On clang, `_Static_assert(ISCOMPWITHARRAYOFN((struct flexed){0}.m,N),"");` holds for `N ∈ {1,2...}`. On gcc it doesn't. – Petr Skocik May 20 '23 at 11:33
  • Experimenting shows GCC treats the flexible array member as an array of zero length rather than an array of unknown length. – Eric Postpischil May 20 '23 at 11:41
  • FYI: similar techniques: [1](https://stackoverflow.com/q/49480442/1778275), [2](https://stackoverflow.com/a/65523196/1778275), [3](https://stackoverflow.com/a/5672637/1778275). – pmor Jun 23 '23 at 13:47

1 Answer


The problem can be reproduced with the single line:

_Static_assert(_Generic(&(struct {int a, m[]; }){0}.m, default: 0, int (*)[1]: 1), "");

Clang accepts this. GCC does not. If the array size in int (*)[1] is changed to 0, GCC accepts it (and so does Clang). This shows that Clang treats the flexible array member as an array of unknown length, whereas GCC treats the flexible array member as an array of zero length.
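The probe above can be turned into a small check that reports which model a compiler uses. This is only a sketch: the names FAM_LVALUE, FAM_MATCHES and fam_length_model are made up for illustration, and the size-0 test relies on zero-length arrays, which are a GNU extension rather than standard C.

```c
#include <assert.h>

/* Hypothetical helper: an unevaluated lvalue naming a flexible array member. */
#define FAM_LVALUE ((struct { int a; int m[]; }){0}.m)

/* 1 if &FAM_LVALUE is compatible with int (*)[N], else 0.
   N == 0 relies on the GNU zero-length array extension. */
#define FAM_MATCHES(N) _Generic(&FAM_LVALUE, int (*)[N]: 1, default: 0)

enum fam_model { FAM_OTHER = -1, FAM_ZERO_LENGTH = 0, FAM_UNKNOWN_LENGTH = 1 };

/* Per the observations above, size 0 is accepted by both compilers. */
static_assert(FAM_MATCHES(0), "size 0 expected to match");

/* Classifies how the compiler types the flexible array member. */
enum fam_model fam_length_model(void)
{
    if (FAM_MATCHES(1) && FAM_MATCHES(2))
        return FAM_UNKNOWN_LENGTH;  /* compatible with any size (Clang, as described above) */
    if (FAM_MATCHES(0))
        return FAM_ZERO_LENGTH;     /* compatible only with size 0 (GCC, as described above) */
    return FAM_OTHER;
}
```

Each size gets its own _Generic because a controlling expression may be compatible with at most one association in a single generic selection.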

C 2018 6.7.2.1 18 says:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member… However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed;… If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.

So, the last member of the structure has incomplete array type, as Clang treats it. But then we must consider the statements about an expression using . or ->. Perhaps these are solely intended as statements about behavior of the program rather than the type of the expression, that is, statements about the run-time actions such as reading and writing the array elements. If so, Clang is correct to treat it as an incomplete array type.

However, if the statements are intended to specify the type, then GCC is still wrong, since, if the size allows for no elements, as the compound literal presumably does, then it ought to be treated as an array of one element. But GCC treats it as an array of zero elements.

Further experimenting reveals that even if the structure is created using an initializer with elements for the flexible array member (which is a GCC extension), GCC still treats its type as an array of zero elements, even though it actually has more.

So GCC does not conform to the C standard.

Clang’s interpretation is reasonable: treating the array as complete when it is referred to with . or -> would, in some circumstances (such as when a function receives a pointer to one of these structures, which may point to memory with plenty of space for the array elements), require knowledge not available to the compiler at compile time.
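Combining the two observations above (Clang matches any size, GCC matches size 0) suggests a nonstandard check that may work on both compilers. The sketch below reuses the macro names from the question; it spells typeof as __typeof__, which is an assumption made here so the code also builds in strict ISO modes of GCC and Clang.

```c
#include <assert.h>

/* 1 if a pointer to LVAL is compatible with a pointer to an array of N
   elements of LVAL's element type, else 0. */
#define ISCOMPWITHARRAYOFN(LVAL, N) \
    _Generic((__typeof__((LVAL)[0]) (*)[N])0, \
             default: 0, __typeof__(&(LVAL)): 1)

/* Unknown length (matches sizes 1 and 2) or zero length (matches size 0). */
#define ISINCOMPLETE_ARRAY(LVAL) \
    ((ISCOMPWITHARRAYOFN(LVAL, 1) && ISCOMPWITHARRAYOFN(LVAL, 2)) \
     || ISCOMPWITHARRAYOFN(LVAL, 0))

#define ISFLEXIBLE(type, member) ISINCOMPLETE_ARRAY((type){0}.member)

struct flexed    { int a; char m[]; };
struct unflexed0 { int a; char m[1]; };
struct unflexed1 { int a; char m[1]; int b; };

static_assert(ISFLEXIBLE(struct flexed, m), "flexible member not detected");
static_assert(!ISFLEXIBLE(struct unflexed0, m), "m[1] misreported");
static_assert(!ISFLEXIBLE(struct unflexed1, m), "m[1] misreported");
```

One caveat of this design: a member declared with the GNU zero-length form, such as char m[0], would also be reported as flexible, since it matches size 0.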

Eric Postpischil
  • Thank you. Redefining `ISINCOMPLETE_ARRAY(LVAL)` to `#define ISINCOMPLETE_ARRAY(LVAL) ( (ISCOMPWITHARRAYOFN(LVAL,1) && ISCOMPWITHARRAYOFN(LVAL,2)) || ISCOMPWITHARRAYOFN(LVAL,0))` makes it work on flexible array members on both compilers (also tinycc): https://godbolt.org/z/zr9MY6b7f. – Petr Skocik May 20 '23 at 16:10
  • Re: "GCC does not conform to the C standard": does the GCC team know about this case? In other words, is there a bug report for it? – pmor Jun 23 '23 at 13:41
  • About Clang: why static assertion is passed in case of `int (*)[1]`? The `m` has incomplete array type, whereas the `int (*)[1]` is a pointer to a complete array type. Note that static assertion is passed in both GCC and Clang in case of `int (*)[]`. – pmor Jun 23 '23 at 14:15
  • Extra: about the "the last element of a structure with more than one named member". Where is it required that such named member shall have a complete type? – pmor Jun 23 '23 at 14:17
  • @pmor: Re “About Clang: why static assertion is passed in case of `int (*)[1]`? The `m` has incomplete array type, whereas the `int (*)[1]` is a pointer to a complete array type.”: `_Generic` selects a compatible type; it is not required to be the same type. Two types are compatible, roughly speaking, if they can be completed to be the same type. `int (*)[]` can be completed to be `int (*)[1]`. – Eric Postpischil Jun 23 '23 at 14:49
  • @pmor: Re “Where is it required that such named member shall have a complete type?”: The sentence you partially quoted says “A structure or union shall not contain a member with incomplete or function type…”. Therefore, if a member is in a structure or union, it shall not have incomplete type or function type, so it must have complete type, unless it qualifies for the exception following in that sentence, “… except that the last member of a structure with more than one named member may have incomplete array type…” – Eric Postpischil Jun 23 '23 at 14:52
  • @EricPostpischil Thanks! Btw, I've quoted C11, 6.7.2.1p18 whereas you've probably thought that I've quoted C11, 6.7.2.1p3. There are similar wordings: C11, 6.7.2.1p18: "the last element of a structure with more than one named member...", C11, 6.7.2.1p3: "the last member of a structure with more than one named member...". Here we see that "element of a structure" and "member of a structure" are synonyms. (Why use synonyms at all?) – pmor Jun 24 '23 at 12:00
  • About `_Generic`. Indeed, there is "compatible with the type" (C11, 6.5.1.1p3) instead of "equal to the type". I forgot that `_Generic` selects a compatible type. – pmor Jun 24 '23 at 12:04
  • 1) Does the GCC team know about the bug? 2) FYI: MSVC fails to compile this code: `int x[]; _Static_assert(_Generic(&x, default:0, int(*)[1]:1 ), "");`. Reported to the MSVC team. – pmor Jun 30 '23 at 15:46