60

In C99, you can declare a flexible array member of a struct like this:

struct blah
{
    int foo[];
};

However, when someone here at work tried to compile some code using clang in C++, that syntax did not work. (It had been working with MSVC.) We had to convert it to:

struct blah
{
    int foo[0];
};

Looking through the C++ standard, I found no reference to flexible member arrays at all; I always thought [0] was an invalid declaration, but apparently for a flexible member array it is valid. Are flexible member arrays actually valid in C++? If so, is the correct declaration [] or [0]?

MSN
  • 4
    Can't you just use a `std::vector` member and worry about more interesting stuff? Or is this a layout issue? – fredoverflow Dec 10 '10 at 19:59
  • 3
    That [flexible-array-member](http://stackoverflow.com/questions/tagged/flexible-array-member) tag seems a bit... lonely. But maybe it's just me. – Marcus Borkenhagen Dec 10 '10 at 20:09
  • 4
    @FredOverflow: there is sometimes a need to have structures that can be used in both C and C++ (system APIs being one very common example). – Michael Burr Dec 10 '10 at 20:45
  • 3
    @FredOverflow, normally I would, but in this case, it's necessary to have a contiguous allocation for `blah` with a variable sized `foo`. It's certainly a good design question as to why we need it in the first place, which I can't get in to here. – MSN Dec 10 '10 at 21:11
  • BTW: An array of size 0 is illegal in both C and C++. – Deduplicator Sep 17 '14 at 13:43
  • @fredoverflow - If you want to represent a growing shared memory area and all you know of its contents at design-time is that it's a blob of bytes, vector is a poor choice. And that's just one example. – hoodaticus May 17 '18 at 15:34
  • It is always nice and efficient to avoid another level of indirection. Much like with bitfields, in this case I also prefer to calculate the layouts manually and be done with it. Yes, there is no compiler guaranteed type safety, but as long as you know what you are doing it is 100% fine and safe, and gives you functionality that neither the standard nor the compiler otherwise provide. – dtech May 22 '20 at 16:07
  • 4
    Actually, that specific construction is illegal according to the [C99 standard](http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf), as it states: "as a special case, the last element of a structure with **more than one** named member may have an incomplete array type; this is called a _flexible array member_." So `blah` needs an extra member, one before `foo`, to be valid in C99. – HelloGoodbye Oct 18 '20 at 20:24

10 Answers

35

C++ was first standardized in 1998, so it predates the addition of flexible array members to C (which was new in C99). There was a corrigendum to C++ in 2003, but that didn't add any relevant new features. Later revisions, up to and including C++23 (called C++2b during development), have not added flexible array members either.

StoryTeller - Unslander Monica
Martin v. Löwis
35

C++ doesn't support C99 flexible array members at the end of structures, either using an empty index notation or a 0 index notation (barring vendor-specific extensions):

struct blah
{
    int count;
    int foo[];  // not valid C++
};

struct blah
{
    int count;
    int foo[0]; // also not valid C++
};

As far as I know, C++0x (since published as C++11) does not add this, either.

However, if you size the array to 1 element:

struct blah
{
    int count;
    int foo[1];
};

the code will compile, and work quite well, but it is technically undefined behavior. You can allocate the appropriate memory with an expression that is unlikely to have off-by-one errors:

struct blah* p = (struct blah*) malloc( offsetof(struct blah, foo[desired_number_of_elements]) );
if (p) {
    p->count = desired_number_of_elements;

    // initialize your p->foo[] array however appropriate - it has `count`
    // elements (indexable from 0 to count-1)
}

So in practice it compiles under C90, C99 and C++ and works just as well as C99's flexible array members, even though indexing past the declared bound is technically undefined behavior.
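As a minimal sketch of the size calculation only, here is the more conservative form suggested in the comments, `offsetof(blah, foo)` plus the element count times `sizeof(int)`, instead of indexing inside `offsetof`. The helper names `blah_alloc_size` and `blah_create` are hypothetical, and the sketch deliberately stops short of indexing past `foo[0]`, since that is the part that remains undefined behavior.

```cpp
#include <cassert>   // assert
#include <cstddef>   // offsetof, std::size_t
#include <cstdlib>   // std::malloc, std::free

struct blah {
    int count;
    int foo[1];   // placeholder for the variable-length tail
};

// Hypothetical helper: bytes needed for a blah whose tail holds n ints in total.
// Never returns less than sizeof(blah), so the declared object always fits.
std::size_t blah_alloc_size(std::size_t n) {
    std::size_t needed = offsetof(blah, foo) + n * sizeof(int);
    return needed < sizeof(blah) ? sizeof(blah) : needed;
}

// Hypothetical helper: allocate the block and record the element count.
// The caller releases the block with std::free.
blah* blah_create(int n) {
    assert(n >= 0);
    blah* p = static_cast<blah*>(
        std::malloc(blah_alloc_size(static_cast<std::size_t>(n))));
    if (p) {
        p->count = n;
    }
    return p;
}
```

Clamping to `sizeof(blah)` is a design choice of this sketch: it keeps very small counts from producing a block smaller than the struct itself.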

Raymond Chen did a nice writeup about this: "Why do some structures end with an array of size 1?"

Note: In Raymond Chen's article, there's a typo/bug in an example initializing the 'flexible' array. It should read:

for (DWORD Index = 0; Index < NumberOfGroups; Index++) { // note: used '<' , not '='
  TokenGroups->Groups[Index] = ...;
}
NathanOliver
Michael Burr
  • 13
    However, even if you allocate excess memory you still can't validly access members outside of the array bounds of one element. The behaviour is undefined; a C++ implementation would be within its rights to add bounds checking according to the actual type of the object constructed. – CB Bailey Dec 10 '10 at 21:14
  • 2
    @Charles - I don't think you're right about that (even pedantically), otherwise the following would be undefined behavior: `int* p = malloc(sizeof(int)*4); p[3] = 0;`. – Michael Burr Dec 10 '10 at 22:01
  • 3
    @Michael: I think that the reason Charles said *one* element is not because he thinks it's impossible to allocate an array, but rather because 1 is the length of the array in your particular struct blah. The claim is that since `p->foo` is of type `blah[1]`, then `p->foo[1]` is UB. However, although `p->foo[1]` is outside the object `foo`, it isn't outside the array of `char` that was allocated with `malloc`, so it *is* inside an object. With suitable casts via `char*` read access at least would be fine. I can't remember how the standards legalese falls out, though. – Steve Jessop Dec 10 '10 at 22:23
  • Also, I can't remember whether it's legal for the structure `blah` to contain some padding after `foo`, that the implementation uses e.g. to detect buffer overruns using a magic number that should still be there later (and which perhaps is a trap representation for `int` on some exotic architecture). The implementation can't do that in an array, but possibly can in a class containing an array. Anyway, I think you need an implementation-specific guarantee to pull the trick. – Steve Jessop Dec 10 '10 at 22:24
  • There's no need for suitable casts to `char*`. `malloc()` "allocates space for an object whose size is specified by `size`". The index operator just performs pointer arithmetic based on `(p->foo)`'s type and offset. If the object being pointed to is large enough for the resulting pointer arithmetic (and `malloc()` took care of that job) then I think there is no UB. – Michael Burr Dec 10 '10 at 22:36
  • Also, just to be clear, I don't claim this 'struct-hack' technique to be valid for non-POD types. – Michael Burr Dec 10 '10 at 22:39
  • 12
    Well, I guess I'll have to eat my words. None other than WG14 has stated that this is UB (Defect Report 51: http://www.open-std.org/Jtc1/sc22/wg14/www/docs/dr_051.html). However, I contend that 1) the 'safer idiom' suggested by WG14 in DR51 is flat-out ridiculous, 2) the UB behaves as expected on all platforms that are important to me, and 3) alternatives (that also aren't UB) are less convenient and/or more error-prone to use (and therefore more likely to cause observable bugs) - so I'll likely continue to use it. But now at least I'll know I'm breaking a rule... – Michael Burr Dec 10 '10 at 22:59
  • 5
    Yes, I wasn't claiming that it isn't a useful technique, just that it's not strictly conforming. Unfortunately the safer idiom is also UB for a different reason. You can only perform pointer arithmetic on an object that actually exists and a POD object only starts to exist once memory of sufficient alignment and _size_ is allocated. If something is declared with a very large array and you allocate not enough space for it, it can't start to exist. – CB Bailey Dec 10 '10 at 23:46
  • 2
    Also, (again only talking language-lawyer curiosities) I don't believe that your `offsetof` invocation is strictly conforming either, because `foo[desired_number_of_elements]` doesn't designate a member of a hypothetical static `blah` object. I think you would have to do `offsetof(blah, foo) + desired_number_of_elements`. – CB Bailey Dec 10 '10 at 23:50
  • 1
    Anyone else notice that the DR is discussing a nonexistent `->>` operator? – Ben Voigt May 30 '13 at 15:07
5

If you can restrict your application to only require a few known sizes, then you can effectively achieve a flexible array with a template.

template <typename BASE, typename T, unsigned SZ>
struct Flex : public BASE {
    T flex_[SZ];
};
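A rough usage sketch, to show what "a few known sizes" means in practice. `PacketHeader`, `Packet4`, `Packet16`, `sum` and `example` are hypothetical names made up for illustration; only `Flex` comes from the answer. Each size is a separate type, so code that has to handle every size is itself written as a template.

```cpp
#include <cassert>
#include <type_traits>

template <typename BASE, typename T, unsigned SZ>
struct Flex : public BASE {
    T flex_[SZ];
};

// Hypothetical header type that precedes the array.
struct PacketHeader {
    int count;
};

using Packet4  = Flex<PacketHeader, int, 4>;
using Packet16 = Flex<PacketHeader, int, 16>;

// Every size instantiates a distinct type with the array stored inline.
static_assert(!std::is_same<Packet4, Packet16>::value,
              "different sizes are different types");
static_assert(sizeof(Packet16) >= sizeof(PacketHeader) + 16 * sizeof(int),
              "the array lives inside the object");

// Generic code recovers the size from the type.
template <typename BASE, typename T, unsigned SZ>
T sum(const Flex<BASE, T, SZ>& f) {
    T total = T();
    for (unsigned i = 0; i < SZ; ++i) {
        total += f.flex_[i];
    }
    return total;
}

int example() {
    Packet4 p{};                       // zero-initialized
    p.count = 4;
    for (unsigned i = 0; i < 4; ++i) {
        p.flex_[i] = static_cast<int>(i) + 1;   // 1, 2, 3, 4
    }
    assert(sum(p) == 10);
    return sum(p);
}
```

Unlike a real flexible array member, the size here cannot be picked at run time, which is the restriction the answer mentions.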
jxh
  • I don't understand your comment about "restrict your application to only require a few known sizes", I am not clear why that would that be a consideration when using this approach. Can you explain any further? – Mark Ch Dec 10 '18 at 22:25
  • @MarkCh: A template has to be instantiated at compile time, so the size of `flex_` will be fixed at compile time. But, you can use as many sizes as you want, each size will denote a different type. – jxh Dec 10 '18 at 22:37
  • 2
    Fixed and flexible sound quite contradictory to me. – dtech May 22 '20 at 15:57
4

The second form ([0]) declares an array that contains no elements of its own; it just names the address right after the other members of the struct. So if you have a structure like this:

struct something
{
  int a, b;
  int c[0];
};

you can do things like this:

struct something *val = (struct something *)malloc(sizeof(struct something) + 5 * sizeof(int));
val->a = 1;
val->b = 2;
val->c[0] = 3;

In this case c will behave as an array with 5 ints but the data in the array will be after the something structure.

The product I'm working on uses this as a sized string:

struct String
{
  unsigned int allocated;
  unsigned int size;
  char data[0];
};

On the architectures we support, this consumes 8 bytes plus `allocated` bytes.

Of course all this is C (strictly speaking, a GNU extension to C), but g++, for example, accepts it without a hitch.

terminus
  • That's pretty interesting. I imagine that you wouldn't be able to ever pass this struct to a function as value, right? as that would probably just pass the `sizeof(String)` which wouldn't take into account the size you allocated for `data`. But it should work as long as you only pass it as reference or pointer, is that right? – filipe Dec 10 '10 at 20:15
  • Well, you can pass it but it will only pass the data in the string. Also, depending on the compiler it will generate some warnings. – terminus Dec 10 '10 at 20:19
  • 5
    This is not true. `T[0]` is neither a valid type specifier in C nor in C++. You have to use `T[]`. – Johannes Schaub - litb Dec 11 '10 at 10:35
  • 1
    @Johannes it is valid in C99, see http://www.open-std.org/jtc1/sc22/wg14/www/newinc9x.htm – terminus Dec 11 '10 at 11:10
  • 5
    `Of course all this is C but g++ for example accepts it without a hitch.` That's nice for g++. Per the Standard, it's still UB and so should be killed with fire. @terminus That link only mentions the existence of flexible array members in C99 (but never C++); it does not support your contention that `[0]` is a valid syntax for declaring them, which it is not. – underscore_d Jul 03 '16 at 05:09
  • 6
    Zero-length arrays were never valid in any version of C. You can't have zero-length VLAs. It is a gcc non-standard extension. This code will not compile in standard C nor standard C++. – Lundin Dec 16 '16 at 15:15
4

If you only want

struct blah { int foo[]; };

then you don't need the struct at all and you can simply deal with a malloc'ed/new'ed int array.

If you have some members at the beginning:

struct blah { char a,b; /*int foo[]; //not valid in C++*/ };

then in C++, I suppose you could replace foo with a foo member function:

struct blah { alignas(int) char a; char b; 
    int *foo(void) { return reinterpret_cast<int*>(&this[1]); } };

Example use:

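#include <stdlib.h>
struct blah { 
    alignas(int) char a;  // raises the struct's alignment so the trailing ints are aligned
    char b;
    // accessor for the int array stored immediately after the struct
    int *foo(void) { return reinterpret_cast<int*>(&this[1]); }
};
int main()
{
    // room for the struct itself plus 10 trailing ints
    blah *b = (blah*)malloc(sizeof(blah)+10*sizeof(int));
    if(!b) return 1;
    b->foo()[1]=1;
    free(b);
}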

There are no strict aliasing issues with this type of casting of the memory past the initial struct here because the memory is dynamic (has no declared type).
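The same idea can be wrapped in a small creation helper. This is only a sketch under assumptions: `Header`, `tail` and `header_create` are hypothetical names, `alignas(int)` is placed on the struct rather than on the first member (which should have the same effect on the struct's alignment), and whether the pointer conversion is strictly conforming in every C++ revision is the kind of language-lawyer question I would not bet on either way.

```cpp
#include <cassert>
#include <cstddef>   // std::size_t
#include <cstdint>   // std::uintptr_t
#include <cstdlib>   // std::malloc, std::free
#include <new>       // placement new

// Hypothetical header; its alignment is raised to that of int so that the
// address just past the struct is suitably aligned for an int array.
struct alignas(int) Header {
    char a;
    char b;
    // Accessor for the ints stored immediately after the struct.
    int* tail() { return reinterpret_cast<int*>(this + 1); }
};

// Hypothetical helper: one contiguous block holding a Header and n ints.
// Returns nullptr on allocation failure; release the block with std::free.
Header* header_create(std::size_t n) {
    void* raw = std::malloc(sizeof(Header) + n * sizeof(int));
    if (!raw) {
        return nullptr;
    }
    Header* h = new (raw) Header();   // construct the header in place
    int* t = h->tail();
    assert(reinterpret_cast<std::uintptr_t>(t) % alignof(int) == 0);
    for (std::size_t i = 0; i < n; ++i) {
        new (t + i) int(0);           // construct each trailing int as 0
    }
    return h;
}
```

Typical use would be `Header* h = header_create(10); h->tail()[9] = 42; std::free(h);`. Both types are trivially destructible, so no destructor calls are needed before freeing.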

Petr Skocik
  • It seems to me this method might have some holes as currently written. What would `alignas(int)` do if the `struct` were to have a `double` value, for example? (`double` being a type that commonly has stricter alignment requirements than `int`) – Andrew Henle May 20 '23 at 12:23
  • @AndrewHenle Substitute `double` for `int` then (both in alignas and in the cast/return-value). In general, I don't think there's anything in C or C++ that prevents you from having an array just after a malloced/aligned_alloced struct just as long as (1) there's space for it and (2) the struct has equal or larger alignment so that the "just after" (`&this[1]`) is sufficiently aligned. – Petr Skocik May 20 '23 at 12:42
  • @AndrewHenle C's flexible array members basically just (1) automatically give you an accessor (the name) instead of you having to access it via `(target*)(&this[1])` or a member method that does the same (2) allow the flexible array to possibly start already in the initial struct's end padding if the alignment requirement for the array is smaller that for the initial struct. – Petr Skocik May 20 '23 at 12:44
  • @AndrewHenle I've heard some (IMO competent) people argue that flexible array members aren't even needed and that there should be no issues indexing past a final 1-sized array provided there's space. I wouldn't bet on it and in the absence of FEM's I'd probably just do what I've described here without being stingy about every last padding byte. – Petr Skocik May 20 '23 at 12:48
  • I was merely concerned that using `alignas()` for a less-strictly-aligned type could force a more-strictly-aligned type of `struct` element to an invalid alignment. Substituting the `alignas()` to use the most-restrictive element type isn't really a good solution as a complex `struct` of other `struct`s is subject to having one of its sub-`struct`s change by adding a more restrictively aligned type. – Andrew Henle May 20 '23 at 12:50
  • 1
    *I've heard some (IMO competent) people argue that flexible array members aren't even needed and that there should be no issues indexing past a final 1-sized array provided there's space.* Yeah, no. It's UB [and it can bite you](https://lkml.org/lkml/2015/2/18/407) – Andrew Henle May 20 '23 at 12:50
  • 1
    @AndrewHenle Yeah, that's why I said I wouldn't bet on it. :) As for alignas inappropriately *reducing* alignment that can never happen (def. with _Alignas, I'm presuming C++'s behaves the same). It's a constraint violation (=compiler error) to attempt to do so. – Petr Skocik May 20 '23 at 12:54
  • 1
    With that, I'd say your method here is a significant improvement on the final 1-sized array "`struct` hack". I don't see any way UB is invoked, based on your description of `alignas()`, which was my original concern. – Andrew Henle May 20 '23 at 12:59
  • @AndrewHenle Thanks. To be generic, you'd still want to avoid the compiler error for when the `alignas` to the flex-member type would wanna reduce the alignment requirement for the first struct member. So something like `alignas(alignof(flexttype)>alignof(firstmembtype) ? alignof(flexttype) : alignof(firstmembtype) ) firstmemtype firstmemb;` Then it could be packaged into a macro (or perhaps a C++ template). :) – Petr Skocik May 20 '23 at 14:05
3

A proposal is underway, and might make it into some future C++ version. See http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1039r0.html for details (the proposal is fairly new, so it's subject to change).

pqnet
1

I faced the same problem of declaring a flexible array member that can be used from C++ code. Looking through the glibc headers, I found some uses of flexible array members, e.g. in struct inotify_event, which is declared as follows (comments and some unrelated members omitted):

struct inotify_event
{
  //Some members
  char name __flexarr;
};

The __flexarr macro, in turn, is defined as

/* Support for flexible arrays.
   Headers that should use flexible arrays only if they're "real"
   (e.g. only if they won't affect sizeof()) should test
   #if __glibc_c99_flexarr_available.  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ >= 199901L
# define __flexarr  []
# define __glibc_c99_flexarr_available 1
#elif __GNUC_PREREQ (2,97)
/* GCC 2.97 supports C99 flexible array members as an extension,
   even when in C89 mode or compiling C++ (any version).  */
# define __flexarr  []
# define __glibc_c99_flexarr_available 1
#elif defined __GNUC__
/* Pre-2.97 GCC did not support C99 flexible arrays but did have
   an equivalent extension with slightly different notation.  */
# define __flexarr  [0]
# define __glibc_c99_flexarr_available 1
#else
/* Some other non-C99 compiler.  Approximate with [1].  */
# define __flexarr  [1]
# define __glibc_c99_flexarr_available 0
#endif

I'm not familiar with the MSVC compiler, but you'd probably have to add one more conditional macro depending on the MSVC version.

St.Antario
1

Flexible array members are not part of the C++ standard. That is why int foo[] or int foo[0] may not compile. While a proposal has been discussed, it was not accepted into C++23 (C++2b).

However, almost all modern compilers support them as a compiler extension.

The catch is that if you use this extension with the highest warning level (-Wall --pedantic), it may result in a warning.

A workaround is to use an array with one element and access it out of bounds. While this solution is UB by the spec (dcl.array and expr.add), most compilers will produce working code, and even clang -fsanitize=undefined is happy with it:

#include <new>
#include <type_traits>

struct A {
    int a[1];
};

int main()
{
    using storage_type = std::aligned_storage_t<1024, alignof(A)>;
    static storage_type memory;
    
    A *ptr_a = new (&memory) A;

    ptr_a->a[2] = 42;
    
    return ptr_a->a[2];
}

All that said, if you want your code to be standard compliant and not depend on any compiler extension, you will have to avoid using this feature.
ivaigult
0

Flexible array members are not supported in standard C++; however, the clang documentation says:

"In addition to the language extensions listed here, Clang aims to support a broad range of GCC extensions."

The gcc documentation for C++ says:

"The GNU compiler provides these extensions to the C++ language (and you can also use most of the C language extensions in your C++ programs)."

And the gcc documentation for C documents support for arrays of zero length.

https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html

plugwash
-2

The better solution is to declare it as a pointer:

struct blah
{
    int* foo;
};

Or better yet, to declare it as a std::vector:

struct blah
{
    std::vector<int> foo;
};
Zac Howland
  • 2
    Neither of these are easily serializable which is the whole point of flexible array members. – doron Dec 10 '10 at 20:11
  • 4
    no, int[0] does not create a pointer. See answer by terminus. – kriss Dec 10 '10 at 20:12
  • @doron: The vector solution is serializeable as vectors are guaranteed to be contiguous. Even the pointer version is fairly easy to serialize. – Zac Howland Dec 10 '10 at 20:34
  • @kriss: Edited - I submitted before I meant to. I was trying to say it allows you to create a data member that behaves like a pointer, but removed it as it could be confusing. In C++, there is really no need to even bother with this syntax "hack" that was used in C. Sorry for the confusion. – Zac Howland Dec 10 '10 at 20:38
  • 1
    @Zac Howland: No problem. I would call that an address, and it's interest is that it also allow to define a **memory aligned** zero length member in a structure. Even with C++ there is cases when it's useful, when dealing with hardware aware low level programs. And I do agree with doron about serialization. – kriss Dec 10 '10 at 21:20
  • @ZacHowland: Elements in the vector are guaranteed to be contiguous *with each other*, but are guaranteed not to be contiguous with other members of `struct blah`. The flexible-array elements on the other hand, are contiguous with other members of `struct blah`. – Ben Voigt Jun 08 '21 at 21:06