Flexible array member without having to be the last one

Question

I am trying to figure out whether there is a workaround in C to have a flexible array member in a struct that is not the last member. For example, this yields a compilation error:

typedef struct __attribute__((__packed__))
{
    uint8_t         slaveAddr;      /*!< The slave address byte */

    uint8_t         data[];         /*!< Modbus frame data (Flexible Array
                                    Member) */
    
    uint16_t        crc;            /*!< Error check value */

} rtuHead_t;

This does not yield an error:

typedef struct __attribute__((__packed__))
{
    uint8_t         slaveAddr;      /*!< The slave address byte */

    uint8_t         data[];         /*!< Modbus frame data (Flexible Array
                                    Member) */

} rtuHead_t;

typedef struct __attribute__((__packed__))
{
    rtuHead_t       head;           /*!< RTU Slave addr + data */

    uint16_t        crc;            /*!< Error check value */

} rtu_t;

But it does not work. If I have an array of bytes, data[6] = {1, 2, 3, 4, 5, 6};, and cast it to rtu_t, then the crc member will equal 0x0302, not 0x0605.

Is there any way to use the flexible array members in the middle of the struct (or struct in a struct)?

There is no work-around (apart from not using a flexible member in the first place), and I'm *almost* curious what possible reason there could possibly be to think this is needed in the first place. Regardless, flexible members *must* be the last member of their defined structure. — WhozCraig, Mar 14 '21 at 22:01
The standard requires `offsetof()` to expand to an integer constant expression. That would not be possible if a member could follow a flexible array member. — Keith Thompson, Mar 14 '21 at 23:14

tstanisl · Accepted Answer · 2021-09-03T11:11:49.680

It cannot be done in ISO C. But...

GCC has an extension that allows variably modified types (variable-length arrays) as members of a structure. So you can define something like this:

#include <stddef.h>
#include <stdio.h>

int main() {
    int n = 8, m = 20;
    struct A {
        int a;
        char data1[n];
        int b;
        float data2[m];
        int c;
    } p;

    /* offsetof() yields a size_t, so the matching conversion is %zu */
    printf("offset(a) = %zu\n", offsetof(struct A, a));
    printf("offset(data1) = %zu\n", offsetof(struct A, data1));
    printf("offset(b) = %zu\n", offsetof(struct A, b));
    printf("offset(data2) = %zu\n", offsetof(struct A, data2));
    printf("offset(c) = %zu\n", offsetof(struct A, c));
    return 0;
}

Apart from a few warnings about using non-ISO features, it compiles fine and produces the expected output:

offset(a) = 0
offset(data1) = 4
offset(b) = 12
offset(data2) = 16
offset(c) = 96

The issue is that this type can only be defined at block scope, so it cannot be used to pass parameters to other functions.

However, it can be passed to a nested function, which is yet another GCC extension. Example:

int main() {
    /* ... same as above ... */

    // nested function
    int fun(struct A *a) {
        return a->c;
    }
    return fun(&p);
}

score 2 · Answer 2 · answered Mar 14 '21 at 22:03

A flexible array member must be the last member of the struct, and a struct containing a flexible array member may not be a member of an array or another struct.

The intended use of such a struct is to allocate it dynamically, putting aside enough space for the other members plus 0 or more elements of the flexible member.

What you're attempting to do is overlay a struct onto a memory buffer that contains packet data that you want to parse simply by accessing the members. That's not possible in this case, and in general doing so is not a good idea due to alignment and padding issues.

The proper way to do what you want is to write a function that deserializes the packet one field at a time and places the result in a user-defined structure.

dbush


Hi, thank you for the answer. For the sake of completeness and leaving the alignment out of the way, I added the packed attributes. – Łukasz Przeniosło Mar 14 '21 at 22:11
@ŁukaszPrzeniosło *leaving the alignment out of the way* You can't do that safely. [Not even on x86 systems](https://stackoverflow.com/questions/47510783/why-does-unaligned-access-to-mmaped-memory-sometimes-segfault-on-amd64). "But it works" is more accurately worded as "I haven't observed it fail - ***yet***". – Andrew Henle Mar 15 '21 at 09:40

score 0 · Answer 3 · answered Mar 16 '21 at 14:11

Flexible array members can only be placed at the end of the struct. That's just how the C standard 6.7.2.1 defines them:

As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.

But in this specific case they are also the wrong solution to the wrong question, the wrong question being "how do I store a variable-size Modbus protocol frame inside a C struct?". A struct is often better avoided in the first place. We C programmers have unfortunately been pretty much brainwashed into using struct in every single situation, to the point where we just declare one without a second thought.

There are various problems with structs, most notably alignment and padding, which can only be solved with non-standard extensions like GCC's __attribute__((__packed__)) or #pragma pack(1). But even if you use those, you end up with a chunk of memory that the compiler may still access misaligned: you only told it to drop the padding ("I know what I'm doing"). If you then go ahead and access that memory word-wise, it may be a misaligned access.

Then there's the problem with variable-sized protocols. Resizing that memory chunk over and over, depending on the amount of data received, doesn't actually achieve much except bloat and execution overhead. How much memory are you saving by doing so? Some 10 to 100 bytes? That's nothing even on low-end MCUs, since you only need to keep a few frames in RAM at the same time.

It turns out that you are going to have to allocate enough memory to store the largest frame that can ever appear, since your program must handle that worst case. You might as well allocate that much memory to begin with, statically: much faster, safer and more deterministic.

And then there's yet another problem which you don't seem to address, namely network endianness. Modbus uses big endian and the CRC is calculated in big endian. So the uint16_t member at the end of the struct just sits there to create problems, even if you decide to use some non-standard GNU VLA extension in order to resize each frame.

I would advise you to forget all about these structs.

The fast, portable and safe solution is to simply use a uint8_t frame [MAX]; where MAX is the maximum size in bytes that a frame could ever have. Using a struct just to give a variable name to one specific byte in the frame doesn't actually add anything in itself. What you really want is to have readable code easily explaining what each byte does, rather than an anonymous buffer of raw data.

This can just as well be done with named indices (for example an enum) into this uint8_t array when accessing it. There's no difference in readability, purpose or machine code generated between the struct version frame.slave_addr = x; and the array version frame[slave_addr] = x;. (Except that the former might cause misaligned access in the machine code.)

You'll need to access the CRC byte by byte anyway, since you first need to calculate it with your CPU endianness, then convert it to network endianness. For example:

frame[fcs_high] = checksum >> 8; 
frame[fcs_low]  = checksum & 0xFF;

This code doesn't depend on CPU endianness, unlike the struct, which will only work as expected on big endian CPUs.

Thanks for the answer. As for the extended part, I acknowledge it, but don't agree with it. Following your approach will lead to coding in assembly, because "only then you have full control". — Łukasz Przeniosło, Mar 16 '21 at 18:19
@ŁukaszPrzeniosło The reason for not choosing structs to represent data protocol is not some "low level vs high level" thing, it's a choice between the right tool and the wrong tool. Structs were simply not intended to be used for data protocols. At the dawn of time when C was designed, nobody even predicted alignment CPUs. Let alone that C would be used for data communication or hardware-related programming. So C simply does not provide any convenient concept for memory mapping high level data. — Lundin, Mar 17 '21 at 08:03

alx - recommends codidact · Answer 4 · 2023-05-11T14:51:25.653

There is a way to get something like what you want.

You'll just need a few more bytes to store the offsets of the fields:

typedef struct {
    uint8_t    slaveAddr;      /*!< The slave address byte */

    ptrdiff_t  modbus_off;
    ptrdiff_t  crc_off;

    uint8_t    data[];   /* this will hold all the magic stuff */
} rtuHead_t;

You now need to decide something yourself: do you want the offset to apply to the struct address (a)? Or maybe to the address of the flexible array (b)? Or maybe to the address of the offset itself (c)? Depending on your answer, the way to access the data will be slightly different.

a)

crc = *(uint16_t *) ((char *) s + s->crc_off);

b)

crc = *(uint16_t *) (s->data + s->crc_off);

c)

crc = *(uint16_t *) ((char *) s + offsetof(rtuHead_t, crc_off) + s->crc_off);

It is up to you to make sure the alignment of the fields is correct, and set the offsets accordingly.

Below is an example program that makes use of this (slightly different, but the idea is the same) to hold several arbitrarily long strings in a single plain structure.

$ cat flexi2.c 
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

struct s {
    int        _;
    ptrdiff_t  off[];
};

int
main(void)
{
    char      *p;
    struct s  *s;

    s = malloc(offsetof(struct s, off) +
               sizeof(ptrdiff_t) * 2 +
               sizeof("foobar") + sizeof("baz"));

    p = (char *) s + offsetof(struct s, off) + sizeof(ptrdiff_t) * 2;

    s->off[0] = p - (char *) s;
    p = stpcpy(p, "foobar") + 1;
    s->off[1] = p - (char *) s;
    p = stpcpy(p, "baz") + 1;

    puts((char *) s + s->off[0]);
    puts((char *) s + s->off[1]);

    free(s);
}

$ gcc-13 -Wall -Wextra -Werror -fanalyzer -O3 flexi2.c 
$ ./a.out 
foobar
baz
