
The server sends the data as packed structures, so all I need to do to decode it is overlay a structure pointer on the buffer. However, one of the structures contains a dynamically sized array, and I learned that flexible array members are not a standard C++ feature. How can I do this in standard C++, but without copying the data into something like a vector?

// on wire format: | field a | length | length of struct b |
// the structs are defined packed
__pragma(pack(1))
struct B {
    //...
};
struct Msg {
    int32_t a;
    uint32_t length;
    B *data; // how to declare this?
};
__pragma(pack())
char *buf = readIO();
// overlay, without copy and assignments of each field
const Msg *m = reinterpret_cast<const Msg *>(buf);
// access m->data[i] from 0 to length
    
fluter
  • Write accessors. You have so many operators to choose from. Does the data come with padding between members? Is it _guaranteed_ that the padding between members is the same as the C++ compiler chooses to do? Is it guaranteed that `field a` will have __at least__ 16 bits and the same count of bits that C++ compiler chooses for `int`? Same for `size_t`. – KamilCuk Jan 25 '22 at 09:58
  • Does this answer your question? [What is the correct way of interop'ing with C flexible array members from C++?](https://stackoverflow.com/questions/43839009/what-is-the-correct-way-of-interoping-with-c-flexible-array-members-from-c) – phuclv Jan 25 '22 at 09:59
  • For something like this, you will require proper deserialization. Otherwise, there is no way for your program to know the actual size of `data`. – Refugnic Eternium Jan 25 '22 at 10:00
  • @RefugnicEternium by reading the first 2 members you already know the size of data without deserialization – phuclv Jan 25 '22 at 10:02
  • if compiler extensions are allowed then see [this answer](https://stackoverflow.com/a/67894135/995714). – phuclv Jan 25 '22 at 10:03
  • Is the buffer `buf` returned by `readIO` _guaranteed_ to be properly aligned to `alignof(Msg)`? Is the data guaranteed to have the same endianness as your program is using? In short, you can't just simply alias data. – KamilCuk Jan 25 '22 at 10:07
  • Yes, the structures are actually packed; I just updated the code. Endianness is handled, but I skipped it to focus on the variable-length data field problem. I want to avoid copying; that's the main issue. – fluter Jan 25 '22 at 10:18
  • Can't you just write `the_type& operator[](size_t idx) { return *reinterpret_cast<the_type*>(msg + offset); }`? It seems off that you would _want_ to alias the data - write accessors. `int16_t& a() { return *reinterpret_cast<int16_t*>(buf); }` `uint64_t& length() { return *reinterpret_cast<uint64_t*>(buf + sizeof(int16_t)); }` etc.. but still - alignment. And still, data sizes. – KamilCuk Jan 25 '22 at 11:27
  • All the structures are POD, since they match the wire format. I try to avoid adding methods to the structures. – fluter Jan 25 '22 at 11:28
  • sorry all the fields are fixed size data types, just updated my question. – fluter Jan 25 '22 at 11:30

3 Answers

// on wire format: | field a | length | length of struct b |

You can't overlay the struct, because you can't guarantee that the binary representation of Msg will match the on-wire format. Also, int is only guaranteed to be at least 16 bits wide and may be wider, and the size of size_t depends on the architecture.
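
If an overlay is attempted anyway with a packing extension, the layout assumption can at least be made explicit at compile time. The following is a minimal sketch, not part of the original answer: `WireHeader` and `message_size` are hypothetical names, and `#pragma pack` is a compiler extension (accepted by GCC, Clang, and MSVC) rather than standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical header mirroring the question's wire format: | a | length |.
// #pragma pack is a compiler extension, not standard C++.
#pragma pack(push, 1)
struct WireHeader {
    std::int32_t  a;
    std::uint32_t length;
};
#pragma pack(pop)

// Fail the build if the compiler's layout differs from the assumed wire layout.
static_assert(sizeof(WireHeader) == 8, "header must be exactly 8 bytes");
static_assert(offsetof(WireHeader, a) == 0, "a must start the header");
static_assert(offsetof(WireHeader, length) == 4, "length must follow a");

// Number of bytes a complete message occupies, given the on-wire element size.
inline std::size_t message_size(const WireHeader& h, std::size_t elem_size) {
    assert(elem_size > 0);
    return sizeof(WireHeader) + static_cast<std::size_t>(h.length) * elem_size;
}
```

These checks only cover size and offsets; they say nothing about alignment of the buffer or byte order.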

Write actual accessors to the data. Use fixed width integer types. It will only work if the data actually point to a properly aligned region. This method allows you to write assertions and throw exceptions when stuff goes bad (for example, you can throw on out-of-bounds access to the array).

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iostream>

char *readIO(); // provided elsewhere, as in the question

struct Msg {
    constexpr static size_t your_required_alignment = alignof(uint32_t);
    char *buf;
    Msg (char *buf) : buf(buf) {
        // the accessors below are only valid on a suitably aligned buffer
        assert((uintptr_t)buf % your_required_alignment == 0);
    }
    int32_t& get_a() { return *reinterpret_cast<int32_t*>(buf); }
    uint32_t& length() { return *reinterpret_cast<uint32_t *>(buf + sizeof(int32_t)); }
    // view over the elements that follow the two header fields
    struct Barray {
       char *buf;
       Barray(char *buf) : buf(buf) {}
       int16_t &operator[](size_t idx) {
           return *reinterpret_cast<int16_t*>(buf + idx * sizeof(int16_t));
       }
    };
    Barray data() {
        return buf + sizeof(int32_t) + sizeof(uint32_t);
    }
};


int main() {
   Msg msg(readIO());
   std::cout << msg.get_a() << ' ' << msg.length();
   msg.data()[1] = 5;
   // or maybe even implement straight operator[]:
   // msg[1] = 5;
}

If the buffer is not properly aligned, you have to copy the data; there is no way to access it through types other than char.
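
The copying fallback does not have to mean copying the whole message. A minimal sketch, assuming the same | int32 a | uint32 length | int16 elements | layout used in the accessors above and host byte order: each accessor copies only the few bytes of the field it returns with `std::memcpy`, which is valid at any alignment. `load` and `MsgView` are hypothetical names, not something from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Copy sizeof(T) bytes out of the buffer; valid at any alignment.
template <class T>
T load(const char* p) {
    static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");
    T value;
    std::memcpy(&value, p, sizeof(T));
    return value;
}

// Hypothetical read-only view over | int32 a | uint32 length | int16 elements... |.
class MsgView {
public:
    MsgView(const char* buf, std::size_t size) : buf_(buf), size_(size) {
        assert(size_ >= kHeaderSize);  // the header must be fully present
    }
    std::int32_t a() const { return load<std::int32_t>(buf_); }
    std::uint32_t length() const {
        return load<std::uint32_t>(buf_ + sizeof(std::int32_t));
    }
    // Bounds-checked element access; returns a copy of one element.
    std::int16_t at(std::size_t idx) const {
        assert(idx < length());
        assert(kHeaderSize + (idx + 1) * sizeof(std::int16_t) <= size_);
        return load<std::int16_t>(buf_ + kHeaderSize + idx * sizeof(std::int16_t));
    }

private:
    static constexpr std::size_t kHeaderSize =
        sizeof(std::int32_t) + sizeof(std::uint32_t);
    const char* buf_;
    std::size_t size_;
};
```

Usage would look like `MsgView v(buf, bytes_received); v.at(0);`. Compilers typically turn a fixed-size `memcpy` like this into a plain load, so the cost is usually small, but that is an optimization and not a guarantee.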

KamilCuk

A standard solution is to not represent the array as a member of the message, but rather as a separate object.

struct Msg {
    int a;
    size_t length;
};

const Msg& m = *reinterpret_cast<const Msg*>(buf);
std::span<const B> data = {
    reinterpret_cast<const B*>(buf + sizeof(Msg)),
    m.length,
};

Note that reinterpretation / copying of bytes is not portable between systems with different representations (byte endianness, integer sizes, alignments, subobject packing etc.), and same representation is typically not something that can be assumed in network communication.
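
For the byte-order part of that caveat, one common approach is to assemble each integer from individual bytes, so the result does not depend on the host's endianness. This is a minimal sketch with hypothetical helper names; the thread does not say which byte order the server uses, so both variants are shown and the choice is an assumption to check against the protocol.

```cpp
#include <cstdint>

// Assemble a 32-bit value from four bytes sent least-significant byte first.
// The result is the same on little-endian and big-endian hosts.
inline std::uint32_t read_u32_le(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0])
         | static_cast<std::uint32_t>(p[1]) << 8
         | static_cast<std::uint32_t>(p[2]) << 16
         | static_cast<std::uint32_t>(p[3]) << 24;
}

// Same idea for data sent most-significant byte first (network byte order).
inline std::uint32_t read_u32_be(const unsigned char* p) {
    return static_cast<std::uint32_t>(p[0]) << 24
         | static_cast<std::uint32_t>(p[1]) << 16
         | static_cast<std::uint32_t>(p[2]) << 8
         | static_cast<std::uint32_t>(p[3]);
}
```

Reading through `unsigned char` also sidesteps the alignment question, since byte access is valid at any address.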

eerorika

The common way to do this in C was to declare data as an array of length one as the last struct member, then allocate the space needed as if the array were larger. This seems to work fine in practice with C++ compilers as well, although indexing past the declared bound of one is formally undefined behavior. You should perhaps wrap access to the data in a span or equivalent, so the implementation details don't leak outside your class.

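#include <cstddef>
#include <span>

struct B {
    float x;
    float y;
};
struct Msg {
    int a;
    std::size_t length;
    B data[1];  // placeholder for the `length` elements that follow
};

char* readIO()
{
    constexpr int numData = 3;
    // room for the header, the one declared element, and numData - 1 more;
    // the trailing () zero-initializes the bytes
    char* out = new char[sizeof(Msg) + sizeof(B) * (numData - 1)]();
    // stand-in for real I/O: record how many elements follow
    reinterpret_cast<Msg*>(out)->length = numData;
    return out;
}
int main(){
    char *buf = readIO();
    // overlay, without copy and assignments of each field
    const Msg *m = reinterpret_cast<const Msg *>(buf);
    // access m->data[i] from 0 to length
    std::span<const B> data(m->data, m->length);
    for(auto& b: data)
    {
        // do something
        (void)b;
    }

    delete[] buf;
    return 0;
}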

https://godbolt.org/z/EoMbeE8or

ClockworkV