
I have a couple of questions regarding the nitty-gritty of C++ structs/classes.

i) In C++, can struct members with different access modifiers be reordered by the compiler? As per here, the compiler can reorder members with different access. If that is the case, how do we guarantee that members are correctly initialized? For example:

struct S {
private:
    int a;
public:
    int b;
    S() : a(1), b(a) {}
};

If the compiler can rearrange a and b, then b could take an arbitrary value, couldn't it?

ii) Let's consider the struct

struct S {
private:        int a;
public:        int b;
};

If we serialize this struct and read it in another program, could it fail if the ordering of the members is not guaranteed?

iii) In C, we can cast a pointer to a struct to a pointer to its first data member. Could we do the same in C++? Are there any restrictions, and what about reordering in that situation?

There are other posts on SO that discuss class member reordering, but it's still not clear to me whether this is allowed and actually done by compilers.

Thanks

  • (i) Members may be rearranged in the memory layout, but they still must be initialized in the order they are listed in the class definition. The two orders don't have to be the same. – Igor Tandetnik Dec 25 '20 at 15:27
  • (ii) Define "serialize". If you mean just dump the byte representation, and then later copy that byte representation into an instance of `S`, then this would exhibit undefined behavior since `S` is not POD. – Igor Tandetnik Dec 25 '20 at 15:29
  • None of the compilers I use do class member reordering. POD structs need to be layout-compatible with C. From Igor's comment, non-POD struct or class objects could have their layout rearranged. If you want the standard's chapter and verse cited, you may want to add the `language-lawyer` tag. Serializing C++ objects should be done with care, and not with a memory dump, even for POD types. – Eljay Dec 25 '20 at 15:30
  • (iii) Yes, you could do the same in C++ - for a POD struct (which is essentially a structure that doesn't use any C++ features, and could just as well be written in C; "POD" stands for "plain old data"). A struct that has `private` members is not POD. – Igor Tandetnik Dec 25 '20 at 15:31
  • Don't include multiple questions at once. You can read more about it [here](https://meta.stackexchange.com/q/39223/395477) – Ted Klein Bergman Dec 25 '20 at 15:34
  • @IgorTandetnik, just out of curiosity, in programming parlance, does "serialize" mean anything else besides dumping memory into bytes? – user14757101 Dec 30 '20 at 18:00
  • Of course not; or at least, not as simple as `memcpy(buffer, &some_struct, sizeof(some_struct))`. Say `struct Person {std::string first_name, last_name;}` cannot be reasonably serialized this way. If only because `sizeof(Person)` is a compile-time constant, while the person's name could be arbitrarily long. See also: [Serialization](https://en.wikipedia.org/wiki/Serialization) – Igor Tandetnik Dec 30 '20 at 18:32
  • Okay. So serialization requires that rereading the object should reconstruct the object, which may not be possible with memcpy calls. – user14757101 Dec 30 '20 at 18:58

1 Answer

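With respect to your first question: the order of members in memory is only guaranteed for members that are not separated by an access specifier. However, the order in memory does not affect the construction or destruction order: members are always constructed in the order they are declared in the class and destroyed in the reverse order.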
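As a minimal sketch of that point, the struct below mirrors the one from the question; the accessor `get_a` and the helper `b_copied_from_a` are hypothetical additions for the demonstration only. Because `a` is declared before `b`, `a` is initialized first, so `b(a)` reads a well-defined value no matter where the compiler places the two members in memory.

```cpp
#include <cassert>

// Mirrors the struct from the question; get_a() is a hypothetical accessor
// added only so the private member can be inspected.
struct S {
private:
    int a;
public:
    int b;
    // Initialization follows declaration order: a first, then b.
    // The order in the initializer list and the memory layout are irrelevant.
    S() : a(1), b(a) {}
    int get_a() const { return a; }
};

// Hypothetical helper: returns true when b received the value of a.
inline bool b_copied_from_a() {
    S s;
    assert(s.get_a() == 1);  // a was initialized before b read it
    return s.b == s.get_a();
}
```

Had `b` been declared before `a`, `b(a)` would read an uninitialized value; that would be a declaration-order problem, not a layout problem.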

Assuming you actually serialize the data, the memory organization clearly doesn’t matter. Of course, simply taking the bytes in memory for the struct isn’t proper serialization. In particular, there are no guarantees on how values are actually represented, and the size and endianness of built-in types are well-known variations. Potential padding also needs to be taken into account, and, of course, possibly reordered members.

I strongly recommend not just writing the bytes! I have seen more than one large company struggle massively because developers took that approach: any successful software will grow and outlive its immediate context, and these shortcuts become massively harmful! I have seen people who thought it was clever to store important data in databases by memcpy()ing it into a blob (Binary Large OBject): aside from not being accessible to queries, the data is easily lost if anything changes.
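To illustrate what field-by-field serialization could look like instead, here is a rough sketch. Everything in it is an assumption made for illustration: the `Record` type, the 8-byte wire format, the little-endian byte order, and the helper names are not from the question. The idea is that each field is written explicitly with a fixed width and byte order, so padding, member order, and host endianness no longer leak into the stored bytes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record with fixed-width fields.
struct Record {
    std::int32_t a;
    std::int32_t b;
};

// Assumed wire format: a then b, each as 4 little-endian bytes.
constexpr std::size_t kWireSize = 8;
using Wire = std::array<std::uint8_t, kWireSize>;

// Write v as 4 little-endian bytes, independent of host endianness.
inline void put_u32_le(std::uint8_t* out, std::uint32_t v) {
    assert(out != nullptr);
    for (int i = 0; i < 4; ++i)
        out[i] = static_cast<std::uint8_t>(v >> (8 * i));
}

// Read 4 little-endian bytes back into a 32-bit value.
inline std::uint32_t get_u32_le(const std::uint8_t* in) {
    assert(in != nullptr);
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(in[i]) << (8 * i);
    return v;
}

inline Wire serialize(const Record& r) {
    Wire w{};
    put_u32_le(w.data(), static_cast<std::uint32_t>(r.a));
    put_u32_le(w.data() + 4, static_cast<std::uint32_t>(r.b));
    return w;
}

inline Record deserialize(const Wire& w) {
    Record r;
    // Converting back to a signed type is modular since C++20 and behaves
    // the same way on mainstream two's-complement compilers before that.
    r.a = static_cast<std::int32_t>(get_u32_le(w.data()));
    r.b = static_cast<std::int32_t>(get_u32_le(w.data() + 4));
    return r;
}
```

A real format would usually also carry a version tag and handle variable-length data such as strings; this sketch leaves both out.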

For C-like structs or classes, i.e., for standard-layout types (PODs in the past), the same rule applies as in C: the address of the object is the address of its first member. If you use certain C++ constructs (virtual functions, virtual base classes, data members with differing access control, data members split between a base class and a derived class, and possibly others I’m forgetting), this guarantee no longer applies.
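Whether a given type keeps this guarantee can be checked at compile time with `std::is_standard_layout`. The sketch below uses two hypothetical types, `Plain` and `Mixed` (the latter shaped like the struct in the question, with an accessor added for the demo), and a hypothetical helper `first_member` that performs the C-style cast only for the type that qualifies.

```cpp
#include <cassert>
#include <type_traits>

// C-like struct: all data members have the same access control.
struct Plain {
    int first;
    int second;
};

// Shaped like the question's struct: data members with mixed access control.
struct Mixed {
private:
    int a = 0;
public:
    int b = 0;
    int get_a() const { return a; }
};

static_assert(std::is_standard_layout<Plain>::value,
              "Plain is standard-layout");
static_assert(!std::is_standard_layout<Mixed>::value,
              "Mixed is not standard-layout");

// Hypothetical helper: the cast is valid because Plain is standard-layout,
// so the object and its first data member share an address.
inline int* first_member(Plain* p) {
    assert(p != nullptr);
    return reinterpret_cast<int*>(p);
}
```

Putting the `static_assert` next to any code that relies on the cast turns a silent layout assumption into a compile error if the type later gains, say, a virtual function.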

Dietmar Kühl