
Consider the two structs below:

struct A {
    double x[3];
    double y[3];
    int z[3];

    struct A *a;
    int b;
    struct A *c;
    unsigned d[10];
};

struct B {
    double x[3];
    double y[3];
    int z[3];
};

Notice that struct B is a strict subset of struct A. Now, I want to copy the members .x, .y and .z from an instance of struct A to an instance of struct B. My question is: according to the standards, is it valid to do:

struct A s_a = ...;
struct B s_b;
memcpy(&s_b, &s_a, sizeof s_b);

I.e. is it guaranteed that the paddings for the members, in their sequence of appearance, will be the same, so that I can "partially" memcpy struct A to struct B?

lvella
  • Not at all. I'm sure someone will find the standards, but padding is implementation dependent unless there are very specific flags in the code. – Jashaszun Jul 29 '15 at 21:13
  • as mentioned, padding would be in the way and implementation dependent. Have you tried on your system? Computers are fun because nothing breaks when you try things out – Pynchia Jul 29 '15 at 21:13
  • I think it may be required that `B` be a strict **prefix** of `A`, not just a subset. – Barmar Jul 29 '15 at 21:14
  • @Pynchia That will just tell you if it works in a specific case, not if it generalizes. – Barmar Jul 29 '15 at 21:14
  • @Jashaszun While padding is implementation-dependent, I think the implementation has to be consistent about it. – Barmar Jul 29 '15 at 21:15
  • You can definitely do this if you replace the first fields of `A` with an actual instance of `B`. I'm not sure about separate fields. – Quentin Jul 29 '15 at 21:20

3 Answers


It is not guaranteed that struct A's layout starts off the same as struct B's layout.

However, if and only if they were both members of a union:

union X
{
    struct A a; 
    struct B b;
};

then it is guaranteed that the common initial sequence has the same layout.

I've never heard of any compiler that would lay out a struct differently if it detected that the struct were a member of a union, so in practice you should be safe!
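As a minimal sketch of the union route (an illustration, not code from the question): `copy_prefix` is a hypothetical helper name, and the `static_assert` lines only confirm that this particular implementation places `x`, `y` and `z` at matching offsets. The copy itself goes through `union X`, so it relies on the common initial sequence rule rather than on the two layouts happening to agree.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

struct A {
    double x[3];
    double y[3];
    int z[3];

    struct A *a;
    int b;
    struct A *c;
    unsigned d[10];
};

struct B {
    double x[3];
    double y[3];
    int z[3];
};

/* With this union visible, x, y and z form the common initial sequence. */
union X {
    struct A a;
    struct B b;
};

/* Sanity checks for the implementation at hand; compilation fails if they do not hold. */
static_assert(offsetof(struct A, x) == offsetof(struct B, x), "x offsets differ");
static_assert(offsetof(struct A, y) == offsetof(struct B, y), "y offsets differ");
static_assert(offsetof(struct A, z) == offsetof(struct B, z), "z offsets differ");

/* Hypothetical helper: copy the shared leading members of *src into *dst. */
static void copy_prefix(struct B *dst, const struct A *src)
{
    union X u;

    u.a = *src;                              /* store through the A member   */
    memcpy(dst->x, u.b.x, sizeof dst->x);    /* inspect the common initial   */
    memcpy(dst->y, u.b.y, sizeof dst->y);    /* sequence through the B       */
    memcpy(dst->z, u.b.z, sizeof dst->z);    /* member, one array at a time  */
}
```

Calling `copy_prefix(&s_b, &s_a)` costs one extra copy of `struct A` compared with the direct `memcpy`, which is the price of staying inside what the union rule promises.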

M.M
  • It would be the same if `struct B` is just the first field of `struct A`. Either way, you would need another field name, and for this it _is_ guaranteed (and not just common practice). – too honest for this site Jul 29 '15 at 22:12
  • @Olaf I don't know what you are trying to say – M.M Jul 29 '15 at 22:12
  • `union X x; x.a.x[0];` would require selecting the correct struct `.a`. `struct A { struct B b; ...} a; a.b.x[0]` ditto. – too honest for this site Jul 29 '15 at 22:15
  • @Olaf if `struct A` is defined that way, then it does not have a member `x[0]` – M.M Jul 29 '15 at 22:15
  • Right: it does have a member `b` with member `x`. The other way round, you have the wrapping union `x`. Neither way looks very elegant to me (no criticism) – too honest for this site Jul 29 '15 at 22:18
  • I still don't know what you are trying to say, `x.a.x[0]` is an error if `struct A` is like you suggested – M.M Jul 29 '15 at 22:18
  • It's like _you_ suggested, actually. See my first code fragment, where I define the union. But never mind. Sorry, I'm very tired and writing English is getting complicated for me now. I think I should just stop here and try to relax (can't sleep either right now). I'll see if I get downvotes and might delete my answer then. Sometimes you win, sometimes you lose. – too honest for this site Jul 29 '15 at 22:24
  • Well, it may be stronger than `in practice you should be safe`: imagine that I have the two `structs` defined in a header file, and use them independently in some compilation unit. Then, in another unrelated compilation unit (maybe a dynamically loaded shared object), I define a function that contains such a union. The only way the compiler has to guarantee the overlapping of fields inside some potential union is to use padding consistently, right? – lvella Jul 29 '15 at 22:29
  • The compiler could (in theory) analyze the whole program, see that there is no union used anywhere, and then decide on its struct layouts. But in practice that would be a nightmare for coders so I would expect no compiler to do it. – M.M Jul 29 '15 at 22:31
  • @Olaf, that is how you suggested. The way I asked, both `a` and `b` would have independent `x`, `y` and `z` members... – lvella Jul 29 '15 at 22:31
  • I would consider it implausible that the authors of the Standard intended that compiler authors would even think about trying to identify structures that weren't actually used as parts of unions but did have their addresses taken, and exempt such structures from the layout guarantees applicable to structures that are used within unions. Most of the usefulness of the Common Initial Sequence rule stems not from how things behave when they are members of unions, but rather from what it implies about the behavior of structures identified by pointers. – supercat Sep 12 '16 at 16:00
  • The C Standard was never designed to be subjected to modern "language-lawyering", and the authors acknowledge that it does not describe all the requirements for a useful implementation. They expected that if the easiest way to meet the Standard's requirements would behave usefully, they shouldn't need to forbid compiler writers from doing otherwise (since they wouldn't do so anyway). – supercat Sep 12 '16 at 16:17
  • @supercat if you take out the "within a union" clause, then there would be layout guarantees for all structures -- which is exactly the sort of thing standards writers avoid, because it may constrain future implementations that they cannot yet imagine. – M.M Sep 12 '16 at 22:31
  • @M.M: I fail to see the problem. If it becomes apparent that being able to rearrange structures would facilitate optimizations, compiler writers could either add a directive to indicate that a programmer doesn't care about the layout of a particular structure, or allow a command-line switch to select a non-conforming mode. The amount of complexity required to implement such switches or directives would be far less than the complexity required to ensure compliance with the current rules (BTW, the gcc 6.2 at godbolt generates bogus code in some cases where storage is used... – supercat Sep 12 '16 at 23:09
  • ...to hold different types at different times, even though all reads match the types with which data was written; allowing programmers to explicitly specify when optimizations should or should not be performed would reduce the likelihood of such bugs). Fundamentally, if a program runs acceptably fast, having a new compiler run the program the same way unless/until a programmer identifies what directives can be safely added and adds them, would seem much safer than having a new compiler apply new optimizations by default. I also think that while the authors of the Standard may have wanted... – supercat Sep 12 '16 at 23:15
  • ...to allow for implementations which have a compelling reason to use unusual structure layouts (e.g. data compatibility with some existing language on the same platform that uses such a layout), I can't imagine anyone in 1989 seriously entertaining the idea that programmers targeting only commonplace implementations should have to jump through hoops to get the semantics that could be achieved easily in either Dennis Ritchie's 1974 C implementation or almost any of the platforms that were commonplace in 1989. – supercat Sep 12 '16 at 23:21

How about using struct B as an anonymous struct member of struct A? Note that this requires -fms-extensions for gcc (there should be a similar extension for VC, as the name implies):

struct B {
    double x[3];
    double y[3];
    int z[3];
};

struct A {
    struct B;

    struct A *a;
    int b;
    struct A *c;
    unsigned d[10];
};

This allows the fields of struct B to be accessed directly through struct A, like:

struct A as;

as.x[2] = as.y[0];

etc. This guarantees identical layout (the standard allows no padding at the beginning of a struct, so the inner struct is guaranteed to start at the same address as the outer one), and a pointer to struct A can be converted to a pointer to struct B.

Also:

struct A as;
struct B bs;
memcpy(&as, &bs, sizeof(bs));
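For comparison, here is a sketch of the variant that needs no extension, along the lines of the comment under the question about replacing the first fields of `A` with an actual instance of `B`. The member name `head` and the helper `extract_head` are hypothetical, chosen only for this illustration; the trade-off is that the fields are spelled `s_a.head.x` instead of `s_a.x`.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

struct B {
    double x[3];
    double y[3];
    int z[3];
};

struct A {
    struct B head;    /* hypothetical member name; must stay the first member */

    struct A *a;
    int b;
    struct A *c;
    unsigned d[10];
};

/* No padding is allowed before the first member, so head sits at offset 0. */
static_assert(offsetof(struct A, head) == 0, "head must be the first member");

/* Hypothetical helper: return a copy of the B embedded at the start of *src. */
static struct B extract_head(const struct A *src)
{
    return src->head;   /* plain struct assignment; no memcpy or size arithmetic */
}
```

Because the leading part of `struct A` literally is a `struct B` here, the question of matching padding does not arise, and `s_b = s_a.head;` does the whole copy.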
too honest for this site
  • `(struct B)as = bs;` is illegal, there are no conversions of struct types – M.M Jul 29 '15 at 21:54
  • Your code doesn't contain any *anonymous struct member*. They are a struct with no tag, however `B` is a struct tag. – M.M Jul 29 '15 at 21:57
  • @MattMcNabb: Sorry, edited. Not sure about strict aliasing now, however. At least `memcpy` should be safe. – too honest for this site Jul 29 '15 at 21:58
  • Doesn't work, gcc complains: `warning: declaration does not declare anything struct B;`, and then: `error: ‘struct A’ has no member named ‘x’`. Clang says something similar. – lvella Jul 29 '15 at 22:01
  • In fact `struct B;` is illegal in ISO C. It's a Microsoft extension. In gcc for windows it is enabled by default (presumably for compatibility with windows headers), to get conforming behaviour use `-fno-ms-extensions` – M.M Jul 29 '15 at 22:03
  • @MattMcNabb: edited. should work now. Sorry, tough day. – too honest for this site Jul 29 '15 at 22:10
  • @MattMcNabb: did you compile with `-fms-extensions`? I tried and it works. – too honest for this site Jul 29 '15 at 22:16
  • @Olaf, oh, I see now that you are suggesting to use extensions – M.M Jul 29 '15 at 22:17
  • @MattMcNabb: I could make the second sentence bold, if you think that would be helpful ;-). I actually do not understand why the committee did not make that standard. IMO this would have been a logical step; without it, anonymous structs are not of much use. But **with** it they are very powerful. – too honest for this site Jul 29 '15 at 22:19

I do not think the Standard would prohibit an implementation from including so much more padding in s_a than s_b that the former is actually larger even though its members are a subset of s_b's. Such behavior would be very weird, and I can't think of any reason why a compiler would do such a thing, but I don't think it would be prohibited.

If the number of bytes copied is the lesser of sizeof s_a and sizeof s_b, then the memcpy operation will be guaranteed to copy all of the common fields, but would not necessarily leave the later fields of s_b undisturbed. On a typical machine, if the declarations had been:

struct A { uint32_t x; char y; };
struct B { uint32_t x; char y,p; uint16_t q; };

the first structure would contain five bytes of data and three bytes of padding, while the second would contain eight bytes of data with no padding. Using memcpy as shown in your code would copy the padding from s_a over the data in s_b.

If you need to copy the initial structure members while leaving the balance of the structure undisturbed, you should add the offset and size of the last member of interest, and use that sum as the number of bytes to copy. In the example I give above, the offset of y would be 4 and its size would be 1, so the memcpy would copy 5 bytes and thus ignore the parts of the structure that are used as padding in A but might hold data in B.
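A rough sketch of that computation, using the two example structs from this answer. `PREFIX_BYTES` and `copy_common` are hypothetical names introduced for the illustration, and the `static_assert` checks are an added assumption: they make the build fail on an implementation where the shared members do not sit at the same offsets, since nothing guarantees that outside a union.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>   /* uint32_t, uint16_t */
#include <string.h>   /* memcpy */

struct A { uint32_t x; char y; };
struct B { uint32_t x; char y, p; uint16_t q; };

/* Offset of the last member of interest plus its size (sizeof does not evaluate its operand). */
#define PREFIX_BYTES (offsetof(struct A, y) + sizeof(((struct A *)0)->y))

/* Refuse to compile if the shared members are laid out differently. */
static_assert(offsetof(struct A, x) == offsetof(struct B, x), "x offsets differ");
static_assert(offsetof(struct A, y) == offsetof(struct B, y), "y offsets differ");

/* Hypothetical helper: copy only x and y, leaving p and q in *dst untouched. */
static void copy_common(struct B *dst, const struct A *src)
{
    memcpy(dst, src, PREFIX_BYTES);
}
```

Copying `PREFIX_BYTES` rather than `sizeof(struct A)` is what keeps the trailing padding of `A` from landing on `p` and `q`.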

supercat