Struct member alignment -- different sizeof using 16-bit and 32-bit compiler

Question

I have a structure used to contruct messages to a control board I need to maintain software compatibility between a C167 16-bit Keil compiler and a 32-bit Tricore gcc compiler.

typedef struct
{
    unsigned char new_weld_status[2];
    UINT32 new_weld_count;
    UINT16 new_weld_fail_count;
} NEW_PULSE_DATA;

The array new_weld_status[2] takes up 2 bytes on the 16-bit compiler but 4 bytes on the 32-bit compiler. I was thinking of replacing all the new_weld_status[2] with a union when compiling with gcc. But is there a switch I can use for gcc that makes the chars fits/aligns in 2 bytes?

Thanks

How do you check the size of `new_weld_status`? Do you perhaps check the size of the whole structure? Then this is a duplicate of http://stackoverflow.com/questions/119123/why-isnt-sizeof-for-a-struct-equal-to-the-sum-of-sizeof-of-each-member — Some programmer dude, May 16 '18 at 21:57
@Someprogrammerdude If I had to guess (which I guess I have to since we don't have [MCVE] ), I'd say OP is taking the sizeof a pointer, perhaps decayed from an array. — Christian Gibbons, May 16 '18 at 22:03
Check out [The Lost Art of C Structure Packing](http://www.catb.org/esr/structure-packing/) for details on why this happens and how to avoid it. — dbush, May 17 '18 at 02:58
No the array takes up 2 bytes on both systems. But on the 32 bit system you get padding inserted after the array. The solution is to not write code that depends on a certain memory layout of the struct. It is not obvious to me why this code would not be portable, unless you do stupid things with it like `memcpy(&the_struct, something, 8)`, in which case the solution is to correct the code that does stupid things. — Lundin, May 17 '18 at 08:28
@Lundin: Code that depends on a certain memory layout can be perfectly fine -- not only that, it may very well be the optimal solution in some cases. About `memcpy`, please note that it is legal to call it like that (and the easiest way to avoid strict aliasing issues). — Acorn, May 17 '18 at 12:18
@Acorn Using structs to achieve a certain memory layout is non-portable. — Lundin, May 17 '18 at 12:34
@Lundin: OP has not asked for portable code -- quite the opposite. — Acorn, May 17 '18 at 12:44

ChuckCottrill · Answer 1 · 2018-05-16T23:27:43.393

Note that your structure layout creates the problem on a 32-bit system. Many (most) 32-bit CPU architectures require 4-byte alignment for 32-bit words, thus the new_weld_count requires 'padding' to provide proper memory alignment.

typedef struct
{
    unsigned char   new_weld_status[2];   //a
    //char padding_1[2]; //hidden padding
    UINT32  new_weld_count;               //a
    UINT16  new_weld_fail_count;          //a
} NEW_PULSE_DATA;

The following redefinition of your structure completely avoids the problem.

typedef struct
{
    UINT32  new_weld_count;               //a
    UINT16  new_weld_fail_count;          //a
    unsigned char   new_weld_status[2];   //a
} NEW_PULSE_DATA;
NEW_PULSE_DATA ex_PULSE_DATA;

However, the above approach is not the approach typically to transport struct(ured) data across networks/over message transports. A more common and much better approach is to use a serialization/deserialization layer (aka marshalling) to place the structures into 'over the wire' formats. Your current approach is conflating the in-memory storage and addressing with the communication format.

//you need to decide on the size of wire format data,
//Both ends of the protocol must agree on these sizes,
#define new_weld_count_SZ sizeof(ex_PULSE_DATA.new_weld_count)
#define new_weld_fail_count_SZ sizeof(ex_PULSE_DATA.new_weld_fail_count)
#define new_weld_status_SZ sizeof(ex_PULSE_DATA.new_weld_status)

//Then you define a network/message format
typedef struct
{
    byte new_weld_count[new_weld_count_SZ];
    byte new_weld_fail_count[new_weld_count_SZ];
    byte new_weld_status[new_weld_count_SZ];
} MESSAGE_FORMAT_PULSE_DATA;

Then you would implement serialization & deserialization functions on both ends of the transport. The following example is simplistic, but conveys the gist of what you need.

byte*
PULSE_DATA_serialize( MESSAGE_FORMAT_PULSE_DATA* msg, NEW_PULSE_DATA* data )
{
    memcpy(&(msg->new_weld_count), data->new_weld_count, new_weld_count_SZ);
    memcpy(&(msg->new_weld_fail_count), data->new_weld_fail_count, new_weld_fail_count_SZ);
    memcpy(&(msg->new_weld_status), data->new_weld_status, new_weld_status_SZ);
    return msg;
}

NEW_PULSE_DATA*
PULSE_DATA_deserialize( NEW_PULSE_DATA* data, MESSAGE_FORMAT_PULSE_DATA* msg )
{
    memcpy(data->new_weld_count, &(msg->new_weld_count), new_weld_count_SZ);
    memcpy(data->new_weld_fail_count, &(msg->new_weld_fail_count), new_weld_fail_count_SZ);
    memcpy(data->new_weld_status, &(msg->new_weld_status), new_weld_status_SZ);
    return msg;
}

Note that I have omitted the obligatory network byte order conversions, because I assume your have already worked out your byte order issues between the two cpu domains. If you have not considered byte-order (big-endian vs. little-endian), then you need to address that issue as well.

When you send a message, the sender does the following,

//you need this declared & assigned somewhere
NEW_PULSE_DATA data;
//You need space for your message
MESSAGE_FORMAT_PULSE_DATA msg;
result = send(PULSE_DATA_deserialize( &data, &msg ));

When you receive a message, the recipient does the following,

//recipient needs this declared somewhere
NEW_PULSE_DATA data;
//Need buffer to store received data
MESSAGE_FORMAT_PULSE_DATA msg;
result = receive(&msg,sizeof(msg));
//appropriate receipt checking here...
PULSE_DATA_deserialize( &data, &msg );

I can not change either method the receiving WinCE machine expects structure in order so all the 16bit machines would need new code but I am going to rewrite all the code and switch to Win embedded arm so will keep this post thanks — PinGNU, May 17 '18 at 08:12
While these are good points to know about, the question is not about reordering the struct or marshalling. — Acorn, May 17 '18 at 09:28
@PinGNU I guess that reveiving machine is a third one in the scenario (compared to the C167 and the TriCore mentioned in the question). You don't need to do anything with that: the binary format used on the communications protocol is there, settled. You only need to get the C167 & the TriCore to adhere to it by appropriate code which generates the right binary stream on both. To solve this problem, serializing & deserializing is a proper solution (on the C167 / TriCore end). — Jubatian, May 17 '18 at 11:16
@Acorn No the question is about "the array takes up 4 bytes" which is first of all incorrect, and second the padding should not matter. So the actual problem is the OP doing something icky with the struct, such as sending the whole struct as a data protocol, which comments indicate. And which is always non-portable bad practice. The correct solution is to serialize the struct. — Lundin, May 17 '18 at 11:43
@Lundin: No, sorry, the question is literally: "*is there a switch I can use for gcc that makes the chars fits/aligns in 2 bytes?*". Portability, endianness, unaligned accesses and other issues are related and important for many projects -- but do not answer the question at hand. On top of that, OP might very well be in a very tight environment where it is needed to minimize code size and/or cycles; and therefore extra handling may not be optimal. — Acorn, May 17 '18 at 11:56
@Acorn: There is no necessity serialisation uses a single instruction more than such bad code. In the worst case, provide a set of serialisation functions hand-optimised. But that would be the very last step if there **truely** is a performance/space issue for a particular platform. Until then, leave optimisations to the compiler and write clean, maintainable code. — too honest for this site, May 17 '18 at 16:28
@Olaf: Again, all that is orthogonal to the question. Also, please don't put words in my mouth. I never said nor implied that OP should not write maintainable code; nor that OP should optimize prematurely the code. — Acorn, May 17 '18 at 17:19
@Olaf: Excuse me? Please quote where I have state anything like that. About the rest: do you think we should avoid answering questions properly because someone *might* misunderstand something? If you believe SO (or SE) is meant for answers to beginners exclusively, I am sorry, but you are deeply mistaken. — Acorn, May 17 '18 at 18:52

score 4 · Answer 2 · answered May 16 '18 at 22:32

Union wouldn't change the members alignment inside a struct. You are interested in padding. The compiler may insert any number of bytes/bits between struct members to satisfy alignment requiremens. On gcc compatible compilers you may use __attribute__((__packed__)) as Acorn already pointed out, but this does not take care of endianess. The most compatible version between platforms (including platforms with different alignment and different endianess) would be to use (sadly!) get/set functions that look like this:

typedef struct {
    unsigned char data[2+4+2];
} NEW_PULSE_DATA;

unsigned char NEW_PULSE_DATA_get_new_weld_status(NEW_PULSE_DATA *t, size_t idx) {
    return t->data[idx];
}

void NEW_PULSE_DATA_set_new_weld_status(NEW_PULSE_DATA *t, size_t idx, unsigned char value) {
    t[idx] = value;
}

UINT32 NEW_PULSE_DATA_get_new_weld_count(NEW_PULSE_DATA *t) {
    return (UINT32)t->data[2]<<24
        | (UINT32)t->data[3]<<16 
        | (UINT32)t->data[4]<<8 
        | (UINT32)t->data[5];
}

void NEW_PULSE_DATA_set_new_weld_count(NEW_PULSE_DATA *t, UINT32 val) {
    t->data[2] = val>>24;
    t->data[3] = val>>16;
    t->data[4] = val>>8;
    t->data[5] = val;
}

UINT16 NEW_PULSE_DATA_get_new_weld_fail_count(NEW_PULSE_DATA *t) {
    return (UINT16)t->data[6]<<8
        | (UINT16)t->data[7];
}

void NEW_PULSE_DATA_set_new_weld_fail_count(NEW_PULSE_DATA *t, UINT16 val) {
    t->data[6] = val>>8;
    t->data[7] = val;
}

This is the only "good" way of being 100% sure, that NEW_PULSE_DATA looks exactly the same on different platforms (at least on platforms with the same number of bits per char/CHAR_BIT value). However sizeof(NEW_PULSE_DATA) may be still different between platforms, because compiler may insert padding on the end of the struct (after the last member of the structure). So you may want to change NEW_PULSE_DATA type to be just an array of bytes:

typedef unsigned char NEW_PULSE_DATA[2+4+2];

unsigned char NEW_PULSE_DATA_get_new_weld_status(NEW_PULSE_DATA t, size_t idx) {
    return t[idx];
}

unsigned char NEW_PULSE_DATA_set_new_weld_status(NEW_PULSE_DATA t, size_t idx, unsigned char value) {
    t[idx] = value;
}

UINT32 NEW_PULSE_DATA_get_new_weld_count(NEW_PULSE_DATA t) {
    return (UINT32)t[2]<<24
        | (UINT32)t[3]<<16 
        | (UINT32)t[4]<<8 
        | (UINT32)t[5];
}

void NEW_PULSE_DATA_set_new_weld_count(NEW_PULSE_DATA t, UINT32 val) {
    t[2] = val>>24;
    t[3] = val>>16;
    t[4] = val>>8;
    t[5] = val;
}

UINT16 NEW_PULSE_DATA_get_new_weld_fail_count(NEW_PULSE_DATA t) {
    return (UINT16)t[6]<<8
        | (UINT16)t[7];
}

void NEW_PULSE_DATA_set_new_weld_fail_count(NEW_PULSE_DATA t, UINT16 val) 
{
    t[6] = val>>8;
    t[7] = val;
}

Just that: never use home-brew fixed width types. The standard provides `stdint.h` for good reasons - use it. — too honest for this site, May 17 '18 at 16:29

score 1 · Answer 3 · answered May 16 '18 at 22:13

1

For gcc and other compilers, you can use __attribute__((packed)):

This attribute, attached to struct or union type definition, specifies that each member (other than zero-width bit-fields) of the structure or union is placed to minimize the memory required.

answered May 16 '18 at 22:13

Acorn

24,970
5
40
69

4

`__attribute__((packed))` is [a good way to wind up with code that doesn't run](https://www.google.com/search?q=attribute(packed)+sigbus) on machines that actually have alignment restrictions on data. Not every machine out there is x86-based, and given the machines here are likely 16- and 32-bit microcontrollers, I'd venture that `__attribute__(packed)` is more likely than not to cause problems. – Andrew Henle May 16 '18 at 23:06
I am getting 'packed' attribute ignored [-Wattributes] I am assuming with HiTec tricore gcc this is not available the 32 and 16 bit are both infinion endian is the same. – PinGNU May 17 '18 at 07:52
@AndrewHenle: I disagree. The specific question was about making the compiler pack the struct into the minimum amount of memory possible, which is the point of this attribute, not about portability or general good practices. *Precisely* because this is a question on embedded platforms, you are likely to be able to assume specific endianness/alignment and many other details, in order to save cycles, drain less battery, code size, etc. – Acorn May 17 '18 at 09:13
@PinGNU: What is the version of the compiler? At least since [2.95.3 `packed` and `aligned` are supported](https://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_4.html#SEC91). – Acorn May 17 '18 at 09:19
1

This does not answer the question. Why are you on about gcc? The OP is not using gcc but Keil. – Lundin May 17 '18 at 11:45
@Lundin: I am confused -- what do you mean? OP asked about gcc: "is there a switch I can use for **gcc** that makes the chars fits/aligns in 2 bytes?" – Acorn May 17 '18 at 11:46
1

Yeah and how exactly is `__attribute__` from gcc portable to Keil? – Lundin May 17 '18 at 12:32
@Lundin: Why does it matter if it is portable? I still don't follow your argument. OP is only using Keil for the 16-bit arch, not the 32-bit one; and is requesting explicitly a solution for gcc to make the struct match the layout of the compiler for the 16-bit arch. If he/she needs a single codebase, he/she can simply `#ifdef` a `#define PACKED ...` which is only used when compiling for the 32-bit arch. – Acorn May 17 '18 at 12:45
1

Packing should work on all platforms supporteing it. Nevertheless it can result in catastrophic performance, because not aligned members have to be accessed by multiple instructions which assemble/split the parts for each access to the data. The correct way is to use proper marshalling at a single point and otherwise use the natural alignment (i.e. avoid packing). Packing also does not solve endianess and encoding problems. – too honest for this site May 17 '18 at 16:33
1

@Olaf: Indeed, it may or may not result in bad performance; and the loss of performance may or may not matter for a particular system. We simply don't know enough to decide. Anyway, the question was about how to force the compiler to pack a struct, nothing else. Endianness and encoding have nothing to do with this, either. – Acorn May 17 '18 at 17:30
1

@Acorn: The recommended practice is not to use premat4eure optimisations. And the question was an XY problem. If the approach used is generally bad, it is very well the correct way to point that out, unless OPpoints out the correct way does not work. And yes, endianess and encoding are very well relevant. It is a generally bad protocol design to rely on internal details. And it almost always backfires in professional projects. – too honest for this site May 17 '18 at 18:22
@Olaf: As for endianness, encoding, etc. the question only states "*used to construct messages to a control board*" -- in my opinion, that is nowhere near enough detail to know the domain of the problem. This might very well be an XY problem, but you are assuming things in order to reply to me, instead of asking OP about them. – Acorn May 17 '18 at 18:41
@Olaf: As for "*bad protocol design to rely on internal details*": protocol specifications don't rely on "internal details", typically they state how they work, independent of any given architecture. What you probably mean is that **portable code** implementing a protocol shouldn't rely on architecture details as much as possible -- which is a good principle, but again, not that relevant for this question, IMO. – Acorn May 17 '18 at 18:42
@Acorn: No, I exactly meant a protocol should not rely on details of its implementation. That#s good practice by every reasonable and responsible protocol designer. Portable code is something different and on bare-metal often not completely achievable at the low level. What was it about "putting words" somewhere? Well, you have the last word(s) – too honest for this site May 17 '18 at 18:44
@Olaf: I am sorry, but you are not making any sense. By definition, a protocol cannot depend on details of "its implementation" because any given implementation (there may be several) is written according to the protocol specification. Whether an implementation of a protocol is portable or not is orthogonal to this, and precisely because of that, I am saying this discussion is off-topic. – Acorn May 17 '18 at 18:59

Struct member alignment -- different sizeof using 16-bit and 32-bit compiler

3 Answers3