In an old program I serialized a data structure to bytes, by allocating an array of unsigned char, and then converted ints by:
*((*int)p) = value;
(where p
is the unsigned char*
, and value
is the value to be stored).
This worked fine, except when compiled on Sparc where it triggered exceptions due to accessing memory with improper alignment. Which made perfect sense because the data elements had varying sizes so p
quickly became unaligned, and triggered the error when used to store an int value, where the underlying Sparc instructions require alignment.
This was quickly fixed (by writing out the value to the char-array byte-by-byte). But I'm a bit concerned about this because I've used this construction in many programs over the years without issue. But clearly I'm violating some C rule (strict aliasing?) and whereas this case was easily discovered, maybe the violations can cause other types of undefined behavior that is more subtle due to optimizing compilers etc. I'm also a bit puzzled because I believe I've seen constructions like this in lot of C code over the years. I'm thinking of hardware drivers that describe the data-structure exchanged by the hardware as structs (using pack(1) of course), and writing those to h/w registers etc. So it seems to be a common technique.
So my question is, is exactly what rule was violated by the above, and what would be the proper C way to realize the use-case (i.e. serializing data to an array of unsigned char). Of course custom serialization functions can be written for all functions to write it out byte-by-byte but it sounds cumbersome and not very efficient.
Finally, can ill effects (outside of alignment problems etc.) in general be expected through violation of this aliasing rule?