0

I found code where bitfield is used for network messages. I would like to know what casting bitfield_struct data = *(bitfield_struct *)&tmp; exaclty does and how it's syntax work. Won't it violate the strict aliasing rule? Here is part of code:

typedef struct  
{
    unsigned      var1    : 1;
    unsigned      var2    : 13;
    unsigned      var3    : 8;
    unsigned      var4    : 10;
    unsigned      var5    : 7;
    unsigned      var6    : 12;
    unsigned      var7    : 7;
    unsigned      var8    : 6;

} bitfield_struct;

void print_data(u_int64_t * raw, FILE * f, int no_object)
{
    uint64_t tmp = ntohll(*raw);

    bitfield_struct data = *(bitfield_struct *)&tmp;

    ...
}
Lundin
  • 195,001
  • 40
  • 254
  • 396
Rafael
  • 25
  • 5
  • 2
    The problem is that even without strict aliasing, bit-fields make for highly non-portable code. The endianness, packing, byte order, everything about these fields depend on the specific compiler implementation. But yeah, it probably violates strict aliasing as well. Can you give more details such as the hardware platform and compiler? – th33lf Dec 13 '19 at 11:55
  • My compiler is VSCode (GCC) on Ubuntu 18.04. Compiled program works well but my problem is that i don't understand syntax behind this particular casting. What value is assigned to data? Is it a pointer? What it points? – Rafael Dec 13 '19 at 12:10
  • @th33lf *it probably violates strict aliasing as well* There's no "probably" about it - the posted code violates [strict aliasing](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule). – Andrew Henle Dec 13 '19 at 13:19

3 Answers3

1

Won't it violate the strict aliasing rule?

Yes it will, so the code invokes undefined behavior. It is also highly non-portable:

  • We don't know the size of the abstract item called "addressable storage unit" that the given system uses. It isn't necessarily 64 bits, so there could in theory be padding and other nasty things hidden in the bit-field. 64 bit unsigned is fishy.

  • Neither do we know if the bit-field uses the same bit-order as uint64_t. Nor can we know if they use the same endianess.

If individual bit (fields) of the uint64_t need to be accessed, I would recommend doing so using bitwise shifts, as that makes the code fully portable even between different endianess architectures. Then you don't need the non-portable ntohll call either.

Lundin
  • 195,001
  • 40
  • 254
  • 396
0

What it does (or attempts to do) is quite straightforward.

uint64_t tmp = ntohll(*raw);

This line takes the value in pointer raw, reverses the byte-order and copies it into temp.

bitfield_struct data = *(bitfield_struct *)&tmp;

This line reinterprets the data in temp (which was a uint64) as type bitfield_struct and copies it into data. This is basically the equivalent of doing:

/* Create a bitfield_struct pointer that points to tmp */
bitfield_struct *p = (bitfield_struct *)&tmp;

/* Copy the value in tmp to data */
bitfield_struct data = *p;

This because normally bitfield_struct and uint64 are incompatible types and you cannot assign one to the other with just bitfield_struct data = tmp;

The code presumably continues to access fields within the bitfield through data, such as data.var1.

Now, like people pointed out, there are several issues which makes this code unreliable and non-portable.

  1. Bit-fields are heavily implementation-dependent. Solution? Read the manual and figure out how your specific compiler variant treats bit-fields. Or don't use bitfields at all.

  2. There is no guarantee that a uint64_t and bitfield_struct have the same alignment. Which means there could be padding which can completely offset your expectations and make you end up with wrong data. One solution is to use memcpy to copy instead of pointers, which might let you this particular issue. Or specify packed alignment using the mechanism provided by your compiler.

  3. The code invokes UB when strict aliasing rules are applied. Solution? Most compilers will have a no-strict-aliasing flag that can be enabled, at a performance cost. Or even better, create a union type with bitfield_struct and uint64_t and use this to reinterpret between one and the other. This is allowed even with the strict-aliasing rules. Using memcpy is also legal, since it treats the data as an array of chars.

However, the best thing to do is not use this piece of code at all. As you may have noticed, it relies too much on compiler and platform specific stuff. Instead, try to accomplish the same thing using bit masks and shifts. This gets rid of all three problems mentioned above, without needing special compiler flags or having to face any real question of portability. Most importantly, it saves other developers reading your code, from having to worry about such things in the future.

th33lf
  • 2,177
  • 11
  • 15
-2

Right to left:

&tmp Take address of tmp

(bitfield_struct *)&tmp Address of tmp is address to data of type bitfield_struct

*(bitfield_struct *)&tmp Extract value out of tmp, assuming it's bitfield_struct data

bitfield_struct data = *(bitfield_struct *)&tmp; Store tmp to data, assuming that tmp is bitfield_struct

So it's just copy using extra pointers to avoid compilation errors/warnings of incompatible types.

What you may not understand is bit-addressing of structure.

unsigned var1 : 1; unsigned var2 : 13;

Here you will find some more info about it: https://www.tutorialspoint.com/cprogramming/c_bit_fields.htm

Mazi
  • 182
  • 8
  • 1
    @Rafael: Unfortunately, it is not helpful—it is wrong and misleading. While the code **appears** to treat the `uint64_t` as a `bitfield_struct` and to fetch bit fields from it, based on a naîve understanding of casts and pointer dereferencing, the behavior is in fact not defined by the C standard and is therefore not reliably. You **should not use** code like this if it is intended to be portable. Instead, the contents of a `uint64_t` may be copied into a `bitfield_struct` using `memcpy`. That provides a portable way of reinterpreting the bytes of a `uint64_t` to be a `bitfield_struct`. – Eric Postpischil Dec 13 '19 at 14:07
  • An example of correct reinterpretation would be `uint64_t tmp = ntohll(*raw); bitfield_struct data; memcpy(&data, &tmp, sizeof data);` (assuming the sizes involved in `ntohll`, `uint64_t`, and `bitfield_struct` match). In fact, although some C implementations may extend the C standard by defining such aliasing to work, the availability of a simple portable method means that the portable method should be used, and there is no point in relying on compiler extensions. – Eric Postpischil Dec 13 '19 at 14:11
  • @EricPostpischil Yeah I understand implications of this solution. Bitfields aren't best approach for portable solutions at all, so can you suggest another method with similar properties or where can i learn those?. I found other way of declaring specific bit size `const uint8_t BIT_BYTE = 0x1; const uint8_t BIT_HW = 0x2;` etc. but I presume that it won't change anything that you mentioned. – Rafael Dec 13 '19 at 14:40
  • 1
    @Rafael: My criticism of the answer is not about the lack of portability of bit fields. That is a separate issue. It is about the fact that the C standard does not support reinterpreting types via aliasing through converted pointers. – Eric Postpischil Dec 13 '19 at 18:29
  • @Rafael: To work with bits in a portable way, use the bitwise and shift operators on the bytes (in `unsigned char` or, once ordered as desired, in a `uint64_t` or other suitable unsigned integer type). – Eric Postpischil Dec 13 '19 at 18:31