In a C structure longer than 16 bytes, a problem appears at the 16th byte if it belongs to a 2-byte element. Let me explain with code:

#include <stdio.h>

struct test{
    unsigned int a : 16;
    unsigned int b : 16;
    unsigned int c : 16;
    unsigned int d : 16;
    unsigned int e : 16;
    unsigned int f : 16;
    unsigned int g : 16;
    unsigned int h : 8;
    unsigned int i : 16;
};

int main(int argc, char *argv[]){
    //bytes 0, 1, 16, 17 are all equal to 0x31 
    unsigned char *data = "1134567890123451189";
    struct test *testStructure = (struct test*) data;

    printf("%X %X\n", testStructure->a, testStructure->i);
    return 0;
}

Output:

3131 3831

Why is 'i' not equal to 'a'? 'i' skipped byte 16 and used byte 17 and 18 instead. What is going on here?

Brad
  • 7
    The cast is completely wild, it invokes undefined behavior because it violates strict aliasing. – Lundin May 22 '17 at 12:35
  • 1
    bytes 0,1,15 and 16 are '1' (0x31), so not byte 17, which is an '8'. Also, this is not "just" a struct, but a bitfield. – marcolz May 22 '17 at 12:39
  • See https://stackoverflow.com/questions/1490092/c-c-force-bit-field-order-and-alignment/1490135 for details. What you're trying to do is (as far as I know) really not portable. You'll have to look at your compiler's docs to figure out exactly what is happening. – Mat May 22 '17 at 12:45
  • The reason you're seeing a 'skipped' byte is that each field has to be 'aligned' on the proper boundary. So after the 8-bit field there is a 'dummy' alignment byte, so that the final 16-bit field is properly aligned. Also, your hardware architecture is little-endian: each 16-bit field takes its bytes in the order seen in the char array, but the first byte is the least significant one. If you had started the char array `data` with "12", the 'endianness' would be much more obvious (or, if you're not familiar with 'endianness', that much more confusing). – user3629249 May 22 '17 at 21:17

1 Answer

I wouldn't do it this way because:

  1. Bit field packing is implementation-defined. See C99 §6.7.2.1p10: "The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined"
  2. This violates strict aliasing rules.

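What most likely happens in this case is that i is aligned on 4 bytes (the size of int). The fields a through h take 120 bits, so only 8 bits remain in the fourth 32-bit unit. The 16-bit field i does not fit there, and the same paragraph of the standard leaves it implementation-defined whether such a field is put into the next unit or overlaps adjacent units. Your compiler evidently starts it in the next unit, at byte 16.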

You can disable alignment and it should give you a better result:

#include <stdio.h>

#pragma pack(1)
struct test{
    unsigned int a : 16;
    unsigned int b : 16;
    unsigned int c : 16;
    unsigned int d : 16;
    unsigned int e : 16;
    unsigned int f : 16;
    unsigned int g : 16;
    unsigned int h : 8;
    unsigned int i : 16;
};
#pragma pack()

int main(int argc, char *argv[]){
    //bytes 0, 1, 15, 16 are all equal to 0x31 
    unsigned char *data = "1134567890123451189";
    struct test *testStructure = (struct test*) data;

    printf("%X %X\n", testStructure->a, testStructure->i);
    return 0;
}

On clang, x86_64 it prints:

3131 3131

However, this code is still illegal: both the bit-field layout and the pointer cast are outside what the standard guarantees, so it will not necessarily work this way everywhere.

To solve the bit-field issue, try not to use bit-fields (fortunately, in your particular case that is possible). Unfortunately, there is no easy solution to the aliasing problem; most people who rely on type punning simply compile with -fno-strict-aliasing (including the Linux kernel developers). Others jump through hoops using unions, which strictly speaking is still illegal but is a common idiom and is well supported by most compilers:

#include <stdio.h>
#include <stdint.h>

#pragma pack(1)
struct test{
    uint16_t a;
    uint16_t b;
    uint16_t c;
    uint16_t d;
    uint16_t e;
    uint16_t f;
    uint16_t g;
    uint8_t  h;
    uint16_t i;
};
union u{
    struct test t;
    char str[17];
};
#pragma pack()

int main(int argc, char *argv[]){
    //bytes 0, 1, 15, 16 are all equal to 0x31 
    char *data = "1134567890123451189";
    union u *testStructure = (union u*) data;

    printf("%X %X\n", testStructure->t.a, testStructure->t.i);
    return 0;
}
rustyx
  • Nice explanation of why the OP's approach was _not_ correct. How would you explain _proper_ bit field initialization? – ryyker May 22 '17 at 14:02
  • 1
    @ryyker *How would you explain proper bit field initialization?* Does such a thing even exist? Bit fields are so implementation-specific and non-portable I don't think there's any real "proper" way to initialize them in any portable way. And if it's not portable, it doesn't matter if it's proper. – Andrew Henle May 22 '17 at 14:17
  • Thanks, that is a very good explanation. Much appreciated. – Brad May 22 '17 at 14:34