
I am a little confused about how bytes are ordered in a struct.

Let's say I have the following struct:

struct container {
    int myint;
    short myshort;
    long mylong;
};

Now, I want to initialize a variable of type struct container just like the following, except that I want to do it using an array.

struct container container1 = {.myint = 0x12345678,
                               .myshort = 0xABCD,
                               .mylong = 0x12345678};

Assume sizeof int and long are 4, and that of short is 2.

Assume there is no padding.

What would the layout of the 10 bytes of the struct then be?

Does it depend on the endianness?

Would it be like:

0x12345678 ABCD 12345678

or like:

0x78563412 CDAB 78563412

What I want to do is: I have the following char array:

char buffer[10] = {0};

I want to manually fill this array with data and then memcpy to the struct.

Should I be doing[1]:

buffer[0] = 0x12345678 & 0xFF;
buffer[1] = 0x12345678 >> 8 & 0xFF;
buffer[2] = 0x12345678 >> 16 & 0xFF;
buffer[3] = 0x12345678 >> 24 & 0xFF;
...
buffer[9] = 0x12345678 >> 24 & 0xFF;

or should it be[2]:

buffer[0] = 0x12345678 >> 24 & 0xFF;
buffer[1] = 0x12345678 >> 16 & 0xFF;
buffer[2] = 0x12345678 >> 8 & 0xFF;
buffer[3] = 0x12345678 & 0xFF;
...
buffer[9] = 0x12345678 & 0xFF;

before I do my memcpy like:

memcpy(&container1, buffer, sizeof(container1));

And if I write to an array and copy it to the struct, is that portable across systems, especially with regard to endianness?

EDIT: Does [1] work on a little endian machine and [2] on a big endian?

Arjun Sreedharan
  • No, it's not portable. Yes, it depends on endianness. And the assumptions about padding and the sizeof the types will also lead to portability issues. – user3386109 Nov 02 '15 at 06:02
  • To emphasize *no portable way*, that means no portable way from compiler to compiler on the same OS, much less portable from OS to OS. – David C. Rankin Nov 02 '15 at 06:16
  • Of course it depends on endianness! After your "there is no padding" assumption the question no longer has anything to do with struct types. It is simply about representing integers in memory. – AnT stands with Russia Nov 02 '15 at 06:19
  • Is it even guaranteed that `myint` and `myshort` is placed before `mylong`? It looks like it might be more efficient to order them `mylong`, `myint`, `myshort` due to alignment issues - it would be a pity if the implementation wasn't allowed to do this optimization. – skyking Nov 02 '15 at 09:31

2 Answers


Does it depend on the endianness?

Yes, it depends on the endianness of the machine, so your byte-filling logic has to change with the endianness of the machine.
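To make the dependence concrete, here is a minimal sketch (not a definitive test) that fills four bytes in both orders from the question and checks which one survives a memcpy into a 32-bit integer on the machine it runs on. The function names are hypothetical, and uint32_t stands in for the question's 4-byte int.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: bytes are 8 bits wide. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Scheme [1] from the question: least significant byte first. */
void fill_low_byte_first(unsigned char out[4], uint32_t v)
{
    out[0] = v & 0xFF;
    out[1] = (v >> 8) & 0xFF;
    out[2] = (v >> 16) & 0xFF;
    out[3] = (v >> 24) & 0xFF;
}

/* Scheme [2] from the question: most significant byte first. */
void fill_high_byte_first(unsigned char out[4], uint32_t v)
{
    out[0] = (v >> 24) & 0xFF;
    out[1] = (v >> 16) & 0xFF;
    out[2] = (v >> 8) & 0xFF;
    out[3] = v & 0xFF;
}

/* Returns 1 if scheme [1] reproduces v after memcpy on this machine,
   2 if scheme [2] does, and 0 if neither does (an unusual byte order).
   Pick a v whose bytes are all different, or both schemes may match. */
int scheme_matching_host(uint32_t v)
{
    unsigned char bytes[4];
    uint32_t copy;

    fill_low_byte_first(bytes, v);
    memcpy(&copy, bytes, sizeof copy);
    if (copy == v)
        return 1;

    fill_high_byte_first(bytes, v);
    memcpy(&copy, bytes, sizeof copy);
    if (copy == v)
        return 2;

    return 0;
}
```

Calling scheme_matching_host(0x12345678u) should give 1 on a little-endian machine and 2 on a big-endian one, which is the answer to the EDIT in the question.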

There is no portable way* to do it because of structure padding. Individual compilers do provide their own ways to disable struct padding, but those are compiler-specific. See "Force C++ struct to not be byte aligned".

  • You can add a static_assert (requires C11 support) to make sure that your code doesn't compile unless your struct is tightly packed. You won't have portable code, but you can still be sure that if your code compiles, the layout assumption holds.

    static_assert(sizeof(struct container) == sizeof(int) + sizeof(short) + sizeof(long),
                  "struct container must not contain padding");
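If the goal is only to get the bytes of the buffer into the members, one way to sidestep both padding and host endianness is to assign each member from shifted bytes instead of memcpy-ing the whole struct. The following is a sketch under stated assumptions: the 10-byte buffer is defined to hold each value least significant byte first, and read_u32_le, read_u16_le and container_from_bytes are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct container {
    int myint;
    short myshort;
    long mylong;
};

/* Read a 32-bit value stored least significant byte first. */
static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Read a 16-bit value stored least significant byte first. */
static uint16_t read_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Fill the struct member by member. Nothing here depends on padding or on
   the byte order of the machine, only on the byte order chosen for buf.
   Values that do not fit the signed member types convert in an
   implementation-defined way. */
void container_from_bytes(struct container *c, const unsigned char buf[10])
{
    assert(c != NULL && buf != NULL);
    c->myint   = (int)read_u32_le(buf);        /* bytes 0..3 */
    c->myshort = (short)read_u16_le(buf + 4);  /* bytes 4..5 */
    c->mylong  = (long)read_u32_le(buf + 6);   /* bytes 6..9 */
}
```

The cost is a few shifts per member; in exchange the same buffer contents produce the same member values on any compiler and machine.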
    
bashrc
  • The ways that different compilers provide of disabling structure padding are not portable — each compiler has its own, unless it is emulating another for compatibility. You might need to note that `static_assert` requires a C11 compiler (or a non-standard extension in compilers conforming to C99 or C90). – Jonathan Leffler Nov 02 '15 at 06:22
  • @JonathanLeffler I have mentioned that in my answer as well as the answers on the question that I have linked mention that the solution OP seeks cannot be portable. Edit to emphasize it even further. – bashrc Nov 02 '15 at 06:26
  • I have mixed views on linking to a C++-only question in a C-only question. It is at least a warning flag. – Jonathan Leffler Nov 02 '15 at 06:33
  • @JonathanLeffler There is a way to do `static_assert` with a macro in C. See the last part of my answer. It's boilerplate I've used for years. I've seen other variants that use something else than **switch**. IMO, one more piece of C++ er, _stuff_ that doesn't need to be an _intrinsic_ – Craig Estey Nov 02 '15 at 09:00
  • @CraigEstey With C11 you dont need to give your own version. http://en.cppreference.com/w/c/error/static_assert – bashrc Nov 02 '15 at 09:05
  • @CraigEstey: yes, there are ways to do it in older versions of C, with some consequential restrictions. C11 provides it as standard (or, strictly, provides `_Static_assert` and requires `<assert.h>` for the macro `static_assert`). – Jonathan Leffler Nov 02 '15 at 09:06
  • @bashrc Well, you sort of do. Because your code may have to run on an old compiler. That is, you work for a company that has a client, and the client insists on using RHEL 5 and you're not permitted to use/install a newer compiler (e.g. Mentor Graphics actually has that problem with their clients. Their clients are chip companies like Intel and the software controls a fab line. They are very conservative because a single day of downtime due to the smallest bug costs $100,000,000). – Craig Estey Nov 02 '15 at 09:34

There is another problem, that of element alignment within your struct.

In practice, your struct will usually have gaps for alignment. On typical implementations the real layout is as if you had written:

struct container {
    int myint;
    short myshort;
    char __pad__[2];  // padding to align mylong to 4/8 byte boundary
    long mylong;
};
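Rather than assuming where the gaps are, you can ask the compiler. A minimal sketch using offsetof and sizeof; the two function names are hypothetical, and the numbers they return depend on the compiler and target:

```c
#include <assert.h>
#include <stddef.h>

struct container {
    int myint;
    short myshort;
    long mylong;
};

/* The C standard guarantees there is no padding before the first member. */
static_assert(offsetof(struct container, myint) == 0,
              "first member starts at offset 0");

/* Bytes of padding between the end of myshort and the start of mylong. */
size_t padding_before_mylong(void)
{
    return offsetof(struct container, mylong)
         - (offsetof(struct container, myshort) + sizeof(short));
}

/* All padding in the struct, including any trailing padding. */
size_t total_padding(void)
{
    return sizeof(struct container)
         - (sizeof(int) + sizeof(short) + sizeof(long));
}
```

If total_padding() returns 0 on your target, the "no padding" assumption from the question happens to hold there; otherwise a raw memcpy from a 10-byte buffer will not line up with the members.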

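What about using a union? Size the byte view with sizeof so that it also covers any padding:

union container_bytes {
    struct container vals;
    unsigned char buf[sizeof(struct container)];
};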

But, why do you want to do this? For almost any scenario I can think of, there is probably a cleaner way to get the effect.

When you say array, do you mean you'd like to init an array of your structs? This can be done:

struct container conts[3] = {
    { .myint = 1, .myshort = 2, .mylong = 3 },
    { .myint = 4, .myshort = 5, .mylong = 6 },
    { .myint = 7, .myshort = 8, .mylong = 9 }
};

BTW, there is a way to do static_assert in C even without C11 support (it can only be used inside a function body):

// compile time assertion -- if the assertion fails, we will get two case 0:'s
// which will cause the compiler to flag this
#define static_assert(_assert) \
    do { \
        switch (0) { \
        case (_assert): \
            break; \
        case 0: \
            break; \
        } \
    } while (0)
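A usage sketch for the macro above, with two assumptions of mine: it is renamed to the hypothetical COMPILE_TIME_CHECK so that it cannot clash with the static_assert macro that C11's <assert.h> defines, and it is called from inside a function because it expands to a statement.

```c
#include <assert.h>  /* only for the run-time assert shown for contrast */
#include <stddef.h>

struct container {
    int myint;
    short myshort;
    long mylong;
};

/* Same switch trick as above: a false condition yields two "case 0:" labels,
   which the compiler must reject. */
#define COMPILE_TIME_CHECK(cond) \
    do { \
        switch (0) { \
        case (cond): \
            break; \
        case 0: \
            break; \
        } \
    } while (0)

/* Hypothetical accessor that shows both kinds of check side by side. */
int container_myint(const struct container *c)
{
    /* Checked by the compiler. This condition is always true; changing
       ">=" to "<" would make the build fail with a duplicate case label. */
    COMPILE_TIME_CHECK(sizeof(struct container)
                       >= sizeof(int) + sizeof(short) + sizeof(long));

    /* Checked only when the program runs. */
    assert(c != NULL);
    return c->myint;
}
```

To enforce the "no padding" assumption with it, replace ">=" with "==" in the condition.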
Craig Estey