
I know that structs contain padding (example from this post):

 struct A   // 8 bytes
 {
    char c;
    char d;
    // 2 bytes of padding here
    int i;
 };
 struct B   // 12 bytes
 {
    char c;
    // 3 bytes of padding here
    int i;
    char d;
    // 3 bytes of padding here
 };

Now, I don't understand the following example:

 #include <stdio.h>

 typedef struct {  // shouldn't it be 12 bytes?
    int a;
    char *str;
 } TestS;

 TestS s;

 int main(int argc, char *argv[]) {

    // sizeof yields a size_t, so the matching format specifier is %zu
    printf("An int is %zu bytes\n", sizeof( int ));      // 4
    printf("A Char * is %zu bytes\n", sizeof( char * )); // 8
    printf("A double is %zu bytes\n", sizeof( double )); // 8

    printf("A struct is %zu bytes\n", sizeof s);         // why 16?

    return 0;

 }

At first I thought the size might be rounded up to a multiple of 8 bytes (I am on 64-bit Ubuntu), so I tried more structs.

  typedef struct {
   int i;
   char *str;
  } stru_12;


  typedef struct {
    int i;
    char *str;
    char c;
  } stru_13;

 typedef struct {
    int i;
    char str[7];
 } stru_11;

 typedef struct {
   char *str;
   double d;
 } stru_16;

  stru_12 test12;
  stru_13 test13;
  stru_11 test11;
  stru_16 test16;

int main (int argc, char *argv[]) {
    printf("A test12 is %zu bytes, address is %p\n", sizeof test12, (void *)&test12);
    printf("A test13 is %zu bytes, address is %p\n", sizeof test13, (void *)&test13);
    printf("A test11 is %zu bytes, address is %p\n", sizeof test11, (void *)&test11);
    printf("A test16 is %zu bytes, address is %p\n", sizeof test16, (void *)&test16);
    return 0;
}

Result:

A test12 is 16 bytes, address is 0x601060

A test13 is 24 bytes, address is 0x601090

A test11 is 12 bytes, address is 0x601080

A test16 is 16 bytes, address is 0x601070

Sorry for the long post.

My questions are:

  • Why is test12 (int + char *) 16 bytes and test13 (int + char * + char) 24 bytes? (It seems that a multiple of 8 is favored, yet 12 bytes is allowed for test11.)

  • Why do the addresses of the structs differ by 16 bytes (more padding?)?

For reference:

cache_alignment : 64

address sizes : 36 bits physical, 48 bits virtual

Ubuntu 14.04.1 LTS x86_64

Tony
  • This code is very noisy. Could you remove all the typedefs and the variables, and instead use `sizeof(struct stru_12)` etc? Less visual clutter. – Kerrek SB Jul 31 '14 at 08:37
  • Thanks for advice but I need the address. Any solution? – Tony Jul 31 '14 at 08:38
  • 24 == 8*3. 8 is the alignment unit, not 12. Difference between addresses is meaningless unless they are addresses of elements of the same array. – n. m. could be an AI Jul 31 '14 at 08:44
  • @Tony, Generally, each member should be aligned appropriately, and so should the whole struct object, because it can be used in an array. – Eric Z Jul 31 '14 at 08:44
  • @Tony: The address is relatively meaningless, so I would just not bother with it, but if you like to keep it as it is, that's fine. It's your call. (And your question, of course!) – Kerrek SB Jul 31 '14 at 09:13

2 Answers


The answer to the second question is implementation-defined (and in reality, so is the first, but I'll show you why you're getting the layout you're getting regardless). Your platform is apparently 64-bit, so your data pointers are 64 bits (8 bytes) wide as well. With that in mind, let's look at the structures.


stru_12

typedef struct 
{
   int i;
   char *str;
} stru_12;

This is aligned so str always falls on an 8-byte boundary, including in a contiguous sequence (an array). To do that, 4 bytes of padding are introduced between i and str.

0x0000 i    - length=4
0x0004 pad  - length=4
0x0008 ptr  - length=8
======================
Total               16

An array of these will always have ptr on an 8-byte boundary provided the array itself starts on one (which it will). Because the padding between i and str also brought the structure size to a multiple of 8, no additional padding is required beyond this.


stru_13

Now, consider how that is also achieved with this:

typedef struct 
{
    int i;
    char *str;
    char c;
} stru_13;

The same padding will apply between i and str to once again place str on an 8-byte boundary, but the addition of c complicates things. To accomplish the goal of pointers always residing on 8-byte boundaries (including in a sequence/array of these structures) the structure needs tail padding, but how much? The overall structure size needs to be a multiple of 8, so that in an array every element, and therefore every embedded pointer (which sits at an offset that is itself a multiple of 8), is properly aligned. In this case, seven bytes of tail padding are added to bring the size to 24 bytes:

0x0000 i    - length=4
0x0004 pad  - length=4
0x0008 ptr  - length=8
0x0010 c    - length=1
0x0011 pad  - length=7
======================
Total               24

stru_13 (part deux)

So try this. What do you think the same fields we had before, but ordered differently, will result in?

typedef struct 
{
    char *str;
    int i;
    char c;
} stru_13;

Well, we know we want str on an 8-byte boundary and i on a 4-byte boundary, and frankly couldn't care less about c (always a bridesmaid):

0x0000 ptr  - length=8
0x0008 i    - length=4
0x000c c    - length=1
0x000d pad  - length=3
======================
Total               16

Run that through your test program and you'll see it pans out as we have above. It reduces to 16 bytes. All we did was change the order to a space-friendlier layout that still supported our requirements, and we reduced the default representation by 8 bytes (one third of the original structure with the prior layout). To say that is an important thing to take away from all this would be an understatement.

WhozCraig
  • This is a nice answer, but what's with the 12 byte test11 case? – martin Jul 31 '14 at 08:58
  • @martin `stru_11` has no pointers or doubles so the 8-byte boundary goes out the door, but ideal access to address `i` (a 32-bit `int`) should put it on a 4-byte boundary, and again, maintain that in a sequence. To accomplish that a single extra byte of padding is added to the structure tail. The result is a 12-byte length, and `i` always falls on a 4-byte boundary (assuming, of course, it starts on one, which it will). Play around with swapping `long` and `short` in for `i`'s type and see what happens. – WhozCraig Jul 31 '14 at 09:01
  • @martin it is also worth noting things may change considerably when you order your members from *largest* to *smallest* in your structure. That, especially, is worth playing with. – WhozCraig Jul 31 '14 at 09:06
  • So it must make sure that every member of a struct is aligned appropriately in an array of structs, right? – Tony Jul 31 '14 at 09:29
  • @Tony the point is you don't *have* to. The compiler is doing it for you. In most cases you're fine as-is. If you have the need to squeeze more items per page to better utilize the prefetcher and cache lines, take the time to make well-founded adjustments. If you're writing fresh code, *ideally* make it a habit to order things so they have a decent compact representation, but don't let stressing over premature optimization ruin an otherwise perfectly good work-day. Knuth will slap you all the way from Stanford if you do. – WhozCraig Jul 31 '14 at 09:33
  • Thank you for your answer and comments. One unrelated question: Does Knuth say something about this in TAOCP? (Such an interesting scene :) – Tony Jul 31 '14 at 09:45
  • If you mean the quote concerning premature optimization being the root of all evil, no. It was from a paper he wrote in 1974, ["Structured Programming With GOTO Statements"](http://cs.sjsu.edu/~mak/CS185C/KnuthStructuredProgrammingGoTo.pdf), page 268 (how's *that* for irony). – WhozCraig Jul 31 '14 at 09:55
  • Thanks a lot for your link and patience. – Tony Jul 31 '14 at 10:05

Pointers must be correctly aligned for the CPU to use them.

In C and C++, structures must also work as array elements, so the end of a structure is padded accordingly.

struct A
{
    char a;
    // 7 bytes of padding
    char *p;
    char b;
    // 7 bytes of padding
};

struct A array[3];  // the tail padding is what makes this work

In such a structure, p must be aligned so the processor can read the pointer without generating an error. (32-bit Intel processors can be set up not to fault on unaligned data, but relying on that is not a good idea: it is slower and it would often hide errors that are real bugs. 64-bit processors have more restrictions in that area.)

So since you are on 64-bit, the pointer is 8 bytes and its offset within the structure must be a multiple of 8.

Similarly, the total size of the structure must be a multiple of the strictest alignment requirement among its members. Here that is 8, so the end is padded up to the next multiple of 8.

There are really only 2 cases where you should worry about that though: (1) creating a structure to be saved in a file and (2) creating a structure that you will allocate in very large numbers. In all other cases, don't worry about it.

Alexis Wilke