Confused about data address alignment

Question

I have a question about the answer provided by @dan04 in "What is aligned memory allocation?"

In particular, if I have something like this:

int main(){
      int num;  // 4byte
      char s;   // 1byte
      int *ptr;


}

On a 32-bit machine, would the compiler still pad this data by default?

The previous question asked about structs; I am asking about variables declared in main.

update:

a = 2 bytes 
b = 4 bytes
c = 1 byte
d = 1 byte



 0 1 2 3 4 5 6 7
|a|a|b|b|b|b|c|d|  bytes
|       |       |  words

Why do you think these variables are stored anywhere at all? These are automatic variables and they may be optimised away, or only live in registers. Moreover, if ever stored in memory, the compiler can re-order them if it deems this useful etc. — Walter, Nov 19 '16 at 10:00

Support Ukraine · Accepted Answer · 2016-11-20T08:02:30.583

There are no rules for this. It depends on the implementation you are using, and it may also change with compiler options. The best you can do is print the address of each variable; then you can see what the memory layout looks like.

Something like this:

#include <stdio.h>

int main(void)
{
  int num; 
  char s;   
  int *ptr;

  printf("num: %p - size %zu\n", (void*)&num, sizeof num);
  printf("s  : %p - size %zu\n", (void*)&s, sizeof s);
  printf("ptr: %p - size %zu\n", (void*)&ptr, sizeof ptr);

  return 0;
}

Possible output:

num: 0x7ffee97fce84 - size 4
s  : 0x7ffee97fce83 - size 1
ptr: 0x7ffee97fce88 - size 8

Also note that if you never take the address (&) of a variable, the compiler may optimize your code so that the variable is never put into memory at all.

In general, alignment is chosen to get the best performance out of the hardware platform in use. That typically implies that variables are aligned to their size, or at least 4 byte aligned for variables with a size greater than 4.

Update:

OP gives a specific layout example in the update and asks if that layout can/will ever happen.

Again the answer is: it is implementation dependent.

So in principle it could happen on some specific system. That said, I doubt that it will happen on any mainstream system.

Here is another code example, compiled with gcc -O3:

#include <stdio.h>

int main(void)
{
  short s1;
  int i1;
  char c1;
  int i2;
  char c2;


  printf("s1: %p - size %zu\n", (void*)&s1, sizeof s1);
  printf("i1: %p - size %zu\n", (void*)&i1, sizeof i1);
  printf("c1: %p - size %zu\n", (void*)&c1, sizeof c1);
  printf("i2: %p - size %zu\n", (void*)&i2, sizeof i2);
  printf("c2: %p - size %zu\n", (void*)&c2, sizeof c2);

  return 0;
}

Output from my system:

s1: 0x7ffd222fc146 - size 2   <-- 2 byte aligned
i1: 0x7ffd222fc148 - size 4   <-- 4 byte aligned
c1: 0x7ffd222fc144 - size 1
i2: 0x7ffd222fc14c - size 4   <-- 4 byte aligned
c2: 0x7ffd222fc145 - size 1

Notice how the locations in memory differ from the order in which the variables were defined in the code. The reordering lets every variable be well aligned without any padding between them.

Sorting by address:

c1: 0x7ffd222fc144 - size 1
c2: 0x7ffd222fc145 - size 1
s1: 0x7ffd222fc146 - size 2   <-- 2 byte aligned
i1: 0x7ffd222fc148 - size 4   <-- 4 byte aligned
i2: 0x7ffd222fc14c - size 4   <-- 4 byte aligned

So again to answer the update-question:

On most systems I doubt you'll see a 4 byte variable being placed at address xxx2, xxx6, xxxa or xxxe. But still, systems may exist where that could happen.

I mean, why in the link, people say boundary would be 4 bytes? — Anni_housie, Nov 19 '16 at 08:31
@Anni_housie Well, it is much due to hardware architecture. For instance the cache is typically organized in lines with 2^N bytes. For performance it would be bad if a 4 byte variable had two bytes in one cache line and the next two bytes in the next cache line. Therefore we typically want 4 byte variables to be 4 byte aligned so that the variable can be held in a single cache line. This is just one example - there are more. But again - it is implementation dependent — Support Ukraine, Nov 19 '16 at 08:46
In many 32-bit architectures processors like to fetch 32 bits at a time. If the data item crosses a 4 byte (32-bit) boundary, the processor will have to make two fetches, which slows down a program. By keeping data aligned to 4 bytes, the processor only has to make one fetch. Doesn't really have anything to do with cache lines, since most cache lines are greater than 4 bytes. — Thomas Matthews, Nov 19 '16 at 08:51
@ThomasMatthews - Your example is another good example of using the HW in the optimal way. But you are wrong when you say that cache isn't a concern. If a 4 byte variable were placed in memory so that 1 byte mapped into one cache line and the next 3 bytes mapped into the next cache line, the processor would have 2 cache misses when reading the variable (if it isn't in cache already, of course) — Support Ukraine, Nov 19 '16 at 08:56
@4386427, I have 1 more question about what you mean by "That typically imply that variables are aligned to their size or at least 4 byte ". So if variables have different size, do we still use a constant length word-boundary (say 4 byte)? And what do you mean by "variables are aligned to their size" in this case? Do you mean that we could have a length of word-boundary that is not fixed? — Anni_housie, Nov 19 '16 at 19:19
@Anni_housie If a variable is 2 bytes, it will typically be 2 byte aligned. If a variable is 4 bytes, it will typically be 4 byte aligned. If a variable is 8 bytes, it will typically be 8 byte aligned on 64 bit systems and 4 byte aligned on 32 bit systems. So alignment depends on the size of the variable. But - just to repeat it - this is very implementation dependent, and it also depends on compiler options. The whole idea is to improve performance - sometimes at the cost of memory. — Support Ukraine, Nov 19 '16 at 19:52
can you please see my update, will we ever encounter something like the table, if we have mix type? — Anni_housie, Nov 20 '16 at 00:59

score 1 · Answer 2 · answered Nov 19 '16 at 08:57

It's quite hard to predict exactly, but there's certainly some padding going on. Take these two programs, for example (I ran them on Coliru, a 64-bit machine):

#include <iostream>
using namespace std;

//#pragma pack(push,1)
int main(){
    int num1(5);  // 4 bytes
    int num2(3);  // 4 bytes
    char c1[2];   // 2 bytes
    c1[0]='a';
    c1[1]='a';
    cout << &num1 << " " << &num2 << " "  << endl;
    cout << sizeof(c1) << " " << &c1 << endl;
}
//#pragma pack(pop)


#include <iostream>
using namespace std;

//#pragma pack(push,1)
int main(){
    int num1(5);  // 4 bytes
    int num2(3);  // 4 bytes
    char c1[1];   // 1 byte
    c1[0]='a';
    cout << &num1 << " " << &num2 << " "  << endl;
    cout << sizeof(c1) << " " << &c1 << endl;
}
//#pragma pack(pop)

The first program outputs:

0x7fff3e1f9de8 0x7fff3e1f9dec 
2 0x7fff3e1f9de0

While the second program outputs:

0x7fffdca72538 0x7fffdca7253c 
1 0x7fffdca72537

You can see that padding is added in the first program. Looking at the addresses, the layouts are:

First program:  CHAR | CHAR | 6-BYTE PADDING | INT | INT
Second program: CHAR | INT | INT

So for the basic question: yes, there is probably padding by default. I also tried to use pragma pack to avoid the padding, but in contrast to the struct case I didn't manage to remove it; the outputs were exactly the same with and without the pragma.
