I have been trying to understand how the compiler aligns stack variables on 64/32-bit machines.

Have a look at the code below:

void Test()
{
    int x = 1;
    int y = 2;
    int z = 3;
}

I found that:

  • &x > &y > &z is true.
  • &x - &y and &y - &z are both 3.

This means that each variable is padded by 8 additional bytes. For a 32-bit application on a 64-bit machine I expected that the variables would not need any padding, i.e., the difference should have been 1, not 3.

Can someone explain this? Thanks!

Environment details:

64bit Windows 7, Visual Studio 2010, Application Configuration: x86

JKC
  • It is a pure implementation detail. In this case you see the side-effects of the [/RTC compile option](https://msdn.microsoft.com/en-us/library/8wtf2dfz.aspx). – Hans Passant May 11 '17 at 22:42
  • Note that the memory locations are purely implementation dependent, based on the compiler. If you do &y - &x the compiler will not do a plain byte subtraction. It knows that they are int pointers, so the result is counted in ints. You would likely get different answers if you cast the addresses to `void *`, and then did the subtraction. – bruceg May 11 '17 at 22:45
  • I don't think pointer arithmetic is even possible for void*. You should reinterpret the pointer as unsigned int first, then subtract. – Liran Funaro May 11 '17 at 22:52
  • In MSVC, as I make various edits and tests, the order of `x`, `y` and `z` and the differences between their addresses move around. Comparing pointers to unconnected objects is *undefined behaviour*. – Weather Vane May 11 '17 at 22:53
  • @LiranFunaro yes, arithmetic on `void *` is a gcc extension, so it is better to cast to a `byte *` or something similar to get the number of bytes between the variables. But it's still an implementation-specific detail. Also, depending on what you do, the compiler may optimize the variable location off the stack and into a register. – bruceg May 11 '17 at 23:10
  • @bruceg reading an object address will force the compiler to assign an address to it. So no need to worry about the register thing. – Liran Funaro May 11 '17 at 23:14
  • @LiranFunaro you can't rely on two adjacently declared variables being adjacent in memory, because one of them may be optimized out. – bruceg May 12 '17 at 00:15
  • The layout of local variables on the stack (even without optimization enabled) is arbitrary - see this SO post for a story about how one MSVC compiler version changed the layout of some locals simply because the *name* of a variable changed: http://stackoverflow.com/a/4577565/12711 – Michael Burr May 12 '17 at 01:39

1 Answer

As @HansPassant and others noted, the layout of variables on the stack is an implementation detail. Once optimization is enabled, you can't even be sure that a given variable will be on the stack at all (unless you take its address): each can be optimized into a register.

Just to illustrate: the test program

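#include <stdio.h>
#include <windows.h>  /* only needed for BYTE */

int main() {
    int x = 1, y = 2, z = 3;
    /* Subtracting pointers to unrelated objects is formally undefined;
       it is done here only to probe the layout this compiler chose.
       int* differences count ints, BYTE* differences count bytes. */
    printf("%d,%d,%d,%d\n",
           (int)(&y - &x), (int)(&z - &y),
           (int)((BYTE*)&y - (BYTE*)&x), (int)((BYTE*)&z - (BYTE*)&y));
    return 0;
}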

prints here (win7 x64, vs2012 x86 compiler):

>cl /nologo t.c >nul && t.exe
2,-1,8,-4

>cl /Ox /nologo t.c >nul && t.exe
-2,1,-8,4

>cl /RTC1 /Ox /nologo t.c >nul && t.exe
-3,-3,-12,-12

In the last case, /RTC inserts gaps between variables to detect out-of-bounds access.


Regarding the default alignment, compilers follow Intel's recommendations, which exist because x86 incurs a penalty when accessing unaligned data. The relevant text is in the Intel® 64 and IA-32 Architectures Optimization Reference Manual, section 3.6.7, Stack Alignment:

Performance penalty of unaligned access to the stack happens when a memory reference splits a cache line. This means that one out of eight spatially consecutive unaligned quadword accesses is always penalized, similarly for one out of 4 consecutive, non-aligned double-quadword accesses, etc.

Aligning the stack may be beneficial any time there are data objects that exceed the default stack alignment of the system. For example, on 32/64bit Linux, and 64bit Windows, the default stack alignment is 16 bytes, while 32bit Windows is 4 bytes.

Assembly/Compiler Coding Rule 55. (H impact, M generality) Make sure that the stack is aligned at the largest multi-byte granular data type boundary matching the register width.

For longer types (floating-point), even higher alignments are recommended for the same purpose.

ivan_pozdeev