I was digging into the assembly generated for code that uses extended alignment on a stack-based variable. Here is a reduced version of the code:
struct Something {
    Something();
};
void foo(Something*);
void bar() {
    alignas(128) Something something;
    foo(&something);
}
When compiled with clang 8.0, this generates the following code (https://godbolt.org/z/lf8WW-):
bar(): # @bar()
    push rbp
    mov rbp, rsp
    and rsp, -128
    sub rsp, 128
    mov rdi, rsp
    call Something::Something() [complete object constructor]
    mov rdi, rsp
    call foo(Something*)
    mov rsp, rbp
    pop rbp
    ret
Earlier versions of gcc produce the following (https://godbolt.org/z/LLQ8gW). Starting with gcc 8.1, both compilers produce the same code.
bar():
    lea r10, [rsp+8]
    and rsp, -128
    push QWORD PTR [r10-8]
    push rbp
    mov rbp, rsp
    push r10
    sub rsp, 232
    lea rax, [rbp-240]
    mov rdi, rax
    call Something::Something() [complete object constructor]
    lea rax, [rbp-240]
    mov rdi, rax
    call foo(Something*)
    nop
    add rsp, 232
    pop r10
    pop rbp
    lea rsp, [r10-8]
    ret
I'm not too familiar with x86, so just out of curiosity: what exactly is happening in both pieces of code? Does the compiler pull tricks like std::align() and round up the current stack position to a multiple of 128 for the on-stack variable something?
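To make the question concrete, here is a minimal sketch of what I believe the two clang instructions and rsp, -128 and sub rsp, 128 do arithmetically. The names align_down, after_prologue and example_rsp are my own (hypothetical, not from any library), and the stack pointer value is made up. As far as I can tell, the and clears the low 7 bits, so it seems to round the address down rather than up.

```cpp
#include <cstdint>

// Hypothetical helper (my own name, not a library function): clear the low
// bits of addr so the result is a multiple of `alignment`, which must be a
// power of two. For alignment == 128 the mask ~(128 - 1) has the same bit
// pattern as -128, i.e. the constant in "and rsp, -128".
constexpr std::uint64_t align_down(std::uint64_t addr, std::uint64_t alignment) {
    return addr & ~(alignment - 1);
}

// Mimics the two clang instructions "and rsp, -128" / "sub rsp, 128":
// align the stack pointer, then reserve 128 bytes below it.
constexpr std::uint64_t after_prologue(std::uint64_t rsp) {
    return align_down(rsp, 128) - 128;
}

// Made-up stack pointer value, purely for illustration.
constexpr std::uint64_t example_rsp = 0x7ffc12345678ULL;

// Compile-time checks of the arithmetic above.
static_assert(~std::uint64_t{127} == static_cast<std::uint64_t>(-128),
              "the mask for 128-byte alignment is the bit pattern of -128");
static_assert(align_down(example_rsp, 128) == 0x7ffc12345600ULL,
              "low 7 bits are cleared");
static_assert(after_prologue(example_rsp) == 0x7ffc12345580ULL,
              "128 bytes are reserved below the aligned address");
static_assert(after_prologue(example_rsp) % 128 == 0,
              "the resulting address is still a multiple of 128");
```

If this reading is right, the address passed in rdi (the value of rsp after the sub) would always be a multiple of 128, whatever rsp was on entry to bar().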