gcc x86-32 stack alignment and calling printf

Question

To the best of my knowledge, x86-64 requires the stack to be 16-byte aligned before a call, while gcc with -m32 doesn't require this for main.

I have the following testing code:

.data
intfmt:         .string "int: %d\n"
testint:        .int    20

.text
.globl main

main:
    mov     %esp, %ebp
    push    testint
    push    $intfmt
    call    printf
    mov     %ebp, %esp
    ret

Build with as --32 test.S -o test.o && gcc -m32 test.o -o test. I am aware that the write system call exists, but to my knowledge it cannot format ints and floats the way printf can.

After entering main, a 4-byte return address is on the stack. Interpreting this code naively, the two push instructions each put 4 more bytes on the stack, so another 4-byte value would have to be pushed before the call for the stack to be 16-byte aligned.

Here is the objdump of the binary generated by gas and gcc:

0000053d <main>:
 53d:   89 e5                   mov    %esp,%ebp
 53f:   ff 35 1d 20 00 00       pushl  0x201d
 545:   68 14 20 00 00          push   $0x2014
 54a:   e8 fc ff ff ff          call   54b <main+0xe>
 54f:   89 ec                   mov    %ebp,%esp
 551:   c3                      ret    
 552:   66 90                   xchg   %ax,%ax
 554:   66 90                   xchg   %ax,%ax
 556:   66 90                   xchg   %ax,%ax
 558:   66 90                   xchg   %ax,%ax
 55a:   66 90                   xchg   %ax,%ax
 55c:   66 90                   xchg   %ax,%ax
 55e:   66 90                   xchg   %ax,%ax

I am very confused about the push instructions generated.

1. If two 4-byte values are pushed, how is alignment achieved?
2. Why is 0x2014 pushed instead of 0x14? What is 0x201d?
3. What does call 54b even achieve? Output of hd matches objdump. Why is this different in gdb? Is this the dynamic linker?

B+>│0x5655553d <main>                       mov    %esp,%ebp                      │
   │0x5655553f <main+2>                     pushl  0x5655701d                     │
   │0x56555545 <main+8>                     push   $0x56557014                    │
   │0x5655554a <main+13>                    call   0xf7e222d0 <printf>            │
   │0x5655554f <main+18>                    mov    %ebp,%esp                      │
   │0x56555551 <main+20>                    ret

Resources on what goes on when a binary is actually executed are appreciated, since I don't know what's actually going on and the tutorials I've read don't cover it. I'm in the process of reading through How programs get run: ELF binaries.

`mov %esp, %ebp` without saving / restoring the caller's `%ebp` is bad and could easily lead to a segfault after main returns. — Peter Cordes, Sep 12 '18 at 04:04
Compile your C code with `gcc -O1 -fverbose-asm -S` to get assembler code. Read the relevant [x86 ABI](https://github.com/hjl-tools/x86-psABI/wiki/X86-psABI) specification — Basile Starynkevitch, Sep 12 '18 at 04:16
@BasileStarynkevitch I am hand writing assembly for learning purposes. I want to preserve as much of my original assembly as possible which I'm afraid `-O1` will get rid of. — qwr, Sep 12 '18 at 04:23
Then remove `-O1`. Notice that the assembler is *generated* by `gcc` from some `foo.c` source code into `foo.s` after `gcc -fverbose-asm -S foo.c` (and you could add `-O1`). I did mention C code (not assembler) to understand what kind of assembler code `gcc` is generating — Basile Starynkevitch, Sep 12 '18 at 04:27
@BasileStarynkevitch I'm still not sure I understand. I am using `as` to create object file and `gcc` to create executable, no C source at all? — qwr, Sep 12 '18 at 04:28
I was suggesting to compile some simple C code (like mentioned in the middle of your question) into assembler code and to look into the *generated* assembler. BTW you don't really need `gcc` on pure assembly (you could do direct [syscalls(2)](http://man7.org/linux/man-pages/man2/syscalls.2.html) without using `printf`... and use `ld` to get an ELF executable). Read also [Linux Assembly HowTo](http://www.tldp.org/HOWTO/Assembly-HOWTO/) — Basile Starynkevitch, Sep 12 '18 at 04:29
Please provide some [MCVE] in your question. So show all the C code you got (if you have some), all the assembler code, and the building commands (probably with `as`, `ld`, maybe `gcc`). Give us enough information to reproduce your case on *our* computer. See also [OSDEV](https://www.osdev.org/). BTW, why do you need `printf` ? — Basile Starynkevitch, Sep 12 '18 at 04:33
The stack alignment requirement is an ABI *convention* tied to *your* [calling convention](https://en.wikipedia.org/wiki/Calling_convention). It is *not* required by the x86-64 ISA itself (you could even call functions *without* passing arguments on the stack, but that doesn't follow the usual *conventions*) — Basile Starynkevitch, Sep 12 '18 at 04:44
@BasileStarynkevitch I have added my build commands and the reasoning for using printf. — qwr, Sep 12 '18 at 05:22
Have you carefully read all the resources I have linked to? They should provide an answer to your question. I still recommend compiling a tiny C program `foo.c` similar to your assembler program (with `gcc -S -fverbose-asm -O1 foo.c`) and looking into the generated assembler code `foo.s`. It should teach you useful things. — Basile Starynkevitch, Sep 12 '18 at 05:32
What you need to understand deeply is your ABI. I gave enough links and advice to understand it, but you do have several hours of reading ahead — Basile Starynkevitch, Sep 12 '18 at 05:43
@qwr: The version of the i386 System V ABI used on Linux does require / guarantee 16-byte stack alignment before a `call`, just like the x86-64 System V ABI. Could the phrasing in my answer you linked be improved to make that more clear? `printf` is allowed to crash if it wants to if called with a misaligned stack like you're doing in this hand-written asm. (But probably won't on systems where libc is compiled without SSE). — Peter Cordes, Sep 12 '18 at 15:43
@PeterCordes I suppose my question #1 is really "why doesn't `printf` crash" if gcc doesn't do alignment — qwr, Sep 12 '18 at 16:28
Because it doesn't do any 16-byte copies to / from the stack using `movaps` or `movdqa`. Most functions in practice *don't* depend on the ABI-guaranteed alignment. `scanf` does in recent x86-64 glibc, but it didn't used to. [scanf Segmentation faults when called from a function that doesn't change RSP](https://stackoverflow.com/q/51070716). Like I said, I don't think Ubuntu compiles their i386 glibc with SSE2 at all, and 3 pointers are only 12 bytes (not 24), so gcc probably wouldn't use a 16-byte copy anyway. — Peter Cordes, Sep 12 '18 at 16:32
(Any function with a `double` on the stack will avoid cache-line splits for it, though, because they can give it 8-byte alignment relative to a known 16-byte alignment. So certainly functions can take advantage of it, but will only crash from misalignment if they use SSE. I guess you could imagine some kind of pointer-alignment calculation with AND that assumes alignment to start with, and would produce wrong behaviour with misalignment...) — Peter Cordes, Sep 12 '18 at 16:36
@PeterCordes ok can you post this as an answer? I have never heard of when functions do and don't care about alignment and I have never written SSE. Of course it is good to follow the ABI. — qwr, Sep 12 '18 at 16:38

Peter Cordes · Accepted Answer · 2018-09-12T17:20:54.493

The i386 System V ABI does guarantee / require 16 byte stack alignment before a call, like I said at the top of my answer that you linked. (Unless you're calling a private helper function, in which case you can make up your own rules for alignment, arg-passing, and which registers are clobbered for that function.)

Functions are allowed to crash or misbehave if you violate this ABI requirement, but are not required to. For example, scanf in x86-64 Ubuntu glibc (as compiled by recent gcc) only recently started crashing on a misaligned stack: see "scanf Segmentation faults when called from a function that doesn't change RSP" (https://stackoverflow.com/q/51070716).

Functions can depend on stack alignment for performance (to align a double or array of doubles to avoid cache-line splits when accessing them).

Usually the only case where a function depends on stack alignment for correctness is when compiled to use SSE/SSE2, so it can use 16-byte alignment-required loads/stores to copy a struct or array (movaps or movdqa), or to actually auto-vectorize a loop over a local array.

I think Ubuntu doesn't compile their 32-bit libraries with SSE (except functions like memcpy that use runtime dispatching), so they can still work on ancient CPUs like Pentium II. Multiarch libraries on an x86-64 system should assume SSE2, but with 4-byte pointers it's less likely that 32-bit functions would have 16 byte structs to copy.

Anyway, whatever the reason, obviously printf in your 32-bit build of glibc doesn't actually depend on 16-byte stack alignment for correctness, so it doesn't fault even when you misalign the stack.
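For comparison, here is a minimal sketch of how the main from the question could be written to follow the ABI. It assumes the same build setup as the question, and that main's caller followed the ABI, so that %esp is 4 bytes below a 16-byte boundary on entry (the return address was pushed onto an aligned stack). Saving %ebp accounts for 4 more bytes and the two arguments for another 8, which makes 16 bytes in total, so no extra padding is needed in this particular case.

```asm
.data
intfmt:         .string "int: %d\n"
testint:        .int    20

.text
.globl main

main:
    push    %ebp            # save caller's %ebp; %esp is now 8 below a 16-byte boundary
    mov     %esp, %ebp
    push    testint         # 2nd arg: the value stored at testint
    push    $intfmt         # 1st arg: address of the format string; %esp is 16-byte aligned here
    call    printf
    xor     %eax, %eax      # return 0 from main
    leave                   # mov %ebp, %esp ; pop %ebp
    ret
```

With a different number of arguments you would pad with a sub from %esp (or a dummy push) so the total stays a multiple of 16. This also restores the caller's %ebp, which the original code did not.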

Why is 0x2014 pushed instead of 0x14? What is 0x201d?

0x14 (decimal 20) is the value in memory at that location. It will be loaded at runtime, because you used push r/m32, not push $20 (or an assemble time constant like .equ testint, 20 or testint = 20).
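The difference between the operand forms can be sketched as follows. TESTCONST is a hypothetical name introduced only for this illustration, and the sketch assumes a non-PIE build (as --32, then gcc -m32 -no-pie) so that absolute addresses can be used directly.

```asm
.data
intfmt:         .string "int: %d\n"
testint:        .int    20              # a 4-byte value stored in memory

.equ TESTCONST, 20                      # hypothetical assemble-time constant, takes no storage

.text
.globl main

main:
    push    %ebp
    mov     %esp, %ebp

    push    testint         # push r/m32: loads the value 20 from memory at runtime
    push    $intfmt         # push imm32: pushes the address of the format string
    call    printf
    add     $8, %esp        # pop the two args

    push    $TESTCONST      # push imm: the constant 20 is encoded in the instruction
    push    $intfmt
    call    printf
    add     $8, %esp

    # By contrast, "push $testint" would push the address of testint, not 20.

    xor     %eax, %eax      # return 0 from main
    leave
    ret
```

Both calls print "int: 20"; only the way the value reaches the stack differs. In objdump output the first push appears as pushl with an absolute address, as in the question, while the second has the constant encoded directly in the instruction.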

You used gcc -m32 to make a PIE (Position Independent Executable), which is relocated at runtime, because that's the default on Ubuntu's gcc.

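0x2014 is the offset relative to the start of the file. If you disassemble at runtime after starting the program, you'll see a real address: in your gdb output main is at 0x5655553d instead of 0x53d, so the executable was loaded at base 0x56555000, and 0x2014 and 0x201d became 0x56557014 and 0x5655701d.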

Same for call 54b. It's presumably a call to the PLT (which is near the start of the file / text segment, hence the low address).

If you disassembled with objdump -drwC, you'd see symbol relocation info. (I like -Mintel as well, but beware it's MASM-like, not NASM).

You can link with gcc -m32 -no-pie to make classic position-dependent executables. I'd definitely recommend that, especially for 32-bit code. If you're compiling C, use gcc -m32 -no-pie -fno-pie to get non-PIE code-gen as well as linking into a non-PIE executable. (See "32-bit absolute addresses no longer allowed in x86-64 Linux?" for more about PIEs.)

And are 0x2014 and 0x201d values that are replaced during dynamic linking? I am using default Ubuntu 18.04 and whatever `printf` came with that. — qwr, Sep 12 '18 at 17:07
@qwr: yes, you made a PIE executable because that's the gcc default on recent Ubuntu. — Peter Cordes, Sep 12 '18 at 17:21
Thank you for your reference to PIE, that is what I was looking for. — qwr, Sep 12 '18 at 17:45
