
When compiling the code below:

global main
extern printf, scanf

section .data
   msg: db "Enter a number: ",10,0
   format:db "%d",0

section .bss
   number resb 4

section .text
main:
   mov rdi, msg
   mov al, 0
   call printf

   mov rsi, number
   mov rdi, format
   mov al, 0
   call scanf

   mov rdi,format
   mov rsi,[number]
   inc rsi
   mov rax,0
   call printf 

   ret

using:

nasm -f elf64 example.asm -o example.o
gcc -no-pie -m64 example.o -o example

and then run

./example

it runs and prints "Enter a number: ", but then crashes with: Segmentation fault (core dumped)

So printf works fine but scanf does not. What am I doing wrong with scanf?

Peter Cordes
user3057544
  • Did you run it in a debugger like gdb to let it tell you which instruction caused the fault? – Michael Petch Jun 27 '18 at 20:15
  • You have a potential issue with the stack not being aligned to a 16-byte boundary before calling the C library routines (per the AMD64 ABI). `mov rsi,[number]` moves 8 bytes from number to RSI, when `number` is 4 bytes. Maybe you wanted `mov esi,[number]` – Michael Petch Jun 27 '18 at 20:22
  • @Michael Petch - no, I am a newbie to Linux and assembler, and gdb frightens me. But the code is an example from here: https://stackoverflow.com/questions/26889692/nasm-x86-64-scanf-segmentation-fault and it is marked as the solution, so it should work – user3057544 Jun 27 '18 at 20:23
  • Just because an answer is accepted doesn't mean it doesn't have problems. What happens if you add `push rbp` right after `main:` and just before `ret` add `pop rbp` (this should deal with the alignment issue). Change `mov rsi,[number]` to `mov esi,[number]` . GDB or any debugger is the ideal tool. If a debugger scares you, assembly is probably not what you should be programming in. – Michael Petch Jun 27 '18 at 20:26
  • push rbp and pop rbp resolves the problem. Thanks a lot! – user3057544 Jun 27 '18 at 20:29
  • Ubuntu 18.04 (Bionic Beaver) 64bit – user3057544 Jun 27 '18 at 20:42
  • Isn't the edb debugger now available in the universe repositories of 18.04? Check it out; it probably has a somewhat simpler UI, although it's not as powerful and versatile as gdb. If you are planning just to learn assembly basics with short pieces of code, edb should be sufficient for that. (If it's not in the repos yet, you would have to compile it from source [from github], but that's not completely trivial either.) – Ped7g Jun 27 '18 at 21:13

1 Answer

Use sub rsp, 8 / add rsp, 8 at the start/end of your function to re-align the stack to 16 bytes before your function does a call.

Or better, push/pop a dummy register, e.g. push rdx / pop rcx, or a call-preserved register like RBP that you actually wanted to save anyway. You need the total change to RSP to be an odd multiple of 8, counting all pushes and sub rsp, from function entry to any call, i.e. 8 + 16*n bytes for a whole number n.
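As a minimal sketch, here is the question's program with the push/pop fix applied, assuming the same nasm and gcc -no-pie build commands as in the question. It also uses the 32-bit load `mov esi, [number]` suggested in the comments; the final `xor eax, eax` that makes main return 0 is an addition for illustration, not part of the original code.

```nasm
global main
extern printf, scanf

section .data
   msg:    db "Enter a number: ",10,0
   format: db "%d",0

section .bss
   number: resb 4

section .text
main:
   push rbp              ; 8-byte push: RSP % 16 goes from 8 to 0

   mov  rdi, msg
   xor  eax, eax         ; AL = 0: no vector-register args
   call printf

   mov  rdi, format
   mov  rsi, number      ; address where scanf stores the int
   xor  eax, eax
   call scanf

   mov  rdi, format
   mov  esi, [number]    ; load only the 4 bytes that were reserved
   inc  esi
   xor  eax, eax
   call printf

   pop  rbp              ; undo the push so RSP points at the return address
   xor  eax, eax         ; return 0 from main (added for illustration)
   ret
```

Replacing the push/pop pair with sub rsp, 8 / add rsp, 8 should have the same effect on alignment.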

On function entry, RSP is 8 bytes away from 16-byte alignment because the call pushed an 8-byte return address. See "Printing floating point numbers from x86-64 seems to require %rbp to be saved", "main and stack alignment", and "Calling printf in x86_64 using GNU assembler". This is an ABI requirement which you used to be able to get away with violating when there weren't any FP args for printf, but not any more.

See also Why does the x86-64 / AMD64 System V ABI mandate a 16 byte stack alignment?

To put it another way, RSP % 16 == 8 on function entry, and you need to ensure RSP % 16 == 0 before you call a function. How you do this doesn't matter. (Not all functions will actually crash if you don't, but the ABI does require/guarantee it.)
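The bookkeeping can be sketched by tracking RSP % 16 in comments through a prologue. This is an illustrative example only: rbx and r12 are saved purely to show how several pushes add up, the 8 bytes from sub rsp serve as a hypothetical local slot for scanf's result, and the labels `format` and `outfmt` are made up for the sketch.

```nasm
global main
extern printf, scanf

section .rodata
   format: db "%d",0
   outfmt: db "%d",10,0

section .text
main:                     ; on entry:        RSP % 16 == 8
   push rbx               ; 8 bytes so far:  RSP % 16 == 0
   push r12               ; 16 bytes so far: RSP % 16 == 8
   sub  rsp, 8            ; 24 bytes so far: RSP % 16 == 0  (24 = 8 + 16*1)

   mov  rdi, format
   mov  rsi, rsp          ; scanf stores the int in the reserved stack slot
   xor  eax, eax          ; AL = 0: no vector-register args
   call scanf             ; called with RSP % 16 == 0

   mov  rdi, outfmt
   mov  esi, [rsp]        ; 32-bit load of the value scanf stored
   inc  esi
   xor  eax, eax
   call printf            ; RSP unchanged since scanf, still aligned

   add  rsp, 8            ; unwind in reverse order
   pop  r12
   pop  rbx
   xor  eax, eax          ; return 0 from main
   ret
```

Dropping the sub rsp, 8 (and its matching add) would leave a total adjustment of 16 bytes, so RSP % 16 would be 8 at both calls, which is the misaligned case described above.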


gcc's code-gen for glibc scanf now depends on 16-byte stack alignment even when AL == 0

It seems to have auto-vectorized copying 16 bytes somewhere in __GI__IO_vfscanf, which regular scanf calls after spilling its register args to the stack (see Footnote 1). (The many similar ways to call scanf share one big implementation as a back end to the various libc entry points like scanf, fscanf, etc.)

I downloaded Ubuntu 18.04's libc6 binary package: https://packages.ubuntu.com/bionic/amd64/libc6/download and extracted the files (with 7z x blah.deb and tar xf data.tar, because 7z knows how to extract a lot of file formats).

I can repro your bug with LD_LIBRARY_PATH=/tmp/bionic-libc/lib/x86_64-linux-gnu ./bad-printf, and, it turns out, also with the system glibc 2.27-3 on my Arch Linux desktop.

With GDB, I ran it on your program and did set env LD_LIBRARY_PATH /tmp/bionic-libc/lib/x86_64-linux-gnu then run. With layout reg, the disassembly window looks like this at the point where it received SIGSEGV:

   0x7ffff786b49a <_IO_vfscanf+602>        cmp    r12b,0x25
   0x7ffff786b49e <_IO_vfscanf+606>        jne    0x7ffff786b3ff <_IO_vfscanf+447>
   0x7ffff786b4a4 <_IO_vfscanf+612>        mov    rax,QWORD PTR [rbp-0x460]
   0x7ffff786b4ab <_IO_vfscanf+619>        add    rax,QWORD PTR [rbp-0x458]
   0x7ffff786b4b2 <_IO_vfscanf+626>        movq   xmm0,QWORD PTR [rbp-0x460]
   0x7ffff786b4ba <_IO_vfscanf+634>        mov    DWORD PTR [rbp-0x678],0x0
   0x7ffff786b4c4 <_IO_vfscanf+644>        mov    QWORD PTR [rbp-0x608],rax
   0x7ffff786b4cb <_IO_vfscanf+651>        movzx  eax,BYTE PTR [rbx+0x1]
   0x7ffff786b4cf <_IO_vfscanf+655>        movhps xmm0,QWORD PTR [rbp-0x608]
 > 0x7ffff786b4d6 <_IO_vfscanf+662>        movaps XMMWORD PTR [rbp-0x470],xmm0

So it copied two 8-byte objects to the stack with movq + movhps to load and movaps to store. But with the stack misaligned, movaps [rbp-0x470],xmm0 faults.

I didn't grab a debug build to find out exactly which part of the C source turned into this, but the function is written in C and compiled by GCC with optimization enabled. GCC has always been allowed to do this, but only recently did it get smart enough to take better advantage of SSE2 this way.


Footnote 1: printf / scanf with AL != 0 has always required 16-byte alignment because gcc's code-gen for variadic functions uses test al,al / je to spill the full 16-byte XMM regs xmm0..7 with aligned stores in that case. __m128i can be an argument to a variadic function, not just double, and gcc doesn't check whether the function ever actually reads any 16-byte FP args.

Peter Cordes
  • Very interesting. openSuSE has no problem without alignment (gcc 4.8.5), but Arch indeed SegFaults (gcc 8.1.1). Ensuring 16-byte alignment works fine. – David C. Rankin Jun 28 '18 at 00:01
  • @DavidC.Rankin: It was only very recently that this changed on Arch. – Peter Cordes Jun 28 '18 at 00:32
  • PS: `jmp scanf` to tail-call (like call scanf/ret) of course requires that RSP%16==8 instead of being aligned, to replicate the expected function-entry layout. And as always, a jmp tailcall only works when RSP is pointing at your own return address, so scanf will get that return address. So you can only tail-call it from a function that was itself called with 16-byte RSP alignment. – Peter Cordes Jan 19 '21 at 22:39
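The tail-call case from the last comment can be sketched as follows, using printf instead of scanf to keep the example short. This is a hypothetical illustration, assuming the same nasm and gcc -no-pie build as in the question; the `msg` string is made up.

```nasm
global main
extern printf

section .rodata
   msg: db "tail call",10,0

section .text
main:                  ; on entry: RSP % 16 == 8, [rsp] = main's return address
   mov  rdi, msg
   xor  eax, eax       ; AL = 0: no vector-register args
   jmp  printf         ; no push or sub first: printf sees the normal
                       ; function-entry layout and returns to main's caller
```

Because printf returns directly to main's caller, printf's return value becomes main's return value. Any push or sub rsp before the jmp would have to be undone first, since RSP must point at the return address when the jump is taken.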