Inappropriate behaviour while printing an ASCII character from a number taken from the stack in x86 assembly

Question

I am trying to understand how to work with the stack properly in x86-64 assembly on Linux. I push some numbers onto the stack, then I want to take the top number and print the corresponding ASCII character. But it prints something different every time. Please help me understand why the code below behaves that way.

.section .bss
.comm x, 4 

.section .text
.globl _start

_start:
    push $52
    push $65

    mov %rsp, (x)

    mov $1,                 %rax
    mov $1,                 %rdi
    mov $x,                 %rsi
    mov $1,                 %rdx
    syscall

    mov $60,  %rax             
    movb $0,  %dil
    syscall

Also, I would appreciate it if you gave me some sources where I can learn to work with the stack in assembly.

`mov %rsp, (x)` - rsp is a pointer to the current stack location. So what this says is "Move the 8 byte stack pointer into the 4 bytes representing the variable `x`." Almost certain that wasn't your intent. — David Wohlferd, Dec 12 '21 at 23:33

Marco Bonelli · Accepted Answer · 2021-12-12T23:50:15.913

The instruction mov %rsp, (x) writes the value of RSP (the current stack pointer) to the location x in memory, which is in your .bss section. Then you do mov $x, %rsi: this moves the address of x into RSI, so RSI points to your .bss variable, which holds the value of the stack pointer rather than anything you pushed. When you issue a write syscall with a length of 1, it reads one byte from the memory pointed to by RSI, so your output is the least significant byte of the RSP value you saved there (x86 is little-endian, so that byte comes first in memory). Since the position of the stack can change on every execution of your program (typically because of address space layout randomization), you get a different output every time.

Also, you are reserving only 4 bytes in .bss with .comm x, 4, but mov %rsp, (x) is an 8-byte move, so it writes past the space you reserved for x.

What you really want to do to print values from the stack to standard output is simply mov %rsp, %rsi. You don't need x at all:

.section .text
.globl _start

_start:
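    push $65                # 8-byte push; the low byte at (%rsp) is 65 ('A')

    mov $1, %rax            # syscall number 1 = write
    mov $1, %rdi            # fd 1 = stdout
    mov %rsp, %rsi          # buffer = top of the stack
    mov $1, %rdx            # length = 1 byte
    syscall

    mov $60, %rax           # syscall number 60 = exit
    mov $0, %rdi            # exit status 0
    syscall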

The above code should output A and exit.
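The question pushes two values, so here is a minimal sketch of how the earlier one could be reached as well, assuming x86-64 Linux and GNU as in AT&T syntax. In 64-bit mode each push takes an 8-byte slot, so after the second push the first value sits at 8(%rsp). RAX is reloaded before the second write because syscall returns its result there.

```asm
.section .text
.globl _start

_start:
    push $52                # '4'; after the next push it sits at 8(%rsp)
    push $65                # 'A'; now at (%rsp)

    mov $1, %rax            # syscall number 1 = write
    mov $1, %rdi            # fd 1 = stdout
    mov %rsp, %rsi          # buffer = top slot, low byte is 65
    mov $1, %rdx            # length = 1 byte
    syscall                 # prints A

    mov $1, %rax            # RAX held the return value, so set it again
    lea 8(%rsp), %rsi       # buffer = previous 8-byte slot, low byte is 52
    syscall                 # prints 4 (RDI and RDX are preserved)

    mov $60, %rax           # syscall number 60 = exit
    mov $0, %rdi            # exit status 0
    syscall
```

Assembled and linked with something like as -o prog.o prog.s followed by ld -o prog prog.o (prog is just a placeholder name), this should print A4.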

If you want to use an intermediate variable in the .bss, you will have to perform two moves to copy a value from the stack into it using an intermediate scratch register (since mov can only take one memory operand at a time):

.section .bss
.comm x, 4

.section .text
.globl _start

_start:
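    push $65                # 8-byte push; the low byte at (%rsp) is 65 ('A')

    movb (%rsp), %al        # load the low byte of the top stack slot
    movb %al, (x)           # store it in x

    mov $1, %rax            # syscall number 1 = write (AL was already stored)
    mov $1, %rdi            # fd 1 = stdout
    mov $x, %rsi            # buffer = address of x
    mov $1, %rdx            # length = 1 byte
    syscall

    mov $60, %rax           # syscall number 60 = exit
    mov $0, %rdi            # exit status 0
    syscall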

Also, note that on Linux exit takes an int as its parameter, which is 4 bytes on x86, so movb $0, %dil won't suffice in general: it only writes the lowest byte of RDI and leaves the other bits unchanged. In your case it's OK, since you previously set RDI to 1 and RDI is preserved across the syscall.

In general, you can use xor %edi, %edi (see What is the best way to set a register to zero in x86 assembly: xor, mov or and?) to efficiently zero out a 64-bit register, since writing a 32-bit register zero-extends into the full 64-bit register. Similarly, you can do movl $value, %eax instead of mov $value, %rax if your $value is non-negative and fits in 32 bits, which gives a shorter encoding (no REX.W prefix) in the generated code while maintaining the same behavior.
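As a rough illustration of both suggestions, this is how the first program might look with 32-bit moves and xor-zeroing, assuming x86-64 Linux and GNU as in AT&T syntax. Only the pointer copy stays 64-bit, since an address on the stack does not fit in 32 bits.

```asm
.section .text
.globl _start

_start:
    push $65                # 'A' in the low byte at (%rsp)

    mov $1, %eax            # write; the 32-bit write zero-extends into RAX
    mov $1, %edi            # fd 1 = stdout
    mov %rsp, %rsi          # buffer pointer must remain a 64-bit move
    mov $1, %edx            # length = 1 byte
    syscall

    mov $60, %eax           # exit
    xor %edi, %edi          # status 0; clears all of RDI
    syscall
```

The behavior is the same as the original version (it prints A and exits with status 0); only the instruction encodings get shorter.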

Doesn't the same logic that allows you to xor edi to zero rdi mean you can do `mov $1, %eax`? — David Wohlferd, Dec 12 '21 at 23:45
@DavidWohlferd yes, though I was just pointing out a possible mistake since `mov $0, %dil` is not in general correct and only sets the LSB. You could change almost every move in the above program with a 4-byte move if you wanted, which is in general a good idea since if you don't need the entire 8-byte move you are just wasting 1 byte of instruction prefix doing it. — Marco Bonelli, Dec 12 '21 at 23:46
It's true that `_exit(int)` does take a full 32-bit integer, but most ways of obtaining the exit status of a process can only see the low 8 bits. Presumably this was what motivated using an 8-bit partial-register write. (But there are ways to get the full 32-bit exit status (other than strace), although Linux may still truncate to 8-bit even with the newer APIs: [What is the min and max values of exit codes in Linux?](https://unix.stackexchange.com/q/418784). An answer on [ExitCodes bigger than 255, possible?](https://stackoverflow.com/q/179565) tested on MacOS, not Linux) — Peter Cordes, Dec 13 '21 at 00:01
@PeterCordes I suspect OP wrote `movb` just because RDI was already set to `1` before that. — Marco Bonelli, Dec 13 '21 at 00:03
Yeah possibly; in this case it does call exit with EDI=0, and `movb $0, %dil` is shorter than `mov $0, %edi`. (But still worse than `xor`-zeroing as you point out.) But IDK, if they were optimizing at all, they might not be using 64-bit mov to set the `int fd` EDI=1. — Peter Cordes, Dec 13 '21 at 00:04
@PeterCordes yeah... good point. Guess we'll have to see what OP says as my crystal ball is currently broken :') — Marco Bonelli, Dec 13 '21 at 00:08
