How can I modify the stack with nasm, x86_64, linux functions (using `ret` keyword)?

Question

TL;DR

How can I modify the stack while using `ret`, or achieve a similar effect some other way?

Hello world,

I am trying to make a compiler for my language. Currently everything is inlined, which makes compilation slow for some steps, so today I decided to try to optimise it using functions. It keeps segfaulting, though, and I realised that this does not work:

;; main.s

BITS 64
segment .text

global _start

exit:
    mov rax, 60  ;; Linux syscall number for exit
    pop rdi      ;; Exit code
    syscall
    ret

write:
    mov rax, 1  ;; Linux syscall number for write
    mov rdi, 1  ;; File descriptor (1 = stdout)
    pop rsi     ;; Pointer to string
    pop rdx     ;; String length
    syscall
    ret

_start:
    mov rax, msg_len
    push rax

    mov rax, msg
    push rax

    call write

    mov rax, 0
    push rax

    call exit


segment .data

msg: db "Hello, world!", 10
msg_len: equ $-msg

My output for this is.... questionable:

$ nasm -felf64 main.s
$ ld -o main main.o
$ ./main
PHello, world!
@       @ @$@ @+ @2 @main.sexitwritemsgmsg_len__bss_start_edata_end.symtab.strtab.shstrtab.text.data9! @  !77!'Segmentation fault

$? (exit code) is 139 (segfault)

With everything inlined, it all works:

;; main1.s

BITS 64
segment .text

global _start

_start:
    mov rax, msg_len
    push rax

    mov rax, msg
    push rax

    mov rax, 1  ;; Linux syscall number for write
    mov rdi, 1  ;; File descriptor (1 = stdout)
    pop rsi     ;; Pointer to string
    pop rdx     ;; String length
    syscall

    mov rax, 0
    push rax

    mov rax, 60  ;; Linux syscall number for exit
    pop rdi      ;; Exit code
    syscall

segment .data

msg: db "Hello, world!", 10
msg_len: equ $-msg

My output is completely normal:

$ nasm -felf64 main1.s
$ ld -o main1 main1.o
$ ./main1
Hello, world!

$? (exit code) is 0 (as specified in assembly, meaning success)

As a newbie at assembly I am confused about what to do, even though I found related solutions like

NASM push before ret

I still don't understand how to apply that. Is there a way I can do it, or am I stuck with inlining? Should I switch assemblers altogether, from nasm to something else?

Thanks in advance

Recall that the `call` instruction pushes the return address on the stack. So the first `pop` instruction in your function pops off the return address instead of the argument you pushed. — fuz, Feb 26 '22 at 20:27
You are completely messing up the stack. Call pushes the return address. In the first example, you pop in rdi in the functions but you are really just popping the return address. — user123, Feb 26 '22 at 20:29
Why do you even push anything on the stack like you shouldn't mov in rdi directly? Just mov msg in rdi. — user123, Feb 26 '22 at 20:30
@user123 I like your username lol, but does that mean I just do more `pop`s and that's it? or `mov`e them somewhere else or... — Ari157, Feb 26 '22 at 20:31
@user123 The programming language I am transpiling down to assembly is stack based so makes sense to me — Ari157, Feb 26 '22 at 20:31
@user123 but how else should I do it without simulating the stack? — Ari157, Feb 26 '22 at 20:34
I don't really understand what you are trying to do. What is the difference between mov rax msg and then pushing on the stack than just mov rsi msg? — user123, Feb 26 '22 at 20:38
2 `pop`s did not work, though now it's only a segfault without the extra garbage (https://pastebin.com/DADW6mmK) — Ari157, Feb 26 '22 at 20:39
@user123 mov rax, .. requires the register to be changed, like mov rax, 1; mov rdi, 2 and so on while using the stack does not require that — Ari157, Feb 26 '22 at 20:41
If your programming language is "stack based" try to learn how the stack is actually used with modern compilers. See: https://stackoverflow.com/questions/69623703/each-program-allocates-a-fixed-stack-size-who-defines-the-amount-of-stack-memor/69633252#69633252 — user123, Feb 26 '22 at 20:42
@Ari157 That would pop off the return address, too. Instead, use a memory operand like `[rsp+8]` to access the argument and pop it off after the function has returned. — fuz, Feb 26 '22 at 20:44
@user123 Okay, I'll see if I can make it work, I went the more fun route, but I'll update you if I figure it out — Ari157, Feb 26 '22 at 20:45
Without some kind of stack frame or whatever it is just a game of how much you pushed vs how much you popped before calling your functions. It isn't really complex to "get". The stack isn't a complex structure after all. — user123, Feb 26 '22 at 20:45
Oh I see, it's just a bunch of minus/plus operations and moving rbp and rsp registers around, I went for obvious pop/push and stuff, will see if I can use this knowledge manually now — Ari157, Feb 26 '22 at 20:52
@Ari157 No, I mean you can do `mov rax, [rsp+8]` to load the argument from the stack into `rax` without having to pop off the return address. — fuz, Feb 26 '22 at 20:59
If you follow the 3 steps to create a stack frame. Things work but you need to calculate the negative offsets from rbp (where you placed your local vars). Normally, it is actually registers which are used to pass arguments before using the stack if they exceed a certain amount. Passing arguments on the stack I guess is ok but you need to determine where from rbp they are (if you are actually using a stack frame). I don't know how it is done by gcc normally. Must be quite complex. — user123, Feb 26 '22 at 21:01
@user123 Yep, got that, though still not able to wrap my head around that, I'll try fuz's answer first — Ari157, Feb 26 '22 at 21:08
The msg is in the data segment. It means that it is actually global to your program. This means that the string in a language like C would be outside any function (brackets). It would be accessible to the write function without having to "pass" it via the stack. The assembly shown here isn't representative in any way of any sort of assembly you'd find normally. What you pass as arguments normally is actually local variables which aren't accessible in the other function's scope. — user123, Feb 26 '22 at 21:17
In this case, a normal compiler would keep msg as a symbol in the executable and let the linker resolve an offset from RIP for accessing it in the write function. — user123, Feb 26 '22 at 21:18
Since you make it global in your assembly, you can just let nasm leave a nice symbol for ld so that it can resolve where to reach it. Why bother passing it on the stack if it is global? — user123, Feb 26 '22 at 21:25
Terminology: `ret` isn't a NASM keyword, it's just an *instruction mnemonic*. For example, you can define a label with that name like `ret: db "hi mom",0`. It's only an instruction if you use it in a context where NASM will look for an instruction. (Compared to languages like C, assembly language grammar has context by position relative to other things on a line. So it doesn't really have keywords in the same sense that C has things like `int`.) — Peter Cordes, Feb 27 '22 at 01:53
If you're compiling a stack-based language, maybe something like Forth, you probably *don't* want to use "the stack" (addressed by rsp) for your data stack. As you see, it is hard to make it inter-operate with function calls. Normally you'd allocate a large block of memory somewhere else, and use some other register (let's say `r8`) as a stack pointer into that block. It does mean that you can't just use the convenient `push/pop` instructions; you'll have to say `sub r8, 8 ; mov [r8], rax` instead. — Nate Eldredge, Feb 27 '22 at 06:39

score 1 · Accepted Answer · answered Mar 11 '22 at 16:34

tl;dr

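Remember that `call` effectively pushes the return address (the address of the instruction after the `call`) and then jumps, and `ret` pops that address back into `rip`. Your functions `pop` their arguments while the return address is sitting on top of the stack, so the first `pop` takes the return address instead of your argument, and `ret` then jumps to whatever value is left there.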
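To make that concrete, here is a minimal sketch of the program from the question with the functions reading their arguments relative to `rsp` instead of popping them. The file name `main2.s` is made up, and caller-side cleanup with `add rsp, 16` is only one possible choice, not the only correct one.

```nasm
;; main2.s (hypothetical file name)

BITS 64
segment .text

global _start

exit:
    ;; On entry: [rsp] = return address, [rsp+8] = exit code
    mov rax, 60         ;; Linux syscall number for exit
    mov rdi, [rsp+8]    ;; Exit code
    syscall             ;; Does not return

write:
    ;; On entry: [rsp] = return address, [rsp+8] = pointer, [rsp+16] = length
    mov rax, 1          ;; Linux syscall number for write
    mov rdi, 1          ;; File descriptor (1 = stdout)
    mov rsi, [rsp+8]    ;; Pointer to string
    mov rdx, [rsp+16]   ;; String length
    syscall
    ret                 ;; The return address is still on top of the stack

_start:
    mov rax, msg_len
    push rax

    mov rax, msg
    push rax

    call write
    add rsp, 16         ;; The caller discards the two 8-byte arguments

    mov rax, 0
    push rax

    call exit

segment .data

msg: db "Hello, world!", 10
msg_len: equ $-msg
```

Assembled and linked the same way as in the question (`nasm -felf64 main2.s`, then `ld -o main2 main2.o`), this should print the message and exit with status 0.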

More of an answer

Although you should probably properly learn how calling conventions work, I'm going to attempt an answer to briefly "soften" the idea, and for the fun of learning.

Abstractly speaking, in order to have functions, you must have something called stack frames, or else you'd have a pretty hard time managing local variables and getting ret to work. On x86_64, a stack frame is pretty much composed of a few things, in order.

1. The function arguments, if there are any⁰.
   - If some arguments were passed in registers, this may be omitted.
2. The return address.
   - The call instruction will push this onto the stack.
   - It's on you to make sure the ret instruction will pop this off the stack.
3. Optionally, a frame pointer.
   - If your stack grows by a dynamic amount, this can keep track of the start of the frame.
   - Otherwise, if you know the stack size ahead of time, it's optional.
4. Your local state on the stack.

As long as execution stays within your little assembly space, you are technically free to pass arguments however you want¹ as long as you are aware of how instructions like call and ret manipulate the stack. The simplest way, in my opinion, is to make it sort of stack-based, so that your compiler would not need to worry about register allocation as much².

To keep things simple, I'd suggest using something like the 32-bit x86 `cdecl` convention, but with 8-byte stack slots, as you seem to be using 64-bit code. That is to say, the caller function would push all of its arguments onto the stack (usually in reverse order), and then call the callee function. For example, for a 3-argument function, your stack would end up looking something like this (note that the stack grows downward, so the top of the stack is at the bottom of the diagram).

+----------------+
| argument 2     |
+----------------+
| argument 1     |
+----------------+
| argument 0     |
+----------------+
| return address |
+----------------+
| local state    |
| ...            |
+----------------+
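The layout above can be sketched with a frame pointer as follows. This is only an illustration: `sum3` is a made-up function, the one local slot is there just to show a negative offset from `rbp`, and the caller cleaning up with `add rsp, 24` is an assumption about the convention you pick.

```nasm
BITS 64
segment .text

global _start

sum3:
    push rbp            ;; Save the caller's frame pointer
    mov rbp, rsp        ;; rbp now marks the start of this frame
    sub rsp, 8          ;; Reserve one 8-byte local
    ;; [rbp] = saved rbp, [rbp+8] = return address
    mov rax, [rbp+16]   ;; Argument 0
    add rax, [rbp+24]   ;; Argument 1
    mov [rbp-8], rax    ;; Local = argument 0 + argument 1
    mov rax, [rbp-8]
    add rax, [rbp+32]   ;; Argument 2
    mov rsp, rbp        ;; Drop the locals
    pop rbp             ;; Restore the caller's frame pointer
    ret                 ;; Result is returned in rax

_start:
    push 30             ;; Argument 2 (pushed first)
    push 10             ;; Argument 1
    push 2              ;; Argument 0 (pushed last, nearest the return address)
    call sum3
    add rsp, 24         ;; Caller discards the three arguments

    mov rdi, rax        ;; Use the sum (42) as the exit code
    mov rax, 60         ;; Linux syscall number for exit
    syscall
```

Running it and checking `$?` should show 42, since the arguments are 2, 10 and 30.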

Also, I noticed that you never really made use of the rsp register. Depending on the design of your compiler, you technically could get away with this. Stack machines like the JVM rely solely on pushes and pops, anyway, I believe. As long as your pushes and pops match (especially call and ret, which act as a special push and pop), you should be fine.
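If you want the callee itself to consume its arguments with `pop`, as a stack-based code generator might, one possible sketch is to pop the return address into a scratch register first and push it back before `ret`. The choice of `rbx` here is an assumption (it is not clobbered by `syscall`, unlike `rcx` and `r11`); a function that calls other functions using the same trick would have to save that register somewhere else first.

```nasm
BITS 64
segment .text

global _start

exit:
    pop rbx             ;; Discard the return address pushed by call
    mov rax, 60         ;; Linux syscall number for exit
    pop rdi             ;; Exit code
    syscall             ;; Does not return

write:
    pop rbx             ;; Return address pushed by call
    mov rax, 1          ;; Linux syscall number for write
    mov rdi, 1          ;; File descriptor (1 = stdout)
    pop rsi             ;; Pointer to string
    pop rdx             ;; String length
    syscall
    push rbx            ;; Put the return address back on top
    ret                 ;; Pops it into rip

_start:
    mov rax, msg_len
    push rax

    mov rax, msg
    push rax

    call write          ;; write consumes both arguments

    mov rax, 0
    push rax

    call exit

segment .data

msg: db "Hello, world!", 10
msg_len: equ $-msg
```

Here every push is matched by exactly one pop, including the implicit ones in `call` and `ret`, so no extra cleanup is needed in `_start`.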

⁰ The Windows x64 calling convention actually has the caller reserve an extra 32 bytes here (the "shadow space") for argument spilling, but you can probably ignore that in this case.

¹ There are specific calling conventions that dictate how parameters are passed from caller to callee and back. Beyond your programming exercise, I highly recommend reading about how they work, so that your compiler can output code that can easily be called by and easily call functions that weren't emitted by your compiler, or go the Forth way as Nate mentioned.

² goto 1
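As for the Forth way from Nate's comment, a rough sketch could keep the language's data stack in its own block of memory, addressed through `r8`, and leave `rsp` to `call` and `ret`. The names `data_stack` and `data_stack_top` and the size of 1024 slots are made up for illustration.

```nasm
BITS 64
segment .bss

data_stack: resq 1024   ;; Hypothetical size: 1024 eight-byte slots
data_stack_top:         ;; The data stack grows downward from here

segment .text

global _start

exit:
    mov rax, 60         ;; Linux syscall number for exit
    mov rdi, [r8]       ;; Exit code from the top of the data stack
    add r8, 8           ;; Pop it
    syscall             ;; Does not return

write:
    mov rax, 1          ;; Linux syscall number for write
    mov rdi, 1          ;; File descriptor (1 = stdout)
    mov rsi, [r8]       ;; Pointer to string (top of the data stack)
    mov rdx, [r8+8]     ;; String length (next slot)
    add r8, 16          ;; Pop both values
    syscall
    ret                 ;; rsp was never touched, so this just works

_start:
    mov r8, data_stack_top  ;; Data stack starts out empty

    mov rax, msg_len
    sub r8, 8
    mov [r8], rax       ;; Push the length

    mov rax, msg
    sub r8, 8
    mov [r8], rax       ;; Push the pointer

    call write

    mov rax, 0
    sub r8, 8
    mov [r8], rax       ;; Push the exit code

    call exit

segment .data

msg: db "Hello, world!", 10
msg_len: equ $-msg
```

The trade-off is the one Nate describes: `push` and `pop` become two-instruction sequences, but function calls no longer interfere with the data stack.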
