What impact could extra push do to assembly program?

Question

Given:

typedef struct __attribute__((packed)) _Node{
    int data;
    struct _Node *left;
    struct _Node *right;
} Node;

and the following assembly code which searches for a value in a tree. (Same code as How this assembly code will be translated into c?)

.section .text
.global _start

_start:
    mov $8, %esi
    mov $A, %rdi
    call func
    movq $60, %rax
    movq $0, %rdi
    syscall
    
func:
    pushq %rbp
    movq %rsp, %rbp
    cmp (%rdi), %esi
    jne continue
    mov $1, %eax
    jmp finish
    
continue: # go left
    cmpq $0, 4(%rdi)
    je next
    pushq %rdi # 3
    mov 4(%rdi), %rdi
    call func
    pop %rdi # 4
    cmp $1, %eax
    je finish

next: # go right
    cmpq $0, 12(%rdi)
    je fail
    pushq %rdi # 1
    mov 12(%rdi), %rdi
    call func
    pop %rdi # 2
    cmp $1, %eax
    je finish
   
fail:
    mov $0, %rax

finish:
    leave
    ret

I want to know what will be the impact of this change and if it will cause the program to not work as expected:

adding push %rdi right after continue.

From what I understand this will cause a problem since we are pushing some extra value to the stack so the caller for this iteration might pop a wrong value of %rdi for example the caller in this case:

    pushq %rdi # 1
    mov 12(%rdi), %rdi
    call func
    pop %rdi # 2

might pop 12+%rdi instead of poping %rdi, But I ran many tests and all of them seem to return correct value in RAX, why is that?

Note: can this line cause stack overflow too? I think the answer is probably yes.

Peter Cordes · Answer 1 · 2021-05-03T12:14:12.167

4

Note the leave in the epilogue: it restores RSP before ret, undoing any extra pushes. If it wasn't there, extra pushes would make your function crash (e.g. popping a copy of RDI into RIP), not successfully return but with RSP pointing at the wrong place.

might pop 12+%rdi instead of poping %rdi

No, that's not possible. Push happens before RDI is modified so you're pushing the original value, and pop reads it back. call has no net effect on RSP, i.e. RSP is call-preserved. (If you broke your stack, ret would pop the wrong thing instead of a return address, so you'd crash instead of returning at all. So unless you purposely copy your return address somewhere else, you can't return with RSP modified).

Also, the contents of memory at RDI+12 are not the same thing as RDI+12. And if you literally meant pop 12(%rdi) (i.e. pop into memory), no obviously not; pop %rdi always writes the register, regardless of what data is being popped.

Like I just commented on your last question while you were posting this, leave in the epilogue covers up any problems of unbalanced pushes (whether that's from adding an extra push, or from removing a pop) as long as the push/pop operations that actually need to get the right value to the right register are still there. (saving RDI around the first call). Your function doesn't need the node pointer again after the 2nd call, so it's pointless, and would be better to just leave / mov 12(%rdi), %rdi / jmp func. But any extra pushes before that 2nd call don't matter.

The only thing that would be a problem would be an extra pop that removes your return address from the stack before the last call, so it gets overwritten. (An unbalance pop after the last call would leave your return address below RSP, but that's safe in user-space in the x86-64 System V ABI, because it guarantees a red-zone below RSP that's safe from asynchronously getting clobbered by signal handlers and whatnot. So leave would still point RSP back to where it should be, letting ret pop the return address.

Remember, leave is equivalent to mov %rbp, %rsp / pop %rbp, so it resets RSP for itself and later stack operations.

Any extra pushes before a recursive call mean 8 extra bytes of stack space used per depth of stack, but Linux user-space stacks default to 8MiB so it would take a pretty deep tree to get anywhere near actually overflowing it.

Single-step your code in a debugger (such as GDB) to see how it runs.

Use display /x *(long (*)[5])$rsp to dump the top 5 qwords from the stack after every single step (by casting it to pointer-to-array and dereferencing). Then stepi through your code and see how it changes, especially at leave.

edited May 03 '21 at 12:14

answered May 03 '21 at 11:57

Peter Cordes

328,167
45
605
847

"No, that's not possible. Push happens before RDI is modified" I meant it may read the wrong value that was saved from the added new line – May 03 '21 at 11:58
Again you didn't prove my explanation is wrong, when calling pop we will load wrong value into register rdi – May 03 '21 at 12:02
1

@stacker: That's not possible because the push which wrote a value to the stack was done *before* the update to RDI. So the pushed value is the original. So when you pop it, it'll read back the value you pushed. That's kind of the whole point. If you want to see that in action, definitely use that `display /x` command and single-step the code. – Peter Cordes May 03 '21 at 12:02
" the push which wrote a value to the stack was done before the update to RDI" I don't agree, we save rdi then change rdi then we call the function again and do pop right? when we call the function again it may insert wrong value to the stack so when we do pop after it finished we return wrong value – May 03 '21 at 12:04
@stacker: Read the rest of the paragraph where I already explained that it's not possible for the call to return with RSP pointing at something else. And read the bolded part of the answer where it explains that the `leave` instruction is saving your bacon here. You've already proved by running it that it doesn't crash. Use a debugger yourself if you're not going to believe experts on Stack Overflow when they tell you how your code works. – Peter Cordes May 03 '21 at 12:07
I am not saying I don't believe but rather I don't understand, you saiid "leave instruction is saving your bacon here" but why? all of what leav is doing pop value and insert into register... – May 03 '21 at 12:10
@stacker: No, that's *not* all it's doing. The `mov %rbp, %rsp` part of `leave` that happens before the `pop %rbp` part is the entire point. If you're having trouble picturing it in your head, load the program in GDB, use that `display /x` command, then `starti` and repeat `stepi` to see what's on the stack after every instruction. – Peter Cordes May 03 '21 at 12:13
How can I use display /x online? I don't have or now to use gcc – May 03 '21 at 12:15
last thing, according to what you said this may cause stackoverflow but again leav will prevent this from happening (if stackoverflow is about to happen it's not related to the new added line at all since leave is restoring the stack to where it was before) – May 03 '21 at 12:17
1

@stacker: Pick any other debugger if you like, and watch memory in it. There is https://www.onlinegdb.com/ if for some reason you can't instead GDB on your own GNU/Linux setup, it supports x86-64 GAS assembly. Having a debugger is *extremely* helpful for learning assembly, not using one is basically a waste of your own and everyone else's time. It's like trying to build a robot and/or learn how it works blindfolded, by sound instead sight. – Peter Cordes May 03 '21 at 12:17

What impact could extra push do to assembly program?

1 Answers1

Linked