-1

I've been teaching myself assembly language for a few months now, and I got to the point where I was playing around with linked lists. The following x86_64 intel assembly function crashes (SEGV) when it tries to free a struct pointer, but the first free call succeeds. I should note that this code ran perfectly fine on my old Linux mint machine (I am now running Debian 12 but it also crashes on macos).

global _ft_list_remove_if
extern free

section .text

_ft_list_remove_if:

    push rbp
    mov rbp, rsp
    sub rsp, 32
    mov QWORD [rsp], rdi
    mov QWORD [rsp + 8], rsi
    mov QWORD [rsp + 16], rdx
    mov QWORD [rsp + 24], rcx

    mov r10, QWORD [rdi]            ; r10 = *begin_list

.rm_loop:
    cmp r10, 0
    je .restore_stack
    mov r11, QWORD [r10 + 8]        ; r11 = r10->next
    cmp r11, 0
    je .restore_stack

    mov rdx, QWORD [rsp + 16]       ; rdx = cmp
    mov rsi, QWORD [rsp + 8]        ; rsi = data_ref
    mov rdi, QWORD [r11]            ; rdi = r11->data
    push r10
    push r11
    call rdx

    pop r11
    pop r10
    cmp eax, 0
    jne .increment_ptr

    mov rax, QWORD [r11 + 8]        ; rax = r11->next
    mov QWORD [r10 + 8], rax        ; r10->next = rax

    ; remove
    mov rcx, QWORD [rsp + 24]       ; rcx = free_fct
    mov rdi, QWORD [r11]            ; rdi = r11->data
    push r10
    push r11
    call rcx

    pop r11
    mov rdi, r11
    push r11
    call free wrt ..plt
    pop r11
    pop r10

.increment_ptr:
    mov r10, QWORD [r10 + 8]        ; r10 = r10->next
    jmp .rm_loop

.restore_stack:
    add rsp, 32
    pop rbp

.endfunc:
    ret

Here is the definition of the struct:

typedef struct s_list
{
    void            *data;
    struct s_list   *next;
}   t_list;

where data is always a malloc'ed pointer. I use strcmp as cmp and free as free_fct. I assemble with nasm:2.16.01 and then link with gcc:12.2.0.

matteobu02
  • 53
  • 5
  • Edit the question to provide a [mre]. – Eric Postpischil Aug 01 '23 at 23:08
  • 2
    Also, what did the debugger tell you? Anyway, one issue is that `r10` and `r11` are not preserved across calls, so your calls to `cmp`, `free_fct` and `free` may destroy them. – Jester Aug 01 '23 at 23:10
  • compare to: https://godbolt.org/z/3sdora9nd – 0___________ Aug 01 '23 at 23:15
  • @Jester Good catch, thank you! It doesn't solve my problem though. When gdb reaches the second `free` call, rdi contains the same address printed by gdb when I do `p/x $r11` at any point during the function, which is the same as `r11->next`. So that seems correct to me. – matteobu02 Aug 01 '23 at 23:46
  • @0___________ well, I don't really see a difference :/ – matteobu02 Aug 01 '23 at 23:54
  • So which instruction is crashing? – Jester Aug 01 '23 at 23:58
  • `call free wrt ..plt` – matteobu02 Aug 02 '23 at 00:02
  • 1
    Note your C code has a `continue;` so it does not fall through into `ptr = ptr->next` as your asm code does. Try adding a `jmp .rm_loop` before the `.increment_ptr` – Jester Aug 02 '23 at 00:12
  • The `valgrind` debugging utility may be able to give you a more precise error. – zwol Aug 02 '23 at 00:28
  • @Jester ohh you're right.. again. Yes, I totally missed that! And now, with this fix and the one you pointed out earlier, the code still crashes but it's way further down into the function. That's already really helpful, thank you so much! I'll update the question as soon as I fix the last bug. – matteobu02 Aug 02 '23 at 00:44

1 Answers1

1

Okayy, there were several issues with my code: I didn't save the r10 and r11 registers before some function calls, which probably changed their value. Though, I'm still a bit confused about that because gdb still shows the same addresses as when the code was buggy. Then, I should have added jmp .rm_loop right after call free wrt ..plt, otherwise my function would skip a node if the previous one was to be deleted. At this point, the bugs in the code I provided were fixed. There was still a typo lurking around: I tried accessing a stack variable using rbp instead of rsp. As I said, the typo is not in the code snipped but I'll leave this here as a reminder :)

matteobu02
  • 53
  • 5
  • *Though, I'm still a bit confused about that because gdb still shows the same addresses as when the code was buggy.* - Sometimes code that makes unsafe assumptions about stuff like calling conventions happens to work anyway. It's still a bug even if it wasn't actually causing a problem for this function with this build of glibc. "Happens to work" does *not* mean "correct". See [What are callee and caller saved registers?](https://stackoverflow.com/a/56178078) / [What registers are preserved through a linux x86-64 function call](https://stackoverflow.com/q/18024672) – Peter Cordes Aug 02 '23 at 01:08
  • Also, with lazy dynamic linking (via the PLT), the *first* call in a program to a library function will destroy the value in R11, since it's the only integer register that's never used for passing anything and is call-clobbered, so the dynamic linker resolver uses it as a scratch register. After it updates the GOT entry, subsequent calls go to the function without running that code. – Peter Cordes Aug 02 '23 at 01:11