I'm trying to write an x86-64 version of the `cat` program to practice making system calls in assembly.

I'm struggling a lot with command-line arguments. I use the `main` symbol as the entry point, so I expected to find `argc` in `%rdi` and `argv` in `%rsi`. `argc` is indeed in `%rdi` as expected, but I keep segfaulting when I try to pass `argv[1]` to the `open` syscall.

I'm not sure what I'm doing wrong. Here is my assembly code:

```
main:
    cmp $2, %rdi            // If argc != 2 return 1
    jne .err1

    lea 8(%rsi), %rdi       // Move argv[1] -> %rdi
    xor %rsi, %rsi          // 0 to %rsi -> O_RDONLY
    xor %rdx, %rdx
    mov $2, %rax            // Open = syscall 2
    syscall

    cmp 0, %rax             // If open returns <0 -> exit status 2
    jl .err2

    mov %rax, %rdi          // Move fd to %rdi
    call cat
    ret

.err1:
    mov $1, %rax
    ret
.err2:
    mov $2, %rax
    ret
```
Peter Cordes
JM445
  • How do you assemble and link your code? If you don't link with the libc, nothing places the argument count into `edi` or a pointer to the arguments into `rsi`. Instead, you'll have to retrieve the arguments from where the program loader puts them on the stack. The name of the entry point doesn't matter then. – fuz Jan 13 '22 at 14:06
  • I assemble using gcc: gcc -o cat cat.S -no-pie – JM445 Jan 13 '22 at 14:09
  • Okay, then your code should work. Note that `lea 8(%rsi), %rdi` retrieves a pointer to the pointer to the second argument. If you want a pointer to the string itself (which looks like what you are trying to do), use `mov 8(%rsi), %rdi`. – fuz Jan 13 '22 at 14:12
  • Does this answer your question? [Linux 64 command line parameters in Assembly](https://stackoverflow.com/questions/3683144/linux-64-command-line-parameters-in-assembly) –  Jan 13 '22 at 14:13

1 Answer

There are two issues with your code.

First, you use `lea 8(%rsi), %rdi` to retrieve the second argument. `rsi` points to an array of pointers to the command-line arguments, and `lea` only computes the address `rsi + 8` (that is, `&argv[1]`) without loading anything from memory. To get the pointer to the second argument itself, you have to dereference `8(%rsi)` with something like `mov 8(%rsi), %rdi`.

Second, you forgot the dollar sign in front of `0` in `cmp 0, %rax`. Without it, the assembler selects an absolute addressing mode for address 0, so the instruction effectively dereferences a null pointer. To fix this, add the missing dollar sign (`cmp $0, %rax`) so that `0` is treated as an immediate operand.

With both issues fixed, the code you posted seems to work just fine.
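For reference, here is a minimal sketch of the program with both fixes applied. The `cat` routine was not posted, so the one below is a hypothetical stand-in: it copies the file descriptor to stdout in 4096-byte chunks and, as a simplification, does not retry short writes. The `buf`, `.Lloop`, and `.Ldone` names are likewise assumptions. The sketch assumes Linux on x86-64, a file named `cat.S` (so that gcc runs the preprocessor and strips the `//` comments), and the build command from the comments, `gcc -o cat cat.S -no-pie`.

```asm
    .globl main
    .text
main:
    cmp $2, %edi            // argc is a 32-bit int; if argc != 2 return 1
    jne .err1

    mov 8(%rsi), %rdi       // Load argv[1] itself (pointer to the string)
    xor %esi, %esi          // flags = O_RDONLY (0)
    xor %edx, %edx          // mode = 0, unused without O_CREAT
    mov $2, %eax            // open = syscall 2
    syscall

    cmp $0, %rax            // Immediate 0; a negative result is -errno
    jl .err2

    mov %rax, %rdi          // Move fd to %rdi
    call cat                // Only makes syscalls, so stack alignment is not an issue
    ret                     // Exit status is whatever cat left in %rax

.err1:
    mov $1, %eax
    ret
.err2:
    mov $2, %eax
    ret

// Hypothetical stand-in for the cat routine that was not posted.
// In: %rdi = fd. Out: %rax = 0 at end of file, or a negative errno value.
cat:
    mov %rdi, %r8           // Keep fd; syscall clobbers only %rax, %rcx, %r11
.Lloop:
    mov %r8, %rdi
    lea buf(%rip), %rsi
    mov $4096, %edx
    xor %eax, %eax          // read = syscall 0
    syscall
    cmp $0, %rax
    jle .Ldone              // 0 = end of file, negative = error

    mov %rax, %rdx          // Number of bytes just read
    mov $1, %edi            // fd 1 = stdout
    lea buf(%rip), %rsi
    mov $1, %eax            // write = syscall 1
    syscall
    cmp $0, %rax
    jl .Ldone               // Stop on a write error
    jmp .Lloop
.Ldone:
    ret

    .bss
buf:
    .skip 4096

    .section .note.GNU-stack,"",@progbits
```

Running `./cat somefile` should then print the file, while a wrong argument count gives exit status 1 and a failed `open` gives exit status 2. The file descriptor is never closed explicitly; the kernel closes it when the process exits.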

fuz