I have been working on what is essentially a while loop to go through all CLI arguments. While working on a solution that prints only one element I noticed a few things; this is the thought process that led me here.

I noticed that if I did lea 16(%rsp), %someRegisterToWrite, I was able to get/print argv[1]. Next I tried lea 24(%rsp), %someRTW and this gave me access to argv[2]. I kept going up to see if it would continue to work and it did.

My thought was to keep adding 8 to %someRTW and increment a "counter" until the counter was equal to argc. The following code works great when a single argument is entered, but it prints nothing with 2 arguments, and when I enter 3 arguments it prints the first 2 with no space in between.

.section __DATA,__data
.section __TEXT,__text
.globl _main
_main:
    lea (%rsp), %rbx        #argc
    lea 16(%rsp), %rcx      #argv[1]
    mov $0x2, %r14          #counter
    L1:
    mov (%rcx), %rsi        #%rsi = user_addr_t cbuf
    mov (%rcx), %r10
    mov 16(%rcx), %r11      
    sub %r10, %r11          #Get number of bytes until next arg
    mov $0x2000004, %eax    #4 = write
    mov $1, %edi            #edi = file descriptor 
    mov %r11, %rdx          #user_size_t nbyte
    syscall
    cmp (%rbx), %r14        #if counter < argc
    jb L2
    jge L3
    L2:
    inc %r14                
    mov 8(%rcx), %rcx       #mov 24(%rsp) back into %rcx
    mov $0x2000004, %eax
    mov $0x20, %rsi         #0x20 = space
    mov $2, %rdx
    syscall
    jmp L1
    L3:
    xor %rax, %rax
    xor %edi, %edi
    mov $0x2000001, %eax
    syscall
Michael Petch
Jlegend
  • `syscall` clobbers `rcx`. PS: learn to use a debugger. – Jester May 13 '16 at 12:30
  • And `%r11`, which the OP also uses. – Peter Cordes May 13 '16 at 12:42
  • @PeterCordes yes but not across the `syscall` – Jester May 13 '16 at 13:35
  • Beyond what has been mentioned already sys_write syscall doesn't print characters directly, it takes a pointer to a buffer containing the characters. `mov $0x20, %rsi ` in the second syscall will not work for that reason. _RSI_ needs to be a **pointer** to a buffer that contains a space, and I'm not sure why you want to print 2 characters with that syscall when you do `mov $2, %rdx`. – Michael Petch May 13 '16 at 14:36
  • It also appears you are trying to determine the length of a command line parameter with this `mov 16(%rcx), %r11` `sub %r10, %r11`. If that is attempting to get the string length it won't work because argument pointers may not be at contiguous locations. When you retrieve an address to the beginning of a parameter you should find the string length by scanning down the string looking for a NUL(\0) character. – Michael Petch May 13 '16 at 15:11

2 Answers


I am going to assume that on 64-bit OS/X you are assembling and linking in such a way that you intentionally bypass the C runtime code. One example would be a static build without the C runtime startup files and the System library, with _main specified as your program entry point. _start is generally the process entry point unless overridden.

In this scenario the 64-bit kernel will load the macho64 program into memory and set up the process stack with the program arguments and environment variables, among other things. The Apple OS/X process stack state at startup is the same as what is documented in Section 3.4 of the System V x86-64 ABI:

Initial Process Stack

One observation is that the list of argument pointers is terminated with a NULL(0) address. You can use this to loop through all parameters until you find the NULL(0) address as an alternative to relying on the value in argc.
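As a rough illustration of that idea, here is a minimal C sketch that walks a NULL-terminated pointer list without ever consulting a count. The function name `count_args` is hypothetical and not part of the answer; it only models what the assembly loop does with the pointers on the stack.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the answer): counts the entries in a
   NULL-terminated list of string pointers, the same way the argument
   pointers on the initial stack can be walked without using argc. */
size_t count_args(char *const *list)
{
    assert(list != NULL);
    size_t n = 0;
    while (list[n] != NULL) {   /* stop at the terminating NULL pointer */
        n++;
    }
    return n;
}
```

Starting the walk at the second pointer (the one after the program name) gives the same set of parameters that the revised assembly code prints.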


The Problems

One problem is that your code assumes that registers are all preserved across a SYSCALL. The SYSCALL instruction itself will destroy the contents of RCX and R11:

SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contain a canonical address.)

SYSCALL also saves RFLAGS into R11 and then masks RFLAGS using the IA32_FMASK MSR (MSR address C0000084H); specifically, the processor clears in RFLAGS every bit corresponding to a bit that is set in the IA32_FMASK MSR

One way to avoid this is to try and use registers other than RCX and R11. Otherwise you will have to save/restore them across a SYSCALL if you need their values to be untouched. The kernel will also clobber RAX with a return value.

A list of the Apple OS/X system calls provides the details of all the available kernel functions. In 64-bit OS/X code each of the system call numbers has 0x2000000 added to it:

In 64-bit systems, Mach system calls are positive, but are prefixed with 0x2000000 — which clearly separates and disambiguates them from the POSIX calls, which are prefixed with 0x1000000
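The arithmetic behind the numbers used in this answer can be sketched in C as follows. The macro and function names are assumptions made up for illustration; only the 0x2000000 offset and the resulting values 0x2000001 (exit) and 0x2000004 (write) come from the text.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed name for illustration: the offset added to each system call
   number in 64-bit OS/X code, as described above. */
#define OSX64_SYSCALL_OFFSET 0x2000000u

/* Hypothetical helper: turns a base system call number (1 = exit,
   4 = write) into the value loaded into EAX before SYSCALL. */
uint32_t osx64_syscall_number(uint32_t base)
{
    assert(base < OSX64_SYSCALL_OFFSET);
    return OSX64_SYSCALL_OFFSET + base;
}
```

With this, `osx64_syscall_number(4)` yields 0x2000004, the value the question and the revised code both use for write.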


Your method to compute the length of a command line argument will not work. One argument's string doesn't necessarily have to be placed in memory directly after the previous one, so subtracting two argument pointers is not a reliable length. The proper way is to write code that starts at the beginning of the argument you are interested in and searches for a NUL(0) terminating character.
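In C terms, the scan described above might look like the sketch below. The name `arg_length` is hypothetical; it is only meant to show the logic, not how the assembly has to be written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of one argument, found by scanning from
   its first byte until the NUL terminator. It never looks at where the
   next argument happens to be stored. */
size_t arg_length(const char *arg)
{
    assert(arg != NULL);
    const char *p = arg;
    while (*p != '\0') {        /* advance until the terminating NUL */
        p++;
    }
    return (size_t)(p - arg);   /* NUL itself is not counted */
}
```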


This code to print a space or separator character won't work:

mov 8(%rcx), %rcx       #mov 24(%rsp) back into %rcx
mov $0x2000004, %eax
mov $0x20, %rsi         #0x20 = space
mov $2, %rdx
syscall

When using the sys_write system call the RSI register is a pointer to a character buffer. You can't pass an immediate value like 0x20 (space). You need to put the space or some other separator (like a new line) into a buffer and pass that buffer through RSI.
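The same requirement is visible from C, where `write` takes a pointer and a length rather than a character value. This is a minimal sketch; the helper name `write_separator` is an assumption for illustration.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: writes a single separator byte to fd.
   The byte is first placed in memory, and write() is given the address
   of that memory plus a length of 1, mirroring what RSI and RDX must
   hold for the system call. */
ssize_t write_separator(int fd, char sep)
{
    assert(fd >= 0);
    char buf[1] = { sep };              /* the byte must live in a buffer */
    return write(fd, buf, sizeof buf);  /* pointer + length, not the value */
}
```

Note the length of 1: the original snippet asked for 2 bytes, which would also write whatever byte follows the separator in memory.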


Revised Code

This code applies the ideas above, with some additional cleanup, and writes each of the command line parameters (excluding the program name) to standard output. Each one is followed by a newline. Newline on Darwin OS/X is 0x0a (\n).

# In 64-bit OSX syscall numbers = 0x2000000+(32-bit syscall #)
SYS_EXIT  = 0x2000001
SYS_WRITE = 0x2000004

STDOUT    = 1

.section __DATA, __const
newline: .ascii "\n"
newline_end: NEWLINE_LEN = newline_end-newline

.section __TEXT, __text
.globl _main
_main:
    mov (%rsp), %r8             # 0(%rsp) = # args. This code doesn't use it
                                #    Only save it to R8 as an example.
    lea 16(%rsp), %rbx          # 8(%rsp)=pointer to prog name
                                # 16(%rsp)=pointer to 1st parameter
.argloop:
    mov (%rbx), %rsi            # Get current cmd line parameter pointer
    test %rsi, %rsi
    jz .exit                    # If it's zero we are finished

    # Compute length of current cmd line parameter
    # Starting at the address in RSI (current parameter) search until
    # we find a NUL(0) terminating character.
    # rdx = length not including terminating NUL character

    xor %edx, %edx              # RDX = character index = 0
    mov %edx, %eax              # RAX = terminating character NUL(0) to look for
.strlenloop:
         inc %rdx               # advance to next character index
         cmpb %al, -1(%rsi,%rdx)# Is character at previous char index
                                #     a NUL(0) character?
         jne .strlenloop        # If it isn't a NUL(0) char then loop again
    dec %rdx                    # We don't want strlen to include NUL(0)

    # Display the cmd line argument
    # sys_write requires:
    #    rdi = output device number
    #    rsi = pointer to string (command line argument)
    #    rdx = length
    #
    mov $STDOUT, %edi
    mov $SYS_WRITE, %eax
    syscall

    # display a new line
    mov $NEWLINE_LEN, %edx
    lea newline(%rip), %rsi     # We use RIP addressing for the
                                #     string address
    mov $SYS_WRITE, %eax
    syscall

    add $8, %rbx                # Go to next cmd line argument pointer
                                #     In 64-bit pointers are 8 bytes
    # lea 8(%rbx), %rbx         # This LEA instruction can replace the
                                #     ADD since we don't care about the flags
                                #     rbx = 8 + rbx (flags unaltered)
    jmp .argloop

.exit:
    # Exit the program
    # sys_exit requires:
    #    rdi = return value
    #
    xor %edi, %edi
    mov $SYS_EXIT, %eax
    syscall

If you intend to use code like strlen in various places then I recommend creating a function that performs that operation. I have hard coded strlen into the code for simplicity. If you are looking to improve on the efficiency of your strlen implementation then a good place to start would be Agner Fog's Optimizing subroutines in assembly language.

This code should compile and link to a static executable without C runtime using:

gcc -e _main progargs.s -o progargs -nostartfiles -static
Michael Petch
  • The one part I am having a problem understanding is the `cmpb $0, -1(%rsi,%rdx)`. I am reading this as `if (%rsi[%rdx -1] == 0)`. Is that correct? – Jlegend May 16 '16 at 20:22
  • It would be similar to comparing the byte at memory address `rsi+rdx-1` with 0 . Where _RSI_ is the start of the string _RDX_ is the current pos, and -1 is to go back one. The `-1` is because we increment _RDX_ first on every loop so we must do the compare with the previous byte. Which is pretty much what I think you mean by `if (%rsi[%rdx -1] == 0)` – Michael Petch May 16 '16 at 21:03
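For readers who find the indexed addressing easier to follow in C, here is a hedged line-by-line model of the `.strlenloop` code from the answer. The function name is hypothetical, and the variables are named after the registers they stand in for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical C model of the answer's .strlenloop; rsi and rdx are
   ordinary variables named after the registers they mirror. */
size_t strlen_preincrement(const char *rsi)
{
    assert(rsi != NULL);
    size_t rdx = 0;                  /* xor %edx, %edx               */
    do {
        rdx++;                       /* inc %rdx                     */
    } while (rsi[rdx - 1] != 0);     /* cmpb %al, -1(%rsi,%rdx); jne */
    rdx--;                           /* dec %rdx: exclude the NUL    */
    return rdx;
}
```

Because the index is incremented before the comparison, the byte examined is always the one at `rdx - 1`, and the final decrement removes the NUL from the count.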

As you've already figured out correctly, the first value on the stack is the number of arguments, and the third and following values are the CLI arguments. The second, by the way, is the name of the program. You do not have to care about argc, because you can keep popping the stack until the value is zero. An easy solution is:

add $0x10, %rsp
L0:
  pop %rsi
  or %rsi, %rsi
  jz L2
  mov %rsi, %rdi
  xor %rdx, %rdx
  L1:
    mov (%rsi), %al
    inc %rsi
    inc %rdx
    or %al, %al
  jnz L1
  # %rdx - len(argument), including the terminating NUL
  # %rdi - argument
  # < do something with the argument >
  jmp L0
L2:

If you want a space or newline after each argument, just print it :).

lea (newline), %rsi
mov $0x02, %rdx
mov STDOUT, %rdi
mov sys_write, %rax
[...]
newline db 13, 10, 0

I am a bit confused about the syscall numbers in %rax, but I guess it's an OSX thing? As Jester and Peter Cordes already mentioned, syscall overwrites registers: %rcx with the return address (%rip) and %r11 with the flags (%rflags). I recommend having a look at the Intel x86_64 docs: http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-manual-325462.pdf

Another thing with this code:

jb L2
jge L3
L2:

argc and counter are unsigned, so this looks a bit better, I guess:

jae L3

Sorry if the code does not work. I usually use Intel syntax and I did not test it, but I think you get the idea :)

sivizius
  • syscall-number in `%rax` is standard for the [x86-64 System V ABI](http://www.x86-64.org/documentation.html), used by Linux and OS X (and everything except Windows), although of course every OS has its own numbering system. See [the x86 tag wiki](http://stackoverflow.com/tags/x86/info) for links. Also, is that the entire 3-volume set in one PDF that you linked? If I was going to link a PDF directly, I'd link just the insn set reference (vol2). – Peter Cordes May 13 '16 at 21:35
  • Since you effectively do a pre-increment on the counter before branching, you should probably decrement %rdx once out of the loop or your string length will include the NUL terminating character. `newline db 13, 10, 0` in GNU Assembler would be `newline: .byte 13, 10, 0` but on modern OS/X newline is `0x0a` (13,10 would be Windows). It could have been declared as `newline: .asciz "\n"` . GNU assembler supports _C_ style strings. `.ascii` would work as well since the trailing `\0` NUL character isn't wanted with syscalls (sys_write doesn't stop printing when it reaches `\0` in the buffer) – Michael Petch May 13 '16 at 21:55
  • You could have also eliminated the usage of _RDX_ in the loop if you subtracted _%RDI_ from _%RSI_ after the loop (and subtracted one). That would yield the length (without nul terminator) as well. – Michael Petch May 13 '16 at 22:03
  • Thank you to everyone in this post! Great conversation, I am learning a lot. – Jlegend May 14 '16 at 16:45