
I am new to assembly and am aware that my code may not be efficient or could be better. The comments in the code may be a little off because of constant changes. The goal is to print each character of the string individually and, when the code comes across a format specifier like %s, to print a string from one of the parameters in place of %s. For example:

String: Hello, %s

Parameter (RSI): Foo

Output: Hello, Foo

The code does what it is supposed to do, but it gives a segmentation fault at the end.

.bss
char:   .byte 0 

.text

.data
text1:      .asciz "%s!\n" 
text2:      .asciz "My name is %s. I think I’ll get a %u for my exam. What does %r do? And %%?\n"
word1:      .asciz "Piet"



.global main 


main:

pushq   %rbp            # push the base pointer (and align the stack)
movq    %rsp, %rbp      # copy stack pointer value to base pointer

movq    $text1, %rdi   
movq    $word1, %rsi  
movq    $word1, %rdx  
movq    $word1, %rcx  
movq    $word1, %r8  
movq    $word1, %r9  
call    myPrint 


end:
movq    %rbp, %rsp      # clear local variables from stack
popq    %rbp            # restore base pointer location 

movq    $60, %rax 
movq    $0, %rdi 

syscall 



myPrint: 
pushq   %rbp 
movq    %rsp, %rbp

pushq   %rsi  
pushq   %rdx 
pushq   %rcx 
pushq   %r8 
pushq   %r9 

movq    %rdi, %r12 
regPush: 
movq    $0, %rbx 

#rbx: counter 
printLooper: 
movb    (%r12), %r14b    #Get a byte of r12 to r14 

cmpb    $0, %r14b        #Check if r14 is a null byte 
je      endPrint        #If it is a null byte then go to 'endPrint'

cmpb    $37, %r14b 
je      formatter

incq    %r12            #Increment r12 to the next byte 

skip: 
mov     $char, %r15     #Move char address to r15 
mov     %r14b, (%r15)   #Move r14 byte into the value of r15 
mov     $char, %rcx     #Move char address into rcx 
movq    $1, %r13        #For the number of byte 

printer: 
movq    $0, %rsi        #Clearing rsi 
mov     %rcx, %rsi      #Move the address to rsi 
movq    $1, %rax        #Sys write 
movq    $1, %rdi        #Output
movq    %r13, %rdx      #Number of byte to rdx   
syscall 

jmp     printLooper


formatter: 
incq    %r12            #Moving to char after "%"

movb    (%r12), %r14b   #Moving the char byte into r14 

cmpb    $115, %r14b      #Compare 's' with r14
je      formatString    #If it is equal to 's' then jump to 'formatString'

movb    -1(%r12), %r14b #Put back the previous char into r14 

jmp     skip            


####String Formatter Start ##################################################

formatString: 
addq    $1, %rbx 
movq    $8, %rax 
mulq    %rbx 
subq    %rax, %rbp 
movq    (%rbp), %r15
pushq   %r15            ### into the stack 
movq    $0, %r13        ### Byte counter 


formatStringLoop: 
movb   (%r15), %r14b    #Move char into r14 

cmpb    $0, %r14b        #Compare r14 with null byte 
je      formatStringEnd #If it is equal, go to 'formatStringEnd'

incq    %r15            #Increment to next char 
addq    $1, %r13        #Add 1 to the byte counter 
jmp     formatStringLoop#Loop again 

formatStringEnd: 
popq    %rcx            #Pop the address into rcx 
incq    %r12            #Moving r12 to next char 
jmp     printer         

#######String Formatter End #############################################


endPrint: 
movq    %rbp, %rsp 
popq    %rbp 
ret 


Kevin Feng
  • Which instruction exactly segfaults? What values are in registers when that happens? Use GDB or any other debugger to find out. – Peter Cordes Oct 23 '20 at 00:54
  • @Peter Cordes I used the debugger, and it reports a segmentation fault the moment the function 'myPrint' returns. Based on the rbp and rip values in 'myPrint', it seems that the stack was properly cleared in 'myPrint'. I know this because at the time of clearing, rip points to where rbp was pushed, which means rbp was popped properly. As for the values in the registers, I do not know what the numbers in each register mean. – Kevin Feng Oct 23 '20 at 01:28

1 Answer


In formatString you modify %rbp with subq %rax, %rbp, forgetting that you will restore %rsp from it. So when you mov %rbp, %rsp just before the function returns, you end up with %rsp pointing somewhere else, and so you get the wrong return address.

I guess you are subtracting an offset from %rbp to get some space on the stack. This seems unsafe because you've already pushed lots of other data there. Under the x86-64 System V ABI it is safe to use up to 128 bytes below the stack pointer (the red zone), but it would be more natural to use an offset from %rsp instead. With SIB addressing you can access data at constant or variable offsets from %rsp without actually changing its value.
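
As a rough sketch of how that could look in the question's code, the opening of formatString might compute the address with a scaled index instead of modifying %rbp. This variant indexes from %rbp rather than %rsp, on the assumption that the prologue of myPrint stays as posted, so the saved arguments sit at -8(%rbp), -16(%rbp), and so on. It is meant to replace only the instructions between the formatString label and formatStringLoop, and has not been tested beyond reading through the posted program.

```asm
formatString:
    incq    %rbx                    # rbx = 1 for the first %s, 2 for the second, ...
    movq    %rbx, %rax
    negq    %rax                    # rax = -rbx
    movq    (%rbp, %rax, 8), %r15   # r15 = saved argument at rbp - 8*rbx; rbp itself is unchanged
    pushq   %r15                    # keep the string's start address for formatStringEnd
    movq    $0, %r13                # byte counter
```

Because %rbp is never written, the `movq %rbp, %rsp` at endPrint restores the stack pointer that the prologue saved. The sketch does not check that %rbx stays within the five saved registers, so a sixth %s would still read unrelated stack data.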

How I found this with gdb: by setting breakpoints at myPrint and endPrint, I found that %rsp was different at the ret than it was on entry. Its value could only have come from %rbp, so I did watch $rbp to have the debugger break when %rbp changed, and it pointed straight to the offending instruction in formatString. (Which I could also have found by searching the source code for %rbp.)


Also, your .text at the top of the file is misplaced, so all your code gets placed in the .data section. This actually works but it surely is not what you intended.
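
A minimal sketch of a layout that keeps the code in .text, assuming the same labels as in the question; the body of main here is only a placeholder that exits, and `.skip 1` is my substitution for the original `.byte 0`:

```asm
.bss
char:   .skip 1                 # one-byte scratch buffer for single characters

.data
text1:  .asciz "%s!\n"
word1:  .asciz "Piet"

.text                           # switch to .text after all data definitions
.global main

main:
    movq    $60, %rax           # exit system call number on x86-64 Linux
    movq    $0, %rdi            # exit status 0
    syscall
```

The point is only the ordering: the `.text` directive comes after the data definitions and immediately before the code, so everything from `main` onward is assembled into the .text section.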

Nate Eldredge
  • More importantly, you reserve fixed or variable amounts of stack space with `sub , %rsp`, not `%rbp`, whether you're using the red-zone or not. Moving RBP doesn't "allocate" anything, and can get super confusing if you make a `call` that clobbers space below the original RSP that you're addressing relative to RBP. (Although if you have some fixed locations and one variable-size allocation, probably easiest to address the variable-sized allocation relative to RSP. Or `-128(%rsp, )` to use the red zone for it.) – Peter Cordes Oct 23 '20 at 02:20
  • By SIB addressing, do you mean something like mov (%rbp + 8), %rsi? This is AT&T format. Also, %rbp must keep the same value until it is popped, right? – Kevin Feng Oct 23 '20 at 02:47
  • @KevinFeng: In AT&T format it looks like `mov 8(%rbp), %rsi`. But I am suggesting instead something like `mov -16(%rsp, %rax, 8), %rsi` since you appear to want a variable offset scaled by 8. – Nate Eldredge Oct 23 '20 at 03:26
  • @KevinFeng: Yes, you should generally leave `%rbp` alone. If you must change it then it should be saved first and restored afterward (i.e. push/pop). – Nate Eldredge Oct 23 '20 at 03:27