I'm having some trouble writing my own atoi function in assembly. The instructions are:

"Change the function so that it returns the integer equivalent of the C-string (pointer) that is passed into the function. You may assume that the first character is between ‘0’ and ‘9’, inclusive. atoi should consider all the characters from the first up until the first character that is not a decimal digit. As you can see, main uses the value returned by atoi as an exit code (this is just a cheap way of accessing the output from atoi, without writing an itoa function.) As given to you, atoi returns 1234. The return value is ANDed with 0xFF to reduce it to a byte. Thus 1234 & 255 becomes 210."
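
The masking step described in the assignment can be checked with a few lines of C. This is only a sketch: the helper name `low_byte` is made up for illustration and is not part of the assignment's code.

```c
/* Hypothetical helper: keep only the low 8 bits of a value, which is
   what "ANDed with 0xFF" means in the assignment text. */
unsigned int low_byte(unsigned int value)
{
    return value & 0xFFu;
}
```

Since 1234 is 0x4D2, its low byte is 0xD2, so `low_byte(1234)` gives 210, the figure quoted in the assignment.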

    # Useful constants 
    .equ    STDIN,0 
    .equ    STDOUT,1 
    .equ    READ,0 
    .equ    WRITE,1 
    .equ    EXIT,60 
# Stack frame 
    .equ    bufferSize, 32
    .equ    buffer,-bufferSize
    .equ    localSize,16 
    .equ    frameSize, bufferSize + localSize
# Read only data 
    .section    .rodata # the read-only data section 
prompt: 
    .string     "Enter an integer: " 
    .equ    promptSz,.-prompt-1 
msg: 
    .string     "You entered: " 
    .equ    msgSz,.-msg-1 

# Code

    .text   # switch to text section 


    .globl  __start 
 __start: 
    pushq   %rbp    # save caller’s frame pointer 
    movq    %rsp, %rbp  # establish our frame pointer 
    subq    $frameSize, %rsp    # for local variables 

    movl    $promptSz, %edx # prompt size 
    movl    $prompt, %esi   # address of prompt text string 
    movl    $STDOUT, %edi   # standard out 
    movl    $WRITE, %eax 
    syscall     # request kernel service 

    movl    $bufferSize,%edx
    leaq    buffer(%rbp), %rsi  # load buffer address
    movl    $STDIN, %edi    # standard in 
    movl    $READ, %eax 
    syscall     # request kernel service 
    movl    %eax, (%rsp)    # store num chars read

    leaq    buffer(%rbp), %rsi  # load buffer address
    call    atoi    # our exit code will be the return from atoi

    movq    %rbp, %rsp  # delete local variables 
    popq    %rbp    # restore caller’s frame pointer 
    movl    %eax, %edi  # put exit status in %edi (will be ANDed with FF)
    movl    $EXIT, %eax # exit from this process 

    syscall

The base code looks like this; I just have to implement my own atoi. So far, what I have for the atoi function is:

atoi:
    pushq   %rbp    # save caller’s frame pointer 
    movq    %rsp, %rbp  # establish our frame pointer 
    subq    $16, %rsp   # for local variables

    movq    %rdi, -16(%rbp) #moving first argument to local variable
    movl    $0, -4(%rbp) #moving 0 to local variable
    movl    $10, -12(%rbp) #moving 10 to local variable

    movl    -16(%rbp), %rax
    movzbl  (%rax), %eax #getting value of rax
    movl    -4(%rbp), %eax

    imull   -12(%rbp), %eax
    movl    %eax,   -4(%rbp)

    movq    %rbp, %rsp  # delete local variables 
    popq    %rbp    # restore caller’s frame pointer 
    ret

I'm at a loss for where to go next. It seems anything I do just gives me segmentation faults.

fuz
  • [NASM Assembly convert input to integer?](https://stackoverflow.com/a/49548057) has a good implementation, no stack space needed. – Peter Cordes Oct 22 '20 at 11:08

1 Answer

You're over-using local variables (and under-using registers), you will need a loop that stops when it finds an invalid character, and you are probably using the wrong calling convention (the syscalls look like Linux, which implies the System V AMD64 ABI, which means parameters are passed in registers and not on the stack).

Note that this can all be done without any local variables at all. For example (NASM syntax because I don't do AT&T, untested):

;Convert string to integer
;
;Input
; rdi = first parameter (address of string)
;
;Output
; rax = result

atoi:
    xor rax,rax               ;rax = 0 (this will become the returned result)
.nextChar:
    movzx rcx,byte [rdi]      ;rcx = next character
    sub rcx,'0'               ;rcx = value of next digit
    jb .done                  ;Invalid character (too low to be a decimal digit)
    cmp rcx,9                 ;Was it too high to be a decimal digit?
    ja .done                  ; yes, invalid

    lea rax,[rax*4+rax]       ;rax = result*5
    lea rax,[rax*2+rcx]       ;rax = result*5*2 + digit = result*10 + digit
    inc rdi                   ;rdi = address of next character
    jmp .nextChar
.done:
    ret

Note: This code will not work for negative values (e.g. strings that begin with '-') and won't return an error condition if/when the result overflows. The result will also be 64-bit (while an int is probably supposed to be 32-bit). Mostly, it's a "convert to unsigned long long" (with error handling that's as bad as atoi()).
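
For readers more comfortable with C than with either assembler syntax, the loop above can be modelled roughly as follows. This is a sketch, not a drop-in solution: the name `str_to_u64` is invented here, and the code deliberately keeps the same limitations listed in the note (no sign handling, no overflow detection, 64-bit unsigned result).

```c
/* Hypothetical C model of the NASM loop above; str_to_u64 is an
   illustrative name, not something from the assignment. */
unsigned long long str_to_u64(const char *s)
{
    unsigned long long result = 0;

    for (;;) {
        /* Mirrors "movzx" followed by "sub rcx,'0'": a character below '0'
           wraps around to a huge unsigned value, so the single test
           below rejects characters on both sides of '0'..'9'. */
        unsigned long long digit = (unsigned long long)(unsigned char)*s - '0';
        if (digit > 9)
            break;                    /* first non-digit ends the number */

        result = result * 5;          /* lea rax,[rax*4+rax] */
        result = result * 2 + digit;  /* lea rax,[rax*2+rcx] */
        s++;                          /* inc rdi */
    }
    return result;
}
```

Calling `str_to_u64("1234\n")` stops at the newline and returns 1234. An empty string, or one starting with a non-digit such as `"-5"`, returns 0 because the loop exits before consuming anything.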

Brendan
  • OP's code does read its parameters from registers. Your answer addresses no mistake in OP's code at all. Downvote. – fuz Apr 08 '19 at 06:58
  • @fuz: You're right about the calling conventions (it was too hard to notice due to all the other stack accesses); but are you sure that OP isn't over-using local variables/stack and won't need a loop?? – Brendan Apr 08 '19 at 09:21
  • It doesn't matter that OP stores all sorts of things on the stack; that's what the stack is for. It might be inefficient to do it this way, but it is not wrong. OP asked about what is wrong with his code, not how to make it more efficient. That's why your answer fails to address the question. – fuz Apr 08 '19 at 10:27
  • @fuz: No, OP asked how to write the code, I said a loop that stops when an invalid character is found will be needed, and provided an example (that can't be cut&pasted without thinking about it) showing a loop that stops when an invalid character is found. The unnecessary stack use (that makes it hard to read/maintain and slower) is just a bonus (to ensure OP learns what they need to learn and doesn't just learn what they asked). – Brendan Apr 09 '19 at 02:44
  • I don't see a problem with this answer. We aren't robots, we're humans. @Brendan answered the question in a way he felt fit. Only qualm I'd have is hopefully the OP knows AT&T and Intel syntaxes exist, but I'm sure they'll figure it out. I would be concerned if this answer didn't answer the original question and *only* mentioned the stylistic/editorial stuff, but that's not the case here. I also appreciate the Note at the end, which further improves the quality of the answer. – the_endian Dec 25 '20 at 00:40