How to compute the integer square root of a number in x86-64, without using div?

Question

I am very new to x86 and I'm trying to write a program that computes the integer square root of a number, building it bit by bit from the most significant to least significant. The only scratch registers I have are %rax, %rcx, %rdx, %rdi, %rsi, %r8, %r9, %r10 and %r11. I am not allowed to use any others. I am not quite sure how to copy the value from %edi into an r register, and then once all calculations are complete, recopy that value in the r register back into %eax to be returned.

Variables: %edi contains the argument x (unsigned 32-bit). %eax will carry the return value

This is all the code I have, I am sure it is full of mistakes, I am very new to this.

    .globl sqrt
sqrt:

    movl $0, %eax         #initializing return to 0
    movslq %edi, %rdi     #moving edi into rdi, not sure if this works 
    movq $0, %rax         #initializing scratch return to 0
    movq $15, %rcx        #initializing loop counter to 15(start at 15th bit)
    movq $0x80000, %rbx   #creating bit mask (1000 0000 0000 0000)

loop:

    xorq %rax, %rcx     #set specific bit to 1 in rcx
    pushq %rax          #temporarily store rax value in stack
    mulq %rax           #rax=rax*rac
    cmpq %rax, %rdi     #rax<=rdi
    popq %rax           #restore original rax value
    jbe keep_bit         #keep bit if rax<=rdi
    xorq %rax, %rcx      #unset bit in rax if rdi>rax

keep_bit:

    shr $1, %rcx        #shift to the next bit
    jnz loop            #continue to loop until all bits are tried

I know I need a line in here to load %rax value back into %eax to be returned but I'm not sure how to do that

Your code looks fine (but test it anyway). A few notes: to move a 32-bit unsigned immediate in a 64-bit register (e.g. `mov $15, %rcx`) you can move the immediate in the 32-bit register (`mov $15, %ecx`) since this will clear the upper 32 bits. To zero a register use `xorl` (same principle with the `mov`s above) and finally, I believe you don't need to handle imaginary results so it's safe to assume the input is unsigned and thus `movzlq %edi, %rdi` is more appropriate (or the equivalent `mov %edi, %edi`) — Margaret Bloom, Sep 28 '17 at 09:42
You clobber the caller's `%rbx` without saving/restoring it. But you never use the `0x80000` constant you put there. Also, you could use `imul %rcx, %rax` since you don't want the high-half result in `%rdx`. (your `mul` is squaring `rax`; I think you want `mul %rcx`. Single-step your code in `gdb`; see the bottom of https://stackoverflow.com/tags/x86/info) You could also use a different register instead of push/pop inside the loop! — Peter Cordes, Sep 28 '17 at 15:23
Also, if you actually wanted this to be fast, you'd convert to `double` and use `sqrtsd`, then convert back to integer. That would be much faster than a loop; only 3 instructions: https://godbolt.org/g/taEYGi — Peter Cordes, Sep 28 '17 at 15:25
Anyway, your actual question about `eax` vs. `rax` and `edi` vs. `rdi` is a duplicate of some existing register-subset questions. — Peter Cordes, Sep 28 '17 at 15:30
Oh, it was supposed to be `rax*rax`, the typo was in the comment. This implementation is confusing (as well as slow) because it uses `%rax` for 2 different things inside the loop using push/pop. (I hadn't seen the algo before, and wasn't sure if this one had more bugs or not). See https://stackoverflow.com/questions/46481471/whats-the-name-of-this-finding-square-root-algorithm for a follow-up asking about the algo's name, which presents a clean pseudo-code version. BTW, you could use `bts` / `btr` instead of shifting a mask. — Peter Cordes, Sep 29 '17 at 04:08

How to compute the integer square root of a number in x86-64, without using div?

0 Answers0

Linked