4

Does anyone know how the std::sqrt() function works? (or at least have an idea?)

I've seen methods on the internet that seemed really slow, using lots of approximations and iterations.

Everyone knows sqrt() function is slow, but I'd like to know how the one from std works so I could have a vague idea of when it is beneficial to avoid it. (yes, if I want to be sure I can profile, but it's still nice to have a vague idea)

EDIT: Didn't really formulate the question too well... What I'm interested in:

  • how would the fastest C++ function calculating a square root look like? (more or less, I just want to know the actual logic behind it)
xcrypt
  • 3,276
  • 5
  • 39
  • 63
  • 1
    The lazy answer is that there's probably a hardware (FPU) instruction to compute the square root of a floating point value. But of course software emulations exist. – Kerrek SB Dec 19 '11 at 02:00
  • @moshbear: That's for reciprocal sqrt, unfortunately... – Oliver Charlesworth Dec 19 '11 at 02:02
  • @OliCharlesworth ... which is just one FP-multiply away from actual square root – moshbear Dec 19 '11 at 02:08
  • @xcrypt http://en.wikipedia.org/wiki/Fast_inverse_square_root#The_.22magic_number.22, http://en.wikipedia.org/wiki/Fast_inverse_square_root#History_and_investigation – moshbear Dec 19 '11 at 02:11
  • The fastest C++ function for calculating a square root would look like this: `double sqrt(double x) { return std::sqrt(x); }` – Mankarse Dec 19 '11 at 02:11
  • @Mankarse Wow, that's solid! :p – xcrypt Dec 19 '11 at 02:14
  • 1
    Possible duplicate of [How is the square root function implemented?](https://stackoverflow.com/q/3581528/608639) – jww Nov 05 '18 at 10:05

4 Answers4

3

Nowadays, on modern machines, floating point functions are passed off to the hardware (floating point unit or math-coprocessor).

Steve Wellens
  • 20,506
  • 2
  • 28
  • 69
  • A bit of a misnomer to call it a "coprocessor" don't you think? I haven't seen it separated out of the main CPU in over 20 years. – Mark Ransom Dec 19 '11 at 02:17
  • That's why I CMA and included 'floating point unit'. I bet you've used a '286 at one time in your life! – Steve Wellens Dec 19 '11 at 11:07
3

Sometimes, it uses what the CPU offers:

$ cat main.cc
#include <cmath>
#include <ctime>
#include <cstdlib>
int main(){
    srand (clock());
    const double d = rand();
    return std::sqrt(d) > 2 ? 1 : 0;
}

(the blahblah is just so nothing relevant is optimized away, don't run that program!)

$ g++ -S main.cc
$ cat main.s
    .file   "main.cc"
    .text
    .p2align 4,,15
.globl main
    .type   main, @function
main:
.LFB106:
    .cfi_startproc
    subq    $8, %rsp
    .cfi_def_cfa_offset 16
    call    clock
    movl    %eax, %edi
    call    srand
    call    rand
    cvtsi2sd    %eax, %xmm1
    sqrtsd  %xmm1, %xmm0
    ucomisd %xmm0, %xmm0
    jp  .L5
.L2:
    xorl    %eax, %eax
    ucomisd .LC0(%rip), %xmm0
    seta    %al
    addq    $8, %rsp
    .cfi_remember_state
    .cfi_def_cfa_offset 8
    ret
.L5:
    .cfi_restore_state
    movapd  %xmm1, %xmm0
    call    sqrt
    jmp .L2
    .cfi_endproc
.LFE106:
    .size   main, .-main
    .section    .rodata.cst8,"aM",@progbits,8
    .align 8
.LC0:
    .long   0
    .long   1073741824
    .ident  "GCC: (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2"
    .section    .note.GNU-stack,"",@progbits

(hint: it is using a sqrt-cpu-instruction)

Sebastian Mach
  • 38,570
  • 8
  • 95
  • 130
  • 1
    It doesn't _just_ use the `sqrtsd` instruction by the way, it also check if the result is unordered against itself (eg, NaN) and then calls the actual library function if this is so. Not sure why that's needed, it may be to get extra info not provided by the instruction itself, possibly exception handling? – paxdiablo Dec 23 '11 at 14:00
  • @paxdiablo: Now that you say it, I wonder, too. I couldn't find enough information about the specifics of the sqrtsd instruction. – Sebastian Mach Dec 23 '11 at 14:19
1

sqrt(); function Behind the scenes.

It always checks for the mid-points in a graph. Example: sqrt(16)=4; sqrt(4)=2;

Now if you give any input inside 16 or 4 like sqrt(10)==?

It finds the mid point of 2 and 4 i.e = x ,then again it finds the mid point of x and 4 (It excludes lower bound in this input). It repeats this step again and again until it gets the perfect answer i.e sqrt(10)==3.16227766017 .It lies b/w 2 and 4.All this in-built function are created using calculus,differentiation and Integration.

lifeisbeautiful
  • 817
  • 1
  • 8
  • 18
0

The standard does not specify a particular implementation.

One option is to look at a typical implementation, but you'll probably find it's heavily-optimised assembler.

Oliver Charlesworth
  • 267,707
  • 33
  • 569
  • 680
  • Well, I knew that, but let's assume that it uses the fastest C++ function to get calculate it, how would it look like? – xcrypt Dec 19 '11 at 02:01
  • 1
    @xcrypt: Honestly, I don't know what algorithm libraries tend to use. I would take a look at http://en.wikipedia.org/wiki/Methods_of_computing_square_roots to get an idea of some of the techniques out there. And indeed, they all rely on iterations or approximations. – Oliver Charlesworth Dec 19 '11 at 02:04
  • 2
    @xcrypt: since all square root computation algorithms are based on numerical approximations, faster almost directly implies imprecise. I doubt you want to find the "fastest" implementation out there. – André Caron Dec 19 '11 at 02:06