Is the Fast Inverse Square Root Algorithm faster than C++'s standard library sqrt() function?

Question

Recently I stumbled upon the fast inverse square root algorithm, which goes like this:

float inverse_rsqrt( float number )
{
    const float threehalfs = 1.5F;
    float x2 = number * 0.5F;
    float y = number;
    long i = * ( long * ) &y;
    i = 0x5f3759df - ( i >> 1 );
    y = * ( float * ) &i;
    y = y * ( threehalfs - ( x2 * y * y ) );
  
    return y;
}

I thought it would be a great optimization for my ray tracer, since I have to normalize a lot of vectors and would only have to swap out this bit of code:

Vec3D normalize(Vec3D vector){
    return vector/(sqrt((vector[0]*vector[0])+(vector[1]*vector[1])+(vector[2]*vector[2])));
}

However, when I implemented it, it ended up taking the same amount of time. Here is the implementation I use:

Vec3D normalize(const Vec3D& rhs) {     // Normalize Vector

      float num = rhs[0]*rhs[0]+rhs[1]*rhs[1]+rhs[2]*rhs[2];
      const float threehalfs = 1.5F;
      float y = num;
      long i = * (long *) &y;
    
      i = 0x5f3759df - (i >> 1);
      y = * (float *) &i;
    
      y = y*(threehalfs - ((num*0.5F)*y*y));
    
      return rhs*y;
}

I was wondering whether the sqrt function from the C++ standard library's <cmath> header is already as fast as the fast inverse square root algorithm, or whether I'm missing a key detail.

Note: Vec3D is just a vector of size three with x, y, and z components, and I overloaded the * operator so that multiplying a vector by a double yields the scalar multiple of the vector.

`std::sqrt` is bound by accuracy requirements that your method isn't, so it's almost certainly slower. Also it does a different operation so... — Mooing Duck, Apr 26 '21 at 00:56
The fast inverse square root was invented precisely because it was faster than the library square root of the time. I think the use case was similar to yours too. It's possible that processor improvements in the intervening years have made sqrt much faster. — Mark Ransom, Apr 26 '21 at 01:13
Re much faster sqrt in hardware: See Agner Fog's instruction tables: https://www.agner.org/optimize/instruction_tables.pdf — njuffa, Apr 26 '21 at 01:42
"The" standard C++ math library? There's not just one, every platform has its own, and they might all have different implementations with different performance. — Nate Eldredge, Apr 26 '21 at 01:44
The other problem is that your implementation causes undefined behavior since it violates the [strict aliasing rule](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule). — Nate Eldredge, Apr 26 '21 at 01:45
Also, the Standard Library **itself** may violate the strict aliasing rule because it's part of the implementation. — MSalters, Apr 26 '21 at 08:45
See https://stackoverflow.com/questions/4866913/sse-normalization-slower-than-simple-approximation — MSalters, Apr 26 '21 at 08:48

score 1 · Answer 1 · answered Apr 26 '21 at 08:50

Chances are that your compiler doesn't even call std::sqrt. Yes, you wrote it, but normalization is a well-known operation that can be hardware-accelerated. x86 nowadays has a built-in approximate inverse square root instruction (rsqrtss, part of SSE), so that not only saves you the square root but also replaces a division with a multiplication.
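As a rough illustration of that hardware path, the sketch below calls the SSE reciprocal square root instruction directly through the `_mm_rsqrt_ss` intrinsic from `<xmmintrin.h>`. The function name `hw_rsqrt` and the fallback branch are assumptions made for this example, not something from the question. Whether a compiler emits this instruction by itself for `1.0f / std::sqrt(x)` depends on the compiler and its flags; it typically requires relaxed floating-point options such as `-ffast-math` on GCC or Clang, because the instruction is only approximate.

```cpp
#include <cmath>

#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAS_SSE_RSQRT 1
#endif

// Sketch only: approximate 1/sqrt(x) using the x86 SSE instruction when available.
inline float hw_rsqrt(float x) {
#ifdef HAS_SSE_RSQRT
    // rsqrtss gives an estimate with roughly 12 bits of precision.
    float y = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
    // Optional Newton-Raphson step to bring it close to full float precision.
    return y * (1.5f - 0.5f * x * y * y);
#else
    // Fallback for targets without SSE: exact reciprocal square root.
    return 1.0f / std::sqrt(x);
#endif
}
```

A normalize function could then multiply each component by `hw_rsqrt(x*x + y*y + z*z)` instead of dividing by `sqrt(...)`. Measuring both versions in an optimized build is the only reliable way to tell which one is faster on a given machine.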
