10

I've been warned by numerous programmers not to use the square root function, and instead to raise numbers to the half power. My question is twofold:

  1. What is the perceived/real performance benefit to doing this? Why is it faster?

  2. If it really is faster, why does the square root function even exist?

Athena
  • 3,200
  • 3
  • 27
  • 35
  • 4
    Raising to 1/2 is just as slow, they're both just `exp(ln(x)/2)`. – Blindy Mar 11 '14 at 17:57
  • 11
    What has your testing shown? – Adam Zuckerman Mar 11 '14 at 17:58
  • 6
    This sounds to me like "don't add subtract 1. instead, add a negative 1" – hometoast Mar 11 '14 at 18:04
  • 15
    Your first rule should always be to write code that is clear, maintainable, and makes sense. If you don't have performance problems, don't worry about performance. If you do have performance problems, then step one should always be to **profile** your application. When you've done that and you've identified the hotspots, then you can start to worry about what you can do to fix them. If you were already at that point then you would have had no problem swapping `sqrt` for `pow` and discovering first hand how much worse your situation became. – J... Mar 11 '14 at 20:02
  • J.'s comments have been so good. I wish credit could be given. – duffymo Mar 12 '14 at 00:42
  • Only partly related, but an interesting read is about the constant [`0x5f3759df`](https://stackoverflow.com/q/1349542/1364007). – Wai Ha Lee Dec 27 '17 at 17:00

2 Answers2

19

I've performed a simple test:

  Stopwatch sw = new Stopwatch();

  sw.Start();

  Double s = 0.0;

  // compute 1e8 times either Sqrt(x) or its emulation as Pow(x, 0.5)
  for (Double d = 0; d < 1e8; d += 1)
    // s += Math.Sqrt(d);  // <- uncomment it to test Sqrt
    s += Math.Pow(d, 0.5); // <- uncomment it to test Pow

  sw.Stop();

  Console.Out.Write(sw.ElapsedMilliseconds);

The (averaged) outcome at my workstation (x64) is

  Sqrt:  950 ms 
  Pow:  5500 ms

As you can see, more specific Sqrt(x) 5.5 times faster than its emulation Pow(x, 0.5). So it's just one more legend (at least in C#) that Sqrt is that slow one should prefer Pow substitution

Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215
  • 7
    When testing (especially) mathematical functions, it's also usually worthwhile to either state the target platform or to test both `x86` and `x64`. Because the hardware implementations are often very different there can sometimes be surprising differences (although not much, in this case). From the times, I would guess this example was `x64`. For `x86`, `Pow` is just a bit faster (but still *much* slower than `sqrt`). – J... Mar 11 '14 at 19:01
  • 4
    Note: 5.5 seconds for 100 million of the SLOWER operation. Less than a second for 100 million of the FAST operation. Your code will be the bottleneck, not these function calls. – duffymo Mar 11 '14 at 19:13
  • 5
    @duffymo That's a bit of a truism. Bottlenecks are always in "code" somewhere. If you happen to have replaced `sqrt` with `Pow` *in* a performance bottleneck, however, then you stand to gain by not doing it. – J... Mar 11 '14 at 19:58
  • 2
    Of course the bottlenecks are in somebody's code. My point is that the significant ones are more likely to be in the OP's code, not these functions. A profiler would be unlikely to say that the square root function was killing performance. – duffymo Mar 11 '14 at 20:03
  • 1
    @duffymo And what if you need to calculate millions of square roots a second? In my case Sqrt is taking up > 10% of the cpu time, and I haven't even started optimizing some other parts of my code yet so it will only increase... – Herman Apr 04 '17 at 11:39
  • 1
    Don't start a discussion in comments. Have a question? Post it. Square root isn't your problem; millions of square roots is the issue. Parallelizing the problem or changing it another way is the way to go. – duffymo Apr 04 '17 at 11:45
  • So sometime one can avoid to calculate the square root by just omitting it, e.g. for distance calculations which should fall below a threshold distance. – stephanmg Apr 16 '19 at 17:34
10

You would have to know something about how each function is implemented to answer the question.

The square root function uses Newton's method to iteratively calculate the square root. It converges quadratically. Nothing will speed that up.

The other functions, exp() and ln(x), have implementations that have their own convergence/complexity issues. For example, it's possible to implement both as series sums. A certain number of terms are required to maintain sufficient accuracy.

All bets are off if those functions happen to be implemented in native code. Those might be faster than anything you'll write.

Knowing those would let you make an informed decision. I would not take it on faith because those programmers "know" the answer.

Unless you're doing intensive numerical work, I'd say that the choice won't affect your overall program performance. It's micro-optimization that's best avoided, unless you're doing serious large-scale scientific programming.

duffymo
  • 305,152
  • 44
  • 369
  • 561
  • 2
    And if you are doing serious scientific programming, then the numerical stability of your algorithms and avoiding [catastrophic cancellation](http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html) may well be just as (or even more) important than the speed. – Dan Bryant Mar 11 '14 at 18:19
  • 3
    In this case `Sqrt` *is* implemented in hardware, both for x87 FPU (`fsqrt` - 32-bit) and SSE/SSE2(`sqrtss`, `sqrtps`;`sqrtsd`, `sqrtpd` - 64-bit). These are way faster than the software implementations for `Pow`, etc. Even on x87 hardware, which has implementations for `f2xm1`, `fyl2x`, etc, a single call to `f2xm1` (which will just get you started implementing `Pow`) is nearly as expensive as a full `fsqrt`, while `fyl2x` is over three times longer than `fsqrt`. There's no way that `Pow` can be faster barring very specific edge cases and clever algorithms. – J... Mar 11 '14 at 18:47