Why is the pow function slower than simple operations?

Question

From a friend of mine, I heard that the pow function is slower than its equivalent in simply multiplying the base by itself, the amount of times as its exponent. For example, according to him,

#include <stdio.h>
#include <math.h>

int main () {
    double e = 2.71828
    e2 = pow (e, 2.0)
    printf("%le", e2)
}

is slower than

#include <stdio.h>

int main() {
    double e = 2.71828
    e2 = e * e
    printf("%le", e2)
}

As a novice, I would think they both compile at the same speed, and by the same logic, I would prefer the former for its typical pithiness. So, why is the former block of code slower than the latter one?

Because multiplying a `double` is cheap, doing what `pow` does is not (calculate roots for decimal powers). — Richard J. Ross III, Jun 27 '13 at 19:18
Also, 'compiling at the same speed' != 'running at the same speed'. Compilation is a *very* separate process from actually running the code. — Richard J. Ross III, Jun 27 '13 at 19:20
@RichardJ.RossIII: I think that by "decimal" you actually mean "fractional"; for example, `pow(1.2, 3.4)`, which can't be done by a simple series of multiplications. — Keith Thompson, Jun 27 '13 at 19:31

score 5 · Accepted Answer · answered Jun 27 '13 at 19:19

5

pow(double,double) needs to handle raising to any power, not just an integer based power, or especially 2. As such, it's far more complicated than just doing a simple multiplication of two double values.

answered Jun 27 '13 at 19:19

Reed Copsey

554,122
78
1,158
1,373

Matteo Italia · Answer 2 · 2013-06-27T19:45:02.190

4

Because the pow function must implement a more generic algorithm that has to work on all the cases (in particular, it must be able to elevate to any rational exponent representable by a double), while e*e is just a simple multiplication that will boil down to one or two assembly instructions.

Still, if the compiler is smart enough, it may automatically replace your pow(e, 2.0) with e*e automatically anyway (well, actually in your case it will probably just perform the whole computation at compile time).

Just for fun, I ran some tests: compiling the following code

#include <math.h>

double pow2(double value)
{
    return pow(value, 2.);
}

double knownpow2()
{
    double e=2.71828;
    return pow(e, 2.);
}

double valuexvalue(double value)
{
    return value*value;
}

double knownvaluexvalue()
{
    double e=2.71828;
    return e*e;
}

with g++ -O3 -c pow.c (g++ 4.7.3) and disassembling the output with objdump -d -M intel pow.o I get:

0000000000000000 <_Z4pow2d>:
   0:   f2 0f 59 c0             mulsd  xmm0,xmm0
   4:   c3                      ret    
   5:   66 66 2e 0f 1f 84 00    data32 nop WORD PTR cs:[rax+rax*1+0x0]
   c:   00 00 00 00 

0000000000000010 <_Z9knownpow2v>:
  10:   f2 0f 10 05 00 00 00    movsd  xmm0,QWORD PTR [rip+0x0]        # 18 <_Z9knownpow2v+0x8>
  17:   00 
  18:   c3                      ret    
  19:   0f 1f 80 00 00 00 00    nop    DWORD PTR [rax+0x0]

0000000000000020 <_Z11valuexvalued>:
  20:   f2 0f 59 c0             mulsd  xmm0,xmm0
  24:   c3                      ret    
  25:   66 66 2e 0f 1f 84 00    data32 nop WORD PTR cs:[rax+rax*1+0x0]
  2c:   00 00 00 00 

0000000000000030 <_Z16knownvaluexvaluev>:
  30:   f2 0f 10 05 00 00 00    movsd  xmm0,QWORD PTR [rip+0x0]        # 38 <_Z16knownvaluexvaluev+0x8>
  37:   00 
  38:   c3                      ret

So, where the compiler already knew all the values involved it just performed the computation at compile-time; and for both pow2 and valuexvalue it emitted a single mulsd xmm0,xmm0 (i.e. in both cases it boils down to the multiplication of the value with itself in a single assembly instruction).

edited Jun 27 '13 at 19:45

answered Jun 27 '13 at 19:19

Matteo Italia

123,740
17
206
299

Actually, I'm not entirely certain that the compiler could inline something like that, especially if you are dynamically linking to that function. It could have results in certain edge-cases which are different than a simple multiply. – Richard J. Ross III Jun 27 '13 at 19:20
@RichardJ.RossIII: s/will/may/ :) – Matteo Italia Jun 27 '13 at 19:21
1

@RichardJ.RossIII pow() is part of the standard library (C99 7.12.7). The compiler is allowed to assume that you are using **its** pow() function. Using another function with the name `pow` is undefined behavior. – Pascal Cuoq Jun 27 '13 at 19:22
@PascalCuoq no, you can most definitely have & compile 'standard' C without linking to the standard library - quite common on embedded systems, in fact. – Richard J. Ross III Jun 27 '13 at 19:23
1

@RichardJ.RossIII The compiler is free to give you options to tell it that you are not linking with its standard library. It does not have to give you these options to be standard-compliant, and they do not have to be on by default. Ever looked at the assembly generated for calls to strlen(), memcpy(), … on known arguments? GCC devs even thought they could substitute calls to pow() by calls to sqrt(), but they got it slightly wrong: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=43419 – Pascal Cuoq Jun 27 '13 at 19:26
@RichardJ.RossIII: also, I suppose that, if you include ``, the compiler is definitely allowed to suppose that the `pow` you are referring to is the "standard" `pow`, so it can activate all kinds of optimizations and compiler intrinsics for it. – Matteo Italia Jun 27 '13 at 19:31
1

"in particular, it must be able to elevate to any *rational* power" -- Not really. Not all rational numbers are representable. `pow(2.0, 1.0/3.0)` doesn't *quite compute the cube root of 2; rather, it computes a close approximation of 1/3, then computes a close approximation of 2 raised to that power. – Keith Thompson Jun 27 '13 at 19:34
@KeithThompson: %s/rational power/rational power representable by the `double` data type with the correct approximations according to the implementation-specific floating point data type, which usually is IEEE 754 also known as ISO/IEC/IEEE 60559:2011 and blah blah blah but maybe we all knew that a `sizeof(double) – Matteo Italia Jun 27 '13 at 19:38
%s/rational/representable/g would do. 8-)} – Keith Thompson Jun 27 '13 at 19:43
@KeithThompson: fair enough :) – Matteo Italia Jun 27 '13 at 19:45
@RichardJ.RossIII: If `x` is a number, x**2 mathematically equals x•x, so `x*x` may be substituted for `pow(x, 2)` and have identical effects under IEEE 754. If x is a NaN, the C standard is not specific about the results, but substituting `x*x` will produce the results one would normally expect for `pow(x, 2)`, in regard to the NaN result and to exceptions. – Eric Postpischil Jun 27 '13 at 19:50
I think 2 is probably the only exponent that the compiler could optimize this way. Starting with 3, `pow(x,3)` could be more accurate than `x*x*x` since the latter rounds twice. – R.. GitHub STOP HELPING ICE Jun 27 '13 at 19:54
@R..: yes, and that's confirmed by trying my code above with exponent 3: the `pow`-version of the function actually calls pow (although the functions where both arguments are known are still computed at compile-time). – Matteo Italia Jun 27 '13 at 20:01

user2246674 · Answer 3 · 2013-06-27T19:26:49.833

0

Here is one (simple, heed the comment) pow implementation. In being generic it involves a number of branches a potential division and calls to exp, log, modf ..

On the other hand, on the multiplication is a single instruction (give or take) on most higher CPUs.

edited Jun 27 '13 at 19:26

answered Jun 27 '13 at 19:20

user2246674

7,621
25
28

Why is the pow function slower than simple operations?

3 Answers3

Linked