The pow() function is typically implemented in the math library, possibly using special instructions of the target processor; for x86, see "How to: pow(real, real) in x86". However, instructions such as fyl2x and f2xm1 aren't fast, so the whole computation could take on the order of 100 CPU cycles. For performance reasons, a compiler like gcc provides "built-in" functions that apply strength reduction to perform the computation faster in special cases. When the power N is an integer (as in your case) and small (as in your case), it is faster to do a few multiplications (at most N - 1, and fewer with repeated squaring) than to call the library function.
In order to detect cases where the power is an integer, the math library provides overloaded functions, for example double pow(double,int) (a C++98/03 overload; C++11 replaced it with argument promotion). You will find that gcc converts
double x = std::pow(y,4);
internally into 2 multiplications, which is much faster than a library call and gives the exact integer result you expect when both operands hold integer values (as long as that result is representable in a double):
double tmp = y * y;
double x = tmp * tmp;
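As a rough, self-contained illustration of the rewrite above, the sketch below puts the library call next to the two-multiplication form. The helper names pow4_library and pow4_by_squaring are made up for this example; they are not part of gcc or of <cmath>.

```cpp
#include <cmath>

// Hypothetical helper: the generic call that goes through std::pow.
double pow4_library(double y) {
    return std::pow(y, 4);
}

// Hypothetical helper: y^4 written out as two multiplications.
double pow4_by_squaring(double y) {
    double tmp = y * y;  // y^2
    return tmp * tmp;    // (y^2)^2 = y^4
}
```

Whether the compiler actually turns the first function into the second depends on the compiler version, the language standard selected, and the optimization flags, so it is worth checking the generated assembly rather than assuming it.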
In order to get this type of strength reduction you should
- include <cmath>
- compile with optimization -O2
- call the pow function in the library explicitly as std::pow(), to make sure that's the version you get, and not the one from math.h
You will then match the overloaded pow function in <cmath>, which looks like this:
inline double pow(double __x, int __i) { return __builtin_powi(__x, __i); }
Notice that this function is implemented with __builtin_powi, which knows how to reduce pow() to multiplications when the power is a small integer.
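To give an idea of what such a reduction can look like for a general integer power, here is a minimal sketch using repeated squaring. The function powi_sketch is a hypothetical name for illustration only; it is not gcc's actual implementation of __builtin_powi, and gcc may pick a different multiplication sequence.

```cpp
// Hypothetical sketch of integer-power strength reduction (not gcc's code).
// Computes x^n with O(log |n|) multiplications by repeated squaring.
double powi_sketch(double x, int n) {
    // Take the magnitude as unsigned so that n == INT_MIN does not overflow.
    unsigned int e = (n < 0) ? 0u - static_cast<unsigned int>(n)
                             : static_cast<unsigned int>(n);
    double result = 1.0;
    double base = x;
    while (e != 0) {
        if (e & 1u) {
            result *= base;  // this bit of the exponent is set
        }
        base *= base;        // x, x^2, x^4, x^8, ...
        e >>= 1;
    }
    // A negative power is the reciprocal of the positive one.
    return (n < 0) ? 1.0 / result : result;
}
```

For n = 4 the loop only ever multiplies result once, by x^4 obtained from two squarings, which matches the tmp * tmp form shown earlier.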