17

I was wondering whether exp() is faster than the more general pow(). I ran a quick benchmark on jsPerf (http://jsperf.com/pow-vs-exp) and it showed results I found interesting.

Math.exp(logBase * exponent);  // fastest
Math.exp(Math.log(base) * exponent);  // middle
Math.pow(base, exponent);  // slowest

I know that the results will vary heavily across architectures and languages, but I am also interested in the theoretical point of view. Is pow(a, b) implemented as exp(log(a) * b), or is there some cleverer way to compute the power "directly" (in C++, C#, or JavaScript)? Are there CPU instructions for exp, log, or pow on some architectures?

As far as I know, both exp() and log() are computed using some Taylor-like series and are fairly expensive to compute. This makes me believe that, for a constant base, this code

double logBase = log(123.456);  // log of the base, computed once
for (int i = 0; i < 1024; ++i) {
    exp(logBase * 654.321);     // only exp() left inside the loop
}

is better than this

for (int i = 0; i < 1024; ++i) {
    pow(123.456, 654.321);      // base and exponent passed fresh every time
}
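For reference, a minimal C harness to time both variants could look like the sketch below. This is my own illustration: the volatile variables keep the compiler from constant-folding the arguments or deleting the loops, and these particular constants overflow to +∞, which is irrelevant for the timing.

#include <math.h>
#include <stdio.h>
#include <time.h>

volatile double base_v = 123.456;  /* volatile: block constant folding */
volatile double expo_v = 654.321;
volatile double sink;              /* volatile: keep the loops alive */

int main(void) {
    const int N = 1000000;
    clock_t t0, t1;

    /* Variant 1: hoist log(base) out of the loop. */
    double logBase = log(base_v);
    t0 = clock();
    for (int i = 0; i < N; ++i)
        sink = exp(logBase * expo_v);
    t1 = clock();
    printf("exp(logBase * e): %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    /* Variant 2: full pow() on every iteration. */
    t0 = clock();
    for (int i = 0; i < N; ++i)
        sink = pow(base_v, expo_v);
    t1 = clock();
    printf("pow(b, e):        %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);

    return 0;
}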

Is that a correct assumption?

NightElfik
  • I wouldn't be surprised if one of those options was significantly more accurate than the others. –  Jul 26 '13 at 21:53
  • The margin of error is around 2-5%; try running the tests a few times. The benchmark is of course far from perfect, which is why I am interested in the theory behind it. The precision is an interesting question, too. – NightElfik Jul 26 '13 at 22:04
  • This is really going to depend on implementation details. Is your question about JavaScript specifically? – Rob Perkins Jul 26 '13 at 22:04
  • Not necessarily JavaScript, even though I came across it in JS. I would also be interested in something lower-level, like C/C++. – NightElfik Jul 26 '13 at 22:09
  • My answer addresses the general, compiler-independent case, but I still have to make implementation assumptions I can't verify. – Rob Perkins Jul 26 '13 at 22:32
  • Tangentially, I was writing some code to read float values from binary files in JavaScript and found the use of `Math.pow` to be the massive bottleneck. As these were powers of two, I simply precomputed a table, and the code then flew... – Will Jul 26 '13 at 22:53
  • If this is powers-of-two, you should be able to use the left-shift or right-shift operators to get really good results, if your language has them. Does JS have them? – Rob Perkins Jul 26 '13 at 22:55

3 Answers

18

Yes, exp will be faster than pow in general.

The exp and log functions are optimized for the target platform; many techniques can be used, such as Padé approximation, linear or binary argument reduction followed by approximation, etc.
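To give a flavor of what "reduction followed by approximation" looks like, here is a hedged sketch of the usual shape of an exp() implementation. This is my own illustration, not any particular libm's code: it uses a plain Taylor polynomial where a real library would use carefully derived minimax coefficients, and it skips overflow, underflow, and NaN handling.

#include <math.h>

/* Sketch of a libm-style exp(): reduce the argument, approximate on a
   small interval, then scale the result back up. Illustrative only. */
static double exp_sketch(double x) {
    static const double LN2 = 0.6931471805599453;

    /* Reduce: write x = k*ln2 + r with |r| <= ln2/2,
       so that exp(x) = 2^k * exp(r). */
    int k = (int)(x / LN2 + (x >= 0 ? 0.5 : -0.5));
    double r = x - k * LN2;

    /* Approximate exp(r) on the reduced interval with a short
       polynomial (here: degree-6 Taylor; real code uses minimax). */
    double p = 1.0 + r * (1.0 + r * (1.0 / 2 + r * (1.0 / 6
             + r * (1.0 / 24 + r * (1.0 / 120 + r / 720)))));

    return ldexp(p, k);  /* scale by 2^k */
}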

The pow function will generally be implemented as exp(log(a) * b), as you say, so it is obviously slower than exp alone. There are many special cases for pow, such as negative exponents, integral exponents, and exponents equal to 1/2 or 1/3. These slow pow down even further in the general case, because the tests are expensive.
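To make that concrete, here is a hedged sketch of the shape such a pow() takes. Again my own illustration, not any real libm's code, and it handles only a few of the special cases; note how much branching sits in front of the exp(log(a) * b) core.

#include <math.h>

/* Sketch of a general pow(): special-case dispatch, then exp/log. */
static double pow_sketch(double a, double b) {
    if (b == 0.0)             return 1.0;  /* x^0 == 1, even for NaN x */
    if (isnan(a) || isnan(b)) return NAN;

    /* Integral exponent: repeated squaring needs no log() and
       works for negative bases too. */
    if (b == floor(b) && fabs(b) < 1e9) {
        long long n = (long long)fabs(b);
        double base = a, result = 1.0;
        while (n) {
            if (n & 1) result *= base;
            base *= base;
            n >>= 1;
        }
        return b < 0 ? 1.0 / result : result;
    }

    if (a < 0.0) return NAN;  /* non-integer power of a negative base */

    return exp(log(a) * b);   /* the general-case core */
}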

See this SO question on pow.

Doug Currie
5

Regardless of the architecture details, Math.pow has to do more in terms of error checking than Math.exp (for example, what happens if the base is negative?), and as such I'd expect pow to be slower.
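The negative-base case is easy to demonstrate; per IEEE 754 (and the ECMAScript and C specs alike), a non-integer power of a negative base is NaN, and pow has to detect this up front. A quick C illustration:

#include <math.h>
#include <stdio.h>

int main(void) {
    /* Non-integer exponent with a negative base yields NaN ... */
    printf("pow(-8.0, 1.0/3.0) = %f\n", pow(-8.0, 1.0 / 3.0));  /* nan */
    /* ... while an integer exponent is fine. */
    printf("pow(-8.0, 3.0)     = %f\n", pow(-8.0, 3.0));        /* -512 */
    return 0;
}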

Relevant parts of the spec:

http://ecma-international.org/ecma-262/5.1/#sec-15.8.2.8

15.8.2.8 exp (x)

Returns an implementation-dependent approximation to the exponential function of x (e raised to the power of x, where e is the base of the natural logarithms).

  • If x is NaN, the result is NaN.
  • If x is +0, the result is 1.
  • If x is −0, the result is 1.
  • If x is +∞, the result is +∞.
  • If x is −∞, the result is +0.

http://ecma-international.org/ecma-262/5.1/#sec-15.8.2.13

15.8.2.13 pow (x, y)

Returns an implementation-dependent approximation to the result of raising x to the power y.

  • If y is NaN, the result is NaN.
  • If y is +0, the result is 1, even if x is NaN.
  • If y is −0, the result is 1, even if x is NaN.
  • If x is NaN and y is nonzero, the result is NaN.
  • If abs(x)>1 and y is +∞, the result is +∞.
  • If abs(x)>1 and y is −∞, the result is +0.
  • If abs(x)==1 and y is +∞, the result is NaN.
  • If abs(x)==1 and y is −∞, the result is NaN.
  • If abs(x)<1 and y is +∞, the result is +0.
  • If abs(x)<1 and y is −∞, the result is +∞.
  • If x is +∞ and y>0, the result is +∞.
  • If x is +∞ and y<0, the result is +0.
  • If x is −∞ and y>0 and y is an odd integer, the result is −∞.
  • If x is −∞ and y>0 and y is not an odd integer, the result is +∞.
  • If x is −∞ and y<0 and y is an odd integer, the result is −0.
  • If x is −∞ and y<0 and y is not an odd integer, the result is +0.
  • If x is +0 and y>0, the result is +0.
  • If x is +0 and y<0, the result is +∞.
  • If x is −0 and y>0 and y is an odd integer, the result is −0.
  • If x is −0 and y>0 and y is not an odd integer, the result is +0.
  • If x is −0 and y<0 and y is an odd integer, the result is −∞.
  • If x is −0 and y<0 and y is not an odd integer, the result is +∞.
  • If x<0 and x is finite and y is finite and y is not an integer, the result is NaN.

SheetJS
  • Thanks, this is interesting! That's probably why `Math.exp(Math.log(base) * exponent)` is faster than `Math.pow(base, exponent)`. It might not compute all the special cases correctly, but I don't care about them anyway. – NightElfik Jul 29 '13 at 21:39
  • @NightElfik Regardless of the answer to the question, unless you are planning on doing heavy numeric work in JS, I doubt that the running time of `exp` and `pow` dominates the execution time of your script. Remember the Knuth quote: "premature optimization is the root of all evil". – SheetJS Jul 29 '13 at 21:45
  • I know this very well. Even though this part of my code is heavily used, I was wondering more about the theory behind this phenomenon than actually trying to speed up my code. Before any performance optimization, I do profiling first. – NightElfik Jul 29 '13 at 21:48
1

As a partial answer: yes, there are instructions for exp, log, or pow on some architectures. However, that doesn't necessarily mean much.

For example, on x86 there are:

  • f2xm1, which calculates 2^x − 1
  • fscale, which calculates y * 2^((int)x)
  • fyl2x, which calculates y * log2(x)
  • fyl2xp1, which calculates y * log2(x + 1) (has restrictions on the input range)

However, they are not much used. It varies from architecture to architecture, but they're never fast. As a more extreme example, fyl2x has a latency of 724 cycles on Sandy Bridge (pretty recent!); in that time, on the same processor, you could do about 700 independent floating-point additions, about 240 dependent floating-point additions, or about 2000 independent simple integer operations.

That's about as bad as it gets, but they're typically slow. Slow enough that a manual implementation can beat them, or at least not lose significantly.
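For reference, the identity those instructions are built around, x^y = 2^(y * log2 x), is easy to write down in portable C99. This is only an illustration of the math (valid for x > 0), not how any particular libm implements pow:

#include <math.h>

/* The identity behind the x87 sequence (fyl2x, then f2xm1 + fscale):
   x^y = 2^(y * log2(x)), valid for x > 0. A real pow() wraps this
   in the special-case handling discussed in the other answers. */
static double pow_via_exp2(double x, double y) {
    return exp2(y * log2(x));
}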

Also, FPU code is slowly disappearing in favour of SSE code. There are no SSE equivalents of those instructions.

harold
  • SSE embodies an FPU, and it has already largely replaced *x87* code, which is why Intel doesn't put effort into making `fyl2x` fast. – Potatoswatter Jul 27 '13 at 00:40
  • @Potatoswatter you seem to suggest that calling x87 code "FPU code" isn't proper, however, that is commonly done. It's a valid distinction, SSE does not use the old FPU. Of course it's implemented with the same functional units on the CPU, but logically they're completely separate. – harold Jul 27 '13 at 00:48
  • It might or might not be. Either way, SSE is an instruction set for a (vector) FPU, so calling x87 code "FPU code" to distinguish it from SSE is misleading or just confusing. More to the point, x87 is irrelevant because it's obsolete. – Potatoswatter Jul 27 '13 at 15:27
  • @Potatoswatter Yes obviously FPU also has that meaning, however Intel uses FPU to refer to the x87 environment (stack, control word, etc) and so I will use that terminology as well. – harold Jul 27 '13 at 16:10