I have a numerical code that solves an equation f(x) = 0, in which I have to raise x to a power p. I solve it using a bunch of things, but in the end I have Newton's method. The solution happens to be exactly x = 1, and that is the cause of my problems. When the iterated solution gets close to 1, say x = 1 + 1e-13, the time needed to compute std::pow(x, p) grows tremendously, easily by a factor of 100, making my code unusable.

The machine running this is an AMD64 (Opteron 6172) on CentOS, and the call is simply y = std::pow(x, p);. The same behavior shows up on all my machines, all x64. As documented here, this is not only my problem (i.e., someone else is pissed too); it appears only on x64 and only for x close to 1.0. The same thing happens with exp.

Solving this problem is vital to me. Does anyone know a way around this slowness?
EDIT: John pointed out that this is due to denormals. The question then becomes: how do I fix this? The code is C++, compiled with g++ for use within GNU Octave. It appears that, although I have set CXXFLAGS to include -mtune=native and -ffast-math, that is not helping and the code runs just as slowly.
PSEUDO-SOLUTION FOR NOW: To all who care about this problem: the solutions suggested below didn't work for me personally. I really need the usual speed of std::pow(), but without the sluggishness around x = 1. My solution is the following hack:
```cpp
#include <cmath>

// Third-order Taylor expansion of x^p around x = 1 when x is close
// enough to 1; plain std::pow otherwise.
inline double mpow(double x, double p) __attribute__ ((const));
inline double mpow(double x, double p)
{
    double y(x - 1.0);
    return (std::abs(y) > 1e-4)
        ? std::pow(x, p)
        : 1.0 + p * y * (1.0 + (p - 1.0) * y * (0.5 + (1.0 / 6.0) * (p - 2.0) * y));
}
```
The bound could be changed, but for -40 < p < 40 the error is smaller than about 1e-11, which is good enough. From what I have measured, the overhead is minimal, so this solves the issue for me.