In some inner loop I have:
double x;
...
int i = x/h_;
double xx = x - i*h_;
Thinking there might be a better way to do this, I tried std::remquo:
double x;
...
int i;
double xx = std::remquo(x, h_, &i);
Suddenly, timings went from 2.6 seconds to 40 seconds (for many executions of the loop).
The timing test is difficult to replicate here, but I put the code online to see if someone can help me understand what is going on.
naive version: https://godbolt.org/z/PnsfR8
remquo version: https://godbolt.org/z/NSMwyW
It looks like the main difference is that remquo is not inlined, while the naive code is. If that is the case, what is the purpose of remquo if it is always going to be slower than the manual code? Is it a matter of accuracy (e.g. for large arguments), or of not relying on a (not well-defined) cast conversion?
I just realized that the remquo version is not even equivalent to the first code, so I am using it wrong. In any case, I am surprised that remquo is so slow.
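To make the non-equivalence concrete, here is a minimal sketch comparing the two approaches. The names Split, split_trunc and split_remquo are hypothetical helpers introduced only for this illustration, and h stands in for h_. The cast in the first version truncates toward zero, while std::remquo rounds the quotient to the nearest integer, so the remainder can be negative; in addition, the standard only guarantees the sign and at least the low three bits of the quotient that remquo stores.

```cpp
#include <cmath>

// Hypothetical helper type for this illustration: quotient and remainder.
struct Split {
    int i;      // quotient
    double xx;  // remainder
};

// The original approach: the double-to-int conversion truncates toward zero,
// so for x >= 0 and h > 0 the remainder lies in [0, h).
Split split_trunc(double x, double h) {
    int i = static_cast<int>(x / h);
    return {i, x - i * h};
}

// std::remquo rounds the quotient to the nearest integer, so the remainder
// lies in [-|h|/2, |h|/2]. Only the sign and at least the low three bits of
// the stored quotient are guaranteed by the standard.
Split split_remquo(double x, double h) {
    int q = 0;
    double r = std::remquo(x, h, &q);
    return {q, r};
}
```

For example, with x = 7.5 and h = 2.0, split_trunc yields i = 3 and xx = 1.5, whereas split_remquo yields a remainder of -0.5 (the nearest multiple of h is 8.0). For non-negative x, std::fmod(x, h) returns the same remainder as the truncating version.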