4

I am implementing a Runge–Kutta procedure, which includes several time-critical multiplications with fixed, complicated fractions (which are not magic numbers but inherent to the algorithm) and I want this multiplications to be performed as efficient as possible whilst keeping the code readable.

For simplicity’s sake, let’s assume my code would look like the following, if I did not need to care about efficiency:

for (int i = 0; i < n; i++)
    a[i] += f(i) + b[i] * (2197/4104.);

Can I assume that every reasonable compiler (with optimisation) will effectively replace 2197/4104 by 0.535331…? If not, what are good ways to ensure this? Would defining a const double suffice, for example?

(Note that I am not interested in other possibilities for optimisation of the above code – it’s really just an example.)

Wrzlprmft
  • 4,234
  • 1
  • 28
  • 54
  • Yes. And if it doesn't for whatever reasons, just make a `const float` (or `const double`) that stores the value and use that instead. – OMGtechy Mar 27 '14 at 13:14
  • 1
    Note: If you're already concerning such small optimizations, you have already profiled your program and found this to be the bottle neck, did you? Otherwise, don't bother. Compilers are usually smart enough. – Zeta Mar 27 '14 at 13:20
  • @Zeta: It may not be the bottle neck but sufficiently relevant in some cases. Anyway, not only does my compiler have to be smart enough, but (almost) everybody’s. – Wrzlprmft Mar 27 '14 at 13:27

2 Answers2

4

Using any recent compiler, the evaluation will be done at compile time.

However, if you cannot guarantee that the compiler will, simply lift the calculation out of the loop (making a const long double if possible):

long double fraction = (2197/4104.);
for (int i = 0; i < n; i++)
  a[i] += f(i) + b[i] * fraction;

If accuracy of summation is important and the size of f(i) or b[i] is potantially large (I assume it could be), you're better off not using += to sum the values, instead look at the Kahan summation algorithm to sum, with minimal loss of precision. Alternatively, try to work with integral types while summing and then performing the division as a final step.

Rich O'Kelly
  • 41,274
  • 9
  • 83
  • 114
  • 1
    Even though this used to be the norm, I do expect that modern compilers are able to perform this optimization automagically. – Mihai Todor Mar 27 '14 at 13:16
  • 1
    This _may_ actually degrade the performance. Not to say that it may produce different results. – ach Mar 27 '14 at 13:18
  • 1
    Yes, I agree. However, the part in the question specifically asking `what is a good way to ensure this`. However, I take your point and have updated my answer. – Rich O'Kelly Mar 27 '14 at 13:19
  • 1
    It reminds me of a recent question about why OpenGL source files repeatedly use a literal constant for 1/255. Can't find it though. – ach Mar 27 '14 at 13:27
  • 1
    @AndreyChernyakhovskiy that was a [great question](http://stackoverflow.com/questions/22621241/what-does-the-constant-0-0039215689-represent/22621361#22621361) easier to find if you remembered that [Mysticial](http://stackoverflow.com/users/922184/mysticial) provided the most excellent answer to that question. – Shafik Yaghmour Mar 27 '14 at 13:35
  • @AndreyChernyakhovskiy: In what cases may this degrade the performance? If the compiler does not optimise it? – Wrzlprmft Mar 27 '14 at 14:15
  • @rich.okelly: Why do you use `long double` instead of `const long double` in your example code? – Wrzlprmft Mar 27 '14 at 14:16
  • @Wrzlprmft just in case those numbers aren't actually constant in the OPs code (I believe they've posted an equivalent snippet), I have however included in my answer that it should be a const if possible. – Rich O'Kelly Mar 27 '14 at 15:18
  • @Wrzlprmft, well, I think it was a far-fetched supposition on my part. In small code snippet like this it is unlikely that performance should degrade. Anyway, only profiling may give the ultimate answer. – ach Mar 27 '14 at 15:31
0

In order to skip the floating point arithmetics you always can multiply your numbers by 100000 (I just looked at the Runge-Kutta on wiki, that's from where the number is) and do the entire algorithm on integers and when you actually need the result divide again with 100000. You use 2197/4104 as 53533, and you calculate for the others (28561/56430 = 0.50435 -> 50435, etc ...) too.

Ferenc Deak
  • 34,348
  • 17
  • 99
  • 167
  • Look up "Fixed Point Math". A better denominator is one that is a power of 2 such as 65536. This will allow right shifting which is a lot faster than division. – Thomas Matthews Mar 27 '14 at 15:56