1

I have the following equation in my C code

k * dl * (1.0 + pHold / centre
       + (pHold * pHold) / (2.0 * centre * centre)
       - square / (2.0 * centre))

I know that floating point divisions are a lot more expensive than multiplications, and I have been wrestling with this for a while. Is there any way to rearrange this to cut out a division?

Thanks

LihO
  • 41,190
  • 11
  • 99
  • 167
ethangk
  • 109
  • 1
  • 9
  • This is probably language-agnostic. More to do with math, than programming itself. Retagging – Lelouch Lamperouge Oct 13 '13 at 22:04
  • 2
    so is this a real bottleneck? or you are just doing premature optimization... –  Oct 13 '13 at 22:05
  • At some point, optimizations like this make it more unreadable and don't help. Trust the compiler to help you out on optimizing this. For example, most compilers will optimize `2 * x` to `x << 1` because it is cheaper. – BrainSteel Oct 13 '13 at 22:05
  • 2
    I would trust the compiler's CSE on this. The `(pHold/centre)` looks like an obvious candidate, and so does its square. the `2.0*square` term seems too marginal to me. – wildplasser Oct 13 '13 at 22:09
  • @ethangk I highly doubt this is the source of the problem. – arshajii Oct 13 '13 at 22:09
  • 3
    @wildplasser: The compiler can't do common sub-expression elimination inside floating-point formulas, unless you disable strict IEEE compliance. Floating-point multiplication isn't commutative. – Ben Voigt Oct 13 '13 at 22:22
  • I see. There are side effects involved and rounding/loss of precision. I really hate floating point ... – wildplasser Oct 13 '13 at 22:29
  • 4
    I hate that people always jump on optimizations and assume the OP doesn't know what he's doing. Perhaps you guys are right, but what about the next person looking for a solution to the same problem in a case when it actually *is* a problem? This stuff is relevant and helpful. – Ed S. Oct 14 '13 at 00:06
  • 1
    @BenVoigt Associative is what it is not. Multiplication is commutative because it is defined as “the result of the mathematical product, rounded to a nearby representable value according to the rounding mode”, a commutative definition. – Pascal Cuoq Oct 14 '13 at 07:20
  • @PascalCuoq: I think it's neither. It's definitely not distributive, since `A * (B - C)` is not the same as `A*B - A*C`. And it isn't associative, since `(A*B)*C` is not the same as `A*(B*C)`. Ok. I was looking at `A*B*C` vs `C*A*B` and thinking that violated commutativity, but it really would be `(A*B)*C` vs `C*(A*B)`, which are the same. – Ben Voigt Oct 14 '13 at 17:57

7 Answers7

7

Just note that before you actually try to optimize some part, you should:

  • make sure that it is correct
  • make sure that there is no way how to optimize this at the higher level
    ~ Isn't my program invoking this calculation more times than it is actually needed?
    ~ Could I use the previous results? (What is dynamic programming?)
  • once you know where is the bottleneck, bench-marking should follow:
    ~ It seems to be slow... How "slow" is it? ... How "fast" should it become?

But in case you are sure that the equation itself should be optimized, you could use the fact that the multiplicative inverse of centre appears in your equation 4 times, reducing the count of divisions to 1:

double centreInv = 1.0 / centre;
double pHoldToCentre = pHold * centreInv;
double result = 
    k * dl * (1.0 + pHoldToCentre 
              + 0.5 * pHoldToCentre * pHoldToCentre 
              - 0.5 * square * centreInv);

Also note that these kind of changes might actually affect the result of this equation, so if you decide to change it, make sure it still produces desired output.

Community
  • 1
  • 1
LihO
  • 41,190
  • 11
  • 99
  • 167
  • The code is all functioning correctly, it's just that these calculations are being run 100000s of times. I have profiled my code, and the majority of the time is spent on the function running these calculations. I'm using (and trusting) the compilers optimization (the gcc -O3 flag) for CSE, but it doesn't seem to be making any real improvements on the equations as a whole – ethangk Oct 13 '13 at 22:34
  • @ethangk: How about questions: *Isn't my program invoking this calculation more times than it is actually needed? Could I use the previous results?* ...Are you familiar with dynamic programming? – LihO Oct 13 '13 at 22:36
  • That is a very good point, and I have examined that but in this particular program, each iteration feeds of the last after running a whole bunch of operations on the block. There's a change that I dismissed it prematurely, I need to look into it a little more. Thanks! – ethangk Oct 13 '13 at 22:49
  • @EricPostpischil: Yes. Because it does far more than that. – LihO Oct 14 '13 at 00:07
  • @EricPostpischil: Are you sure that I am the one who mistakes the purpose of this community? OP described a specific problem, I pointed out other points to consider + different, yet perfectly suitable, approaches that might help him much more than "tuning" of his current approach + possible solution. This is community where people exchange experiences and knowledge and help each other while facing specific problems. Regarding the appropriateness of this answer: The mistake is yours. – LihO Oct 14 '13 at 00:31
  • @EricPostpischil: And the fact that over last 3.5 years of your usage of this website you've downvoted 1038 posts, which is almost 2 times more than the amount of upvotes you gave, clearly indicates that you prefer criticizing and complaining over constructive notes and collaboration, which might help to improve the content within this website otherwise. – LihO Oct 14 '13 at 00:41
  • @EricPostpischil: Your comment is factually incorrect. LihO's latest version (which is the version you commented on) does show "any way to rearrange this to cut out a division". The question has three, the answer has two. It's not the minimal number of divisions, but it is fewer operations than the other answers -- the exact tradeoff between multiplications and divisions is system-dependent. – Ben Voigt Oct 14 '13 at 18:01
  • @Eric: It answers the question. Therefore it should be an answer. It provides additional guidance that the question didn't directly ask for. That is appropriate in either a comment or as part of an answer. There is no violation. You, on the other hand, are violating several guidelines by posting off-topic comments concerning issues that ought to be raised on meta. – Ben Voigt Oct 14 '13 at 18:14
  • @Eric: The quoted text is **from the question**. Stop trolling. LihO is a knowledgeable and valuable contributor. – Ben Voigt Oct 14 '13 at 18:16
  • Additionally, OP has admitted that *dynamic programming*, which I've pointed out in 2nd paragraph might be very helpful in this situation indeed... and that's the main reason why we as well as millions of other professionals are using this site ~> to share, to help and to ask for help. – LihO Oct 14 '13 at 18:19
  • @BenVoigt: Thank you. (*"LihO is a knowledgeable and valuable contributor"*) – LihO Oct 14 '13 at 18:23
  • @Eric: The answers that eliminate more divisions fail to point out that the new expression rounds differently. That's not a trivial part of this answer, it is correct, and it deserves upvotes. StackOverflow is not a code-writing service; correct answers are not merely the correct code that solves the problem. [Good answers have explanations that cover the assumptions and limitations](http://stackoverflow.com/help/how-to-answer). If you disagree, let's go to meta. But I promise your opinion is shared with a tiny minority. – Ben Voigt Oct 14 '13 at 18:23
  • @EricPostpischil: I wrote what I wrote, because this is not the first time I saw you downvoting a good answer supported with unreasonable criticism. Comments should be constructive. If I see a good answer, yet I mind some detail, I'll try to help the author to improve it, not burn him on crossroad for it. – LihO Oct 14 '13 at 18:40
  • @Eric: Ahh, so you have a technical disagreement with LihO. I actually agree with you that avoiding divisions is not an optimization of last resort. But I certainly do not condone shrouding your disagreement in a claim that the answer violates guidelines or should be a comment, which is clearly false. Well-written answers are *answers*, regardless of whether or not the advice is good. You can downvote or leave a comment regarding technical merits of the advice. But DO NOT TRY to ban answers under moderation guidelines. – Ben Voigt Oct 14 '13 at 18:43
  • @EricPostpischil: I've edited my answer to apart the "good-for-comment" knowledge in more "good-for-answer" way. – LihO Oct 14 '13 at 18:58
  • @EricPostpischil: I'm pretty sure that successful memoization would in fact decrease the number of division instructions executed. I think we can all agree that the latest edit is an improvement. Which occurred when you chose, finally, to focus on the technical merits and provide constructive criticism instead of ridiculous comments that "should be a comment". – Ben Voigt Oct 14 '13 at 19:06
  • @Eric: Sometimes things aren't stated in the question. The potential benefit of memoization is high enough that it's worth looking for opportunities to apply it, even though it isn't applicable to every problem. The further comments suggest that it isn't. Future readers finding this question might have an algorithm where it is. Including it in a checklist of things to try when division is taking too much time is appropriate. – Ben Voigt Oct 14 '13 at 19:15
  • Good enough, I am deleting some of my old comments to clean up and eliminate distraction. – Eric Postpischil Oct 14 '13 at 20:46
4

If you look at the denominators for the fractions you can see that making a common denomination would allow you to do the division just once (at the expense of more multiplications):

k * dl * (1.0
  + pHold                  / (centre)
  - square                 / (2.0 * centre)
  + (pHold * pHold)        / (2.0 * centre * centre)
)

If you are sure that a floating point multiplication is better than a floating point division then:

k * dl * (1.0
  + (pHold * 2.0 * centre) / (2.0 * centre * centre)
  - (square * centre)      / (2.0 * centre * centre)
  + (pHold * pHold)        / (2.0 * centre * centre)
)

Which becomes:

k * dl * (1.0
  + ( (pHold * 2.0 * centre)
    - (square * centre)
    + (pHold * pHold) )     / (2.0 * centre * centre)
)
Jerry Jeremiah
  • 9,045
  • 2
  • 23
  • 32
1

Algebraically, you can reduce it to a single division. Using:

  • k for k
  • d for dl
  • p for pHold
  • c for centre
  • s for square

your equation is:

           p     p.p     s
k.d ( 1 + --- + ----- - --- )
           c    2.c.c   2.c

which transforms to:

k.d ( 2.c.c + 2.c.p + p.p - c.s )
---------------------------------
             2.c.c

and hence to

k.d (2.c (c + p) - c.s + p.p)
-----------------------------
            2.c.c

Or, in terms of your original variables:

(k * dl * (2 * centre * (centre + pHold) - centre * square + pHold * pHold)) /
                    (2 * centre * centre)

Whether that is as good numerically as the original equation is a separate discussion. To discuss that, we'd need to know the typical ranges for each of the terms in the equation (and even then, my brain would hurt).

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
0

You cn cut out at least one:

k * dl * (1.0 + (pHold
       + (pHold * pHold) / (2.0 * centre)
       - square * 0.5) / centre)
caf
  • 233,326
  • 40
  • 323
  • 462
0

In the old days, you probably would have written

oocenter = 1/center; 

and used it in the expression

k * dl * (1.0 + pHold * oocentre
       + pHold * pHold * 0.5 * oocentre * oocentre
       - square * 0.5 * oocentre)

Nowadays, I believe the compilers are smart enough to do that for you. I would suggest working towards vectorization and parallelization.

damienfrancois
  • 52,978
  • 9
  • 96
  • 110
  • 2
    A good compiler will not do this with default options because it changes the results. Floating-point arithmetic does not obey the same laws as exact mathematical arithmetic (e.g., rounding produces different results with different operation orders), so the compiler must respect the specific operations written by the programmer. – Eric Postpischil Oct 14 '13 at 00:04
  • Even with the -O3 option ? – damienfrancois Oct 14 '13 at 05:50
  • 2
    @damienfrancois Yes, even with `-O3`, because some people want compilers to generate the best assembly code that implements exactly what they have written in the source code. The global GCC option to allow the compiler this kind of transformation is `-ffast-math` and it is not enabled by any of the `-O` options. – Pascal Cuoq Oct 14 '13 at 07:24
  • Ok, thanks for clarifying. For the reference, the equivalent of -ffast-math for icc is -fp-model fast – damienfrancois Oct 14 '13 at 13:25
0

Hi I have no Idea of Programming C :)

But given k, dl, pHold, centre and square are all variables, you can simplify this mathematical equation to:

  k*dl*(2.0* centre * centre + 2.0 * centre * pHold - centre *square + pHold * pHold)
  /  (2.0 * centre * centre)

substitute variables to single character variables and use http://www.wolframalpha.com

edit: Nikos C has basically the same answer but is factoring out 2c. You can test/chose which one performs better.

ksu
  • 608
  • 4
  • 11
0

You can reduce this to only one division overall:

k * dl * (2 * centre * (centre + pHold) + pHold * pHold - centre * square)
/ (2.0 * centre * centre)
Nikos C.
  • 50,738
  • 9
  • 71
  • 96