
Background

I'm an aerospace engineering and EECS student. I'm at the point where I work with a lot of math and physics, but haven't gotten into algorithms or assembly language yet.

I design and code a comically wide variety of programs, from business proposal software to satellite hardware controllers.

Most of this work involves doing math in some other medium, then writing code to implement it.

I algebraically simplify these equations before putting them into code. But before I expend the time to do so, I would like to know whether I should favor more addition operations, or more multiplication operations. (I already know division is much more costly.)


Example

$$B'_x = s\,(b_y x - b_z y) + (c-1)\,b_x x^2 + (c-1)(b_y y + b_z z)\,x - c\,b_x + b_x$$

This is an equation I've derived from some other work, and it's pretty typical of what I see.

We can plainly see that there are at least a handful of ways to simplify this equation. Since the simplification is at my discretion, I'd like to pick the option that favors performance as much as practical. I'm not going for bleeding-edge performance at the cost of algorithm design time.
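To make it concrete, here is a rough sketch of two algebraically equivalent ways this expression could be transcribed into Java (the class name and the names `bx`, `by`, `bz`, `x`, `y`, `z`, `c`, and `s` are just placeholders for whatever the real code uses):

```java
// Placeholder class; both methods compute the same B'x value.
final class BFieldRotation {

    // Direct transcription of the expression above:
    // 11 multiplications and 8 additions/subtractions as written.
    static double bPrimeXDirect(double bx, double by, double bz,
                                double x, double y, double z,
                                double c, double s) {
        return s * (by * x - bz * y)
                + (c - 1) * bx * x * x
                + (c - 1) * (by * y + bz * z) * x
                - c * bx + bx;
    }

    // One possible factoring of the same expression:
    // 8 multiplications and 6 additions/subtractions.
    static double bPrimeXFactored(double bx, double by, double bz,
                                  double x, double y, double z,
                                  double c, double s) {
        return s * (by * x - bz * y)
                + (c - 1) * ((bx * x + by * y + bz * z) * x - bx);
    }
}
```

The two forms are algebraically equal, but they won't necessarily produce bit-identical results, since the roundings happen in a different order.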


The Question

In general, which double operation is faster: addition or multiplication?

I know the only definitive way to find out which is faster is to write and run benchmarks, but that isn't the point here. This is not a high enough priority in what I do to justify writing test code every time I need to simplify an equation. What I need is a rule of thumb to apply to my algebra.

If the difference is so marginal as to border on negligible or inconclusive, that's an acceptable answer, so long as I know it makes nearly no difference.


Backing Research

I know that, in C and C++, the optimizer takes care of the algebra, so it's a non-issue. However, as I understand it, the Java compiler does not do algebraic simplification/optimization. Specifically, this answer indicates this to be the case, and that the programmer should do this sort of optimization.
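To illustrate what I mean, here is a tiny sketch of my understanding of what javac does and does not do (the class and variable names are made up for the example):

```java
// My understanding: javac folds literal constant expressions,
// but it leaves arithmetic on variables exactly as written.
public class JavacFoldingExample {
    public static void main(String[] args) {
        double a = 2.0, b = 3.0, x = 4.0;

        double constant = 2 * 3.0;       // constant expression: folded to 6.0 at compile time
        double expanded = a * x + b * x; // compiled as written: two multiplies, one add
        double factored = (a + b) * x;   // the algebraic factoring is left to the programmer (or the JIT)

        System.out.println(constant + " " + expanded + " " + factored);
    }
}
```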

There are scattered answers to this across the internet, but I can't come up with a conclusive one. A former University of Maryland physics student ran these tests on Java, but the double performance data is absent from the tables, and the graph scales make the results indiscernible. This University of Quebec CS professor's tests only reveal results for integer operations. This SO answer explains that, on a hardware level, multiplication is a more complicated operation, but I'm also aware that engineers design processors with these sorts of things in mind.

Other marginally helpful links:

drmuelr
  • The _compiler_ doesn't; the runtime does the optimization. (And frankly, if it's not high priority enough to write benchmarks for, it's not high priority enough to _care either way._) That said: addition's generally going to be faster. – Louis Wasserman Aug 15 '16 at 21:49
  • @LouisWasserman "not high priority enough to care either way." I agree with this, but if I need to work equations _anyways_, I would like to do so in a way that might yield better results. If I have two commuting routes that take an equal amount of time and fuel, but one is less likely to have an accident, I would prefer that route, even if there is no difference a majority of the time. – drmuelr Aug 15 '16 at 21:55
  • If you care about the difference, and no one has done the comparison already, then *you* run some benchmarks to do the comparison. If it is not important enough to spend the few hours to do that, then it's not important at all. Or did you expect us to do the work for you? – Andreas Aug 15 '16 at 22:10
  • You should be more concerned with the order of magnitude of each of the terms than with whether you can minimize the number of multiplies, and group the operations in ways that minimize the possibilities of overflow/underflow. That said, there is really only one refactoring in your example that might be useful: there are 2 terms that multiply by (c-1)x. – FredK Aug 15 '16 at 22:11
  • My rule of thumb is to make the code as readable as possible, because it's far more important to be correct than it is to be fast. Rarely have I come across a situation in any language where I have to contort an algebraic expression in order to meet performance goals. My advice would be to write the expression in whatever form you're most comfortable with and that you will be able to easily read a year from now. If it ever comes to a point where you need to worry about performance, then you should profile and experiment. But what you're doing right now amounts to premature optimization. – Jim Mischel Aug 15 '16 at 22:13
  • @Andreas My knowledge of optimization and algorithms is very rudimentary. There are many people on this site who are much more knowledgeable about the topic and can provide much more insight than a hacked-together test by yours truly can yield. Furthermore, the purpose of this site is to ask good questions so others might find them later and benefit. – drmuelr Aug 15 '16 at 22:18
  • @Jim Right you are. Early in my career I spent a whole week optimizing a large set of complex equations (on paper), then re-wrote the program based on the new way. The new way was 200 times faster than the old way, and got the same answer. Problem was, the old way took 0.8 seconds. So my 40 hours of work saved less than one second. – FredK Aug 15 '16 at 22:20
  • @LouisWasserman If I understand you right... is it correct to say that the JIT compiler will perform simplifications/optimizations like this? – drmuelr Aug 15 '16 at 22:23
  • Eh, probably? It's not clear to me that those transformations can be done in exactly semantics-preserving ways, since you get different rounding errors with different groupings, but yeah, what's possible there is its job to optimize. – Louis Wasserman Aug 15 '16 at 22:25
  • The SO answers and other resources you cite do not support your conclusion. One of them is about integer multiplication. In floating-point it is addition and subtraction that are slower. Another of them specifically contradicts your assertion that the programmer should do the optimization. The question about decimal vs floating-point is irrelevant. The others are inconclusive. The IBM article cited is 15 years old and predates HotSpot. – user207421 Aug 15 '16 at 23:37
  • @EJP Of course they don't support my conclusion: I don't have a conclusion. That's why I asked a question and provided links to resources that I tried before asking the question, as the number 2 subject in the [asking help center](http://stackoverflow.com/help/how-to-ask) recommends. – drmuelr Aug 15 '16 at 23:46
  • But they don't even say what you say they say. You've just misquoted them in most cases, or cited irrelevancies. To the extent that they are relevant and agree, they already answer your question. – user207421 Aug 15 '16 at 23:52
  • @EJP If I misquoted any of them, feel free to point them out/edit my question. Every single link is directly related to my question, but one-off in nearly every case. As you said yourself, the ones which don't discuss floating point are "inconclusive." Subsequently saying they agree is self-contradictory. – drmuelr Aug 16 '16 at 00:03
  • @EJP Correction to myself: the links aside from Decimal vs floating-point contradict each other. (Hence my confusion.) – drmuelr Aug 16 '16 at 00:10
  • Might as well throw in that the mix of instructions matters a lot, at least it does on modern Intel processors. See http://stackoverflow.com/questions/1146455/whats-the-relative-speed-of-floating-point-add-vs-floating-point-multiply for a brief discussion. Note also the link to Intel's optimization manual in the second answer. And, remember that the processor itself can re-order instructions. And the particulars change with new chip designs. You're going to spend a lot of time and likely realize only marginal performance gains. – Jim Mischel Aug 16 '16 at 00:19
  • (using b₁, b₂, b₃ instead of subscripting with x, y, z: s(b₂x-b₃y) + (c-1)b₁x² + (c-1)(b₂y+b₃z)x - cb₁ + b₁ = s(b₂x-b₃y) + (c-1)((b₁x+b₂y+b₃z)x - b₁)) – greybeard Aug 16 '16 at 06:14
  • Side note: always keep in mind that arithmetically equivalent expressions may have quite different numerical stability. Of course, this depends a bit on the application, but optimizing for correctness (reducing error propagation) may be much more important than performance. – Hulk Aug 16 '16 at 09:59

1 Answer


In general, you should write the code which is clearest. The JIT (not javac) takes simple, common patterns and optimises them. This means using simple, common patterns is often the best way to optimise code.

If you profile your application and find that the code is not running optimally, you can try optimising the code yourself; however:

  • meaningful micro-benchmarks are difficult to write (see the JMH sketch after this list).
  • the results can be highly sensitive to the environment. Change the Java update or the CPU model and you can get conflicting results.
  • when you performance-test the whole application, you are likely to find that the delays are not where you expect them to be, e.g. they are often in IO.
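For reference, if you do decide to measure it, a minimal JMH micro-benchmark might look something like the sketch below (the class and method names are only illustrative, and it needs the jmh-core and jmh-generator-annprocess dependencies, e.g. via the JMH Maven archetype):

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class AddVsMultiplyBenchmark {

    // Fields rather than compile-time constants, so the JIT cannot pre-compute the results.
    double a = Math.PI;
    double b = Math.E;

    @Benchmark
    public double add() {
        return a + b;   // returning the value prevents dead-code elimination
    }

    @Benchmark
    public double multiply() {
        return a * b;
    }
}
```

Even with a harness like this, the effect being measured is tiny, so the caveats above still apply when interpreting the numbers.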

Unless you are confident an optimisation really helps, you should stick with the code which is simplest and easiest to maintain, and you are likely to find it runs fast enough.

Peter Lawrey