2

In C/C++ we can use the left shift operator as a faster way to multiply integers by powers of 2.

But we cannot use the left shift operator on floats or doubles, because they are represented differently, with an exponent component and a mantissa component.
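
For example, just to make the idea concrete:

int a = 7;
int b = a << 3;       // same as a * 8, and traditionally cheaper than an integer multiply

float f = 3.0f;
// float g = f << 1;  // does not compile: the operands of << must be integer types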

My question is:

Is there any way, like the left shift operator for integers, to multiply float numbers faster, even if only by powers of 2?

Ray
  • 2
    Floats have two elements (mantissa and exponent) which *are* powers of 2. What are you asking? – S.Lott Oct 13 '09 at 19:23
  • 25
    don't try to outsmart your compiler without profound reasons – Christoph Oct 13 '09 at 19:23
  • Christoph, thanks for the biggest laugh of the day :) – Justin R. Oct 13 '09 at 19:25
  • 1
    There are many PDFs explaining fast floating-point product algorithms on Google – Tom Oct 13 '09 at 19:25
  • 1
    Christoph: That reminds me of a quote... I can't remember who said it... (paraphrased) "Always remember these two rules: 1. Don't second guess the compiler unless you know more than it does. 2. The compiler always knows more than you do." – Powerlord Oct 13 '09 at 19:34
  • 2
    The compiler chooses how to implement multiplication. The fact that *you* can do it faster by shifting is an old, old myth. – nos Oct 13 '09 at 19:50
  • 2
    It's true that most compilers have recognized multiplication of integers by statically defined powers of two, and turned them into shifts (if helpful) for quite a while. That doesn't apply to floating point though. Having written a few compilers, and examined the output from quite a few more, I feel quite safe in stating categorically that I know more than any compiler I've seen yet. Contrary to popular belief, compilers do NOT seem to be improving in this respect either -- the best FP optimization I've seen was on mainframes, decades ago. – Jerry Coffin Oct 13 '09 at 20:10
  • @Jerry - the lack of FP optimization in compilers is likely because it does not matter any more. How many software products are there that actually have floating point calculations as their bottleneck, in this day of multi-core, 3 GHz CPUs? – Jason Berkan Oct 13 '09 at 21:53

4 Answers

13

No, you can't. But depending on your problem, you might be able to use SIMD instructions to perform one operation on several packed variables. Read about the SSE2 instruction set.
http://en.wikipedia.org/wiki/SSE2
http://softpixel.com/~cwright/programming/simd/sse2.php
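
For example, something along these lines multiplies four floats in a single instruction using the SSE intrinsics from <xmmintrin.h> (sketch only; assumes an x86/x86-64 target with SSE support):

#include <stdio.h>
#include <xmmintrin.h>                               /* SSE intrinsics */

int main(void)
{
    float in[4] = { 1.5f, 2.0f, 3.25f, 4.0f };
    float out[4];

    __m128 v = _mm_loadu_ps(in);                     /* load four floats at once */
    __m128 r = _mm_mul_ps(v, _mm_set1_ps(8.0f));     /* four multiplications in one instruction */
    _mm_storeu_ps(out, r);

    printf("%g %g %g %g\n", out[0], out[1], out[2], out[3]);
    return 0;
}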

In any event, if you are optimizing floating-point multiplications, you are in 99% of the cases looking in the wrong place. Without going on a major rant regarding premature optimization, at least justify it by performing proper profiling.

5

You could do this:

float f = 5.0f;
int* i = (int*)&f;   // reinterpret the float's bits as an integer (this breaks strict aliasing)
*i += 0x00800000;    // add 1 to the IEEE-754 exponent field, so f becomes 10.0f

But then you have the overhead of moving the float out of the register, into memory, then back into a different register, only to be flushed back to memory ... about 15 or so cycles more than if you'd just done fmul. Of course, that's even assuming your system has IEEE floats at all.
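
If you really want to experiment with the trick anyway, a variant that at least avoids the aliasing violation goes through memcpy (sketch only, assuming 32-bit IEEE-754 floats; mul_pow2 is just an illustrative name):

#include <stdint.h>
#include <string.h>

/* Multiply a positive, normal float by 2^n by bumping its exponent field.
   No handling of zero, subnormals, infinities, NaN, or exponent overflow. */
static float mul_pow2(float f, int n)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the bits without violating aliasing rules */
    bits += (uint32_t)n << 23;        /* the exponent occupies bits 23..30 */
    memcpy(&f, &bits, sizeof bits);
    return f;
}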

Don't try to optimize this. You should look at the rest of your program to find algorithmic optimizations instead of trying to discover ways to micro-optimize things like floats. It will only end in blood and tears.

greyfade
  • 2
    Gah.. that code gives me the willies. Also, passing data between the floating point registers and the CPU's integer registers is often a very costly operation. Which is why even float-to-int conversions suck. –  Oct 13 '09 at 19:36
1

Truly, any decent compiler would recognize compile-time power-of-two constants and use the smartest operation.
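
For instance (typical behaviour with optimizations enabled, though the exact output naturally depends on the compiler and target):

unsigned scale_int(unsigned x) { return x * 16u; }    // usually compiled down to a shift (x << 4)
float scale_float(float x)     { return x * 16.0f; }  // stays an ordinary floating-point multiply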

NewbiZ
  • I have to guess that you rarely (if ever) really examine the output of a compiler. I have -- nearly none of them is very smart, especially when it comes to floating point. Intel's does about as well as any I've seen recently, and I'd barely rate it as "mediocre" in this respect -- most of the others are substantially worse. – Jerry Coffin Oct 13 '09 at 20:11
0

In Microsoft Visual C++, don't forget the "floating point model" switch. The default is /fp:precise but you can change it to /fp:fast. The fast model trades some floating point accuracy for more speed. In some cases, the speedups can be drastic (the blog post referenced below notes speedups as high as x5 in some cases). Note that Xbox games are compiled with the /fp:fast switch by default.

I just switched from /fp:precise to /fp:fast on a math-heavy application of mine (with many float multiplications) and got an immediate 27% speedup with almost no loss in accuracy across my test suite.

Read the Microsoft blog post regarding the details of this switch here. It seems that the main reasons not to enable this would be if you need all the accuracy available (e.g., games with large worlds, long-running simulations where errors may accumulate) or if you need robust double or float NaN processing.

Lastly, also consider enabling the SSE2 instruction extensions. This gave an extra 3% boost in my application. The effects of this will vary depending on the number of operands in your arithmetic—for example, these extensions can provide speedup in cases where you are adding or multiplying more than 2 numbers together at a time.
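
For reference, on the 32-bit x86 toolset both options can be combined on the command line roughly like this (myapp.cpp is just a placeholder; x64 builds use SSE2 by default, so /arch:SSE2 only applies to x86):

cl /O2 /fp:fast /arch:SSE2 myapp.cpp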

Special Sauce