
The code in question is the following:

```c
float32_t f = someValueFromSomewhere;
f = f * 4;
```

Will the compiler optimize this? According to the C standard (if I understood it correctly), the second operand has to be converted to `float32_t`, so the multiplication has to be done using the FPU (or floating-point emulation).

Theoretically, the operation could be done in an ordinary integer register by just adding an immediate to the exponent (and maybe checking for overflow); a sketch of what I mean is at the end of this question. Are compilers allowed to do this optimization? Are there compilers known to do so? And if so, would they also recognize the expression

```c
f = f * 4.0f;
```

which is required to avoid static-code-checker warnings about implicit conversions?

An addition: I know that from the standard's point of view both lines are equivalent. But clearly the compiler can distinguish them, so the question is at what point the optimizer gets to see the code (or rather its internal representation) for the first time.
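For concreteness, the kind of bit manipulation I have in mind looks roughly like this (a minimal sketch, assuming `float32_t` is an IEEE-754 binary32 type and that type punning via `memcpy` is acceptable; the function name is made up, and the special cases are deliberately ignored):

```c
#include <stdint.h>
#include <string.h>

typedef float float32_t;   /* assumption: float32_t is IEEE-754 binary32 */

/* "Happy path" only: wrong for zero, subnormals, infinities, NaNs,
 * and for results that overflow. */
static float32_t mul4_exponent_hack(float32_t f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the value as raw bits */
    bits += (uint32_t)2u << 23;       /* add 2 to the 8-bit exponent field */
    memcpy(&f, &bits, sizeof f);      /* and back to float32_t */
    return f;
}
```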

– vlad_tepesch
  • The promotion to `float` is done during compile time. So the expressions `f = f*4` and `f = f*4.0f` are equivalent. How they are compiled into assembly afterwards depends on the compiler at hand, but they are still compiled in the same way (per compiler). – barak manos Jan 29 '15 at 12:21
  • I'll misunderstand your question deliberately, but hopefully in an illuminating way. *The compiler is allowed to optimize whatever it likes as long as the observable effects are identical.* The code snippet you show may be optimized away without leaving a trace if the value is never used anywhere. If it is used the compiler may do whatever it likes as long as the effect is the same. I'd think that it can compute f at compile time and replace it with a constant if someValueFromSomewhere is known at compile time and it can statically prove that it'll never change after these lines, for example. – Peter - Reinstate Monica Jan 29 '15 at 12:27
  • related: [Why doesn't a compiler optimize floating-point *2 into an exponent increment?](http://stackoverflow.com/questions/12919184/why-doesnt-a-compiler-optimize-floating-point-2-into-an-exponent-increment) – phuclv Jan 29 '15 at 12:44
  • What's there to optimize in `LOAD REGX f; LOAD REGY 4; FMUL REGX, REGX, REGY` (reg x = reg x * reg y)? – Jens Jan 29 '15 at 13:15
  • @Jens On an x86/x87, nothing. But maybe broaden your horizon by pushing this simple line through avr-gcc and looking at the complete output. – vlad_tepesch Jan 29 '15 at 13:40
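To illustrate the point about the as-if rule in the second comment above: a minimal sketch, assuming an optimizing compiler such as gcc or clang at `-O2` (the function name is made up for illustration):

```c
/* If the input is known at compile time and f is provably not modified
 * elsewhere, an optimizing compiler typically folds the whole
 * multiplication into the constant 12.0f and emits no floating-point
 * instruction at all. */
float folded(void)
{
    float f = 3.0f;   /* hypothetical stand-in for a known someValueFromSomewhere */
    f = f * 4;
    return f;         /* typically compiled as "return 12.0f" */
}
```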

1 Answer


No, it doesn't work

Adding 2 to the exponent in lieu of multiplying by 4.0 only works if the original value is not a subnormal value (including the common zero), is not infinite or NaN, and if the result does not overflow. In general, the compiler does not have this information. When the compiler does have this information, it is allowed to do this transformation, which does not mean that it is a good idea.
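For concreteness, a rough sketch of the guards such a transformation would have to emit, again assuming an IEEE-754 binary32 representation (illustrative only, not proposed compiler output; the function name is made up):

```c
#include <stdint.h>
#include <string.h>

typedef float float32_t;   /* assumption: IEEE-754 binary32 */

static float32_t mul4_with_guards(float32_t f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    uint32_t exponent = (bits >> 23) & 0xFFu;

    /* Zeros and subnormals (exponent field 0), infinities and NaNs
     * (exponent field 0xFF), and finite values whose result would
     * overflow (exponent field 0xFD or 0xFE) all have to fall back
     * to the ordinary multiplication. */
    if (exponent == 0u || exponent >= 0xFDu)
        return f * 4.0f;

    bits += (uint32_t)2u << 23;   /* safe: normal value, no overflow */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```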

On most architectures, it is not a good idea

Unless you are thinking of a specific execution platform where it would be cheaper to get the value of f out of the floating-point register it's in, move it into a general-purpose register, add a constant, test special cases (see above), and go back to a floating-point register, you can assume that all these steps are way, way more expensive than a floating-point multiplication. Only if floating-point operations were emulated as series of bit and integer operations would it make sense to “optimize” multiplications by powers of two this way.

> would [compilers] also recognize the expression f = f * 4.0f;

`f * 4` is equivalent to `f * (float)4` and thus equivalent to `f * 4.0f`. The compiler can transform any of these forms into the same code it would have generated for the others, and any non-toy compiler knows that they are equivalent (for instance as an application of the constant-propagation optimization pass to `(float)4`).
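A quick way to convince yourself is to compile the three spellings side by side and compare the generated code, for instance with `gcc -O2 -S`; with a typical optimizing compiler all three are expected to produce identical output:

```c
/* Three spellings of the same multiplication.  A non-toy compiler folds
 * (float)4 at compile time, so all three are expected to compile to the
 * same code. */
float mul_int_constant(float f)   { return f * 4; }
float mul_cast_constant(float f)  { return f * (float)4; }
float mul_float_constant(float f) { return f * 4.0f; }
```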

– Pascal Cuoq
  • Interesting that the OP did not ask about `f * 4.0`. – chux - Reinstate Monica Jan 29 '15 at 14:52
  • Normally the value would be loaded from memory. I do not know whether it can be transferred directly to FPU registers, but I think transferring to CPU registers or FPU registers would not make a difference (at least not in a negative sense for the CPU). I am aware of the promotion to `float32_t`, but maybe some optimizations can take place before that step. That's why I asked the explicit separate question, since the mentioned optimization seems more obvious when dealing with an `int` constant rather than a `float` constant. – vlad_tepesch Jan 29 '15 at 15:10
  • @chux Why do you think it would make any difference? The idea/theory behind it is the same. – vlad_tepesch Jan 29 '15 at 15:11
  • @vlad_tepesch The optimization is more obvious with `4.0f`. Using an `int` only forces the compiler to insert a conversion from `int` to `float` (because this is how “usual arithmetic conversions” are specified in the standard) and then realize that the resulting expression is still constant and known at compile-time. – Pascal Cuoq Jan 29 '15 at 15:13
  • @PascalCuoq No resulting expression is known at compile time! That is what I wanted to express with `someValueFromSomewhere`. It's not called `someConstant`. – vlad_tepesch Jan 29 '15 at 15:15
  • @vlad_tepesch Similarly, in the case of chux's remark, writing `4.0` forces the compiler to treat the assignment as equivalent to `f = (float)((double)f * 4.0);` (again because this is how “usual arithmetic conversions” are specified in the standard), and to then realize that this is equivalent. It is equivalent for subtle reasons that a compiler may not implement. – Pascal Cuoq Jan 29 '15 at 15:15
  • @vlad_tepesch The “expression” `(float) 4` is known at compile-time. The compiler **has** to deal with such an “expression” **because this is how the language is defined in general**. The first rule in implementing a compiler is to get all cases correct, so any reasonable compiler will start by inserting a conversion from `int` to `float`. – Pascal Cuoq Jan 29 '15 at 15:17
  • My thoughts were the following: the compiler builds up its AST. At that point there is a node like "multiply(localsymbol(float), constant(int))", and the optimizer may come along and say: "Oh look, a float multiplied by 2^n - let's do hacky bit manipulation". The other possibility is that, before the optimizer runs, the promotion step converts that node to "multiply(localsymbol(float), constant(float))", and then the optimizer says: "Oh look, a float multiplied by a float constant - nahh - have to call _fmult_emu anyway". :-) – vlad_tepesch Jan 29 '15 at 15:23
  • If I were going to implement this optimization, I would base it on the floating point constant form. Reducing the number of distinct cases simplifies optimization. The floating point case is more general - consider multiplication by 0.25 - and conversion to floating point has to be done anyway. – Patricia Shanahan Jan 29 '15 at 17:02
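As a side note on the `f * 4.0` case raised by chux and discussed above: multiplying a binary32 value by 4 is exact in double precision, so rounding the double product back to `float` gives the same result as the single-precision multiplication, which is the subtle equivalence referred to. A brute-force sketch (assuming the usual IEEE-754 `float` and `double`; NaN results are treated as equal regardless of payload) that checks every bit pattern:

```c
/* Exhaustive check, over every binary32 bit pattern, that
 * (float)((double)f * 4.0) and f * 4.0f give the same result. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    uint32_t pattern = 0u;
    unsigned long long mismatches = 0;

    do {
        float f;
        memcpy(&f, &pattern, sizeof f);

        float via_double = (float)((double)f * 4.0);  /* what f * 4.0 means */
        float via_float  = f * 4.0f;

        int both_nan = isnan(via_double) && isnan(via_float);
        if (!both_nan && memcmp(&via_double, &via_float, sizeof via_float) != 0)
            mismatches++;
    } while (++pattern != 0u);

    printf("mismatches: %llu\n", mismatches);
    return 0;
}
```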