Godbolt shows you the assembly emitted by running the compiler with -S. But in this case, that's not the code that actually gets run, because further optimizations can be done at link time.
Try checking the "Compile to binary" box instead (https://godbolt.org/z/ETznv9qP4), which will actually compile and link the binary and then disassemble it. We see that in your -DV=f version, the code for f is:
addss xmm0,xmm1
mulss xmm0,xmm2
ret
just as before. But with -DV=0, we have:
movss xmm0,DWORD PTR [rip+0x2d88]
ret
So f has been converted to a function which simply returns a constant loaded from memory. At link time, the compiler was able to see that f was only ever called with a particular set of constant arguments, and so it could perform interprocedural constant propagation and have f merely return the precomputed result.
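To make the transformation concrete, here is a minimal sketch of what interprocedural constant propagation effectively does. The body of f_general is reconstructed from the expression described in this answer, and the arguments 1, 2, 3 are made-up constants chosen so the arithmetic is exact; they are not the values from the question.

```c
/* Hypothetical reconstruction of f: must work for arbitrary arguments. */
float f_general(float a, float b, float c) {
    return a * c + b * c;
}

/* Illustrative call site. The constants are assumptions for this sketch. */
float call_site(void) {
    return f_general(1.0f, 2.0f, 3.0f);
}

/* What the optimizer may effectively reduce f to once it has proven that
   every call passes those same constants: the arguments are ignored and
   the result, folded at build time, is simply returned. */
float f_specialized(float a, float b, float c) {
    (void)a; (void)b; (void)c;
    return 9.0f; /* 1*3 + 2*3 */
}
```

The specialized form is only valid while the compiler can see every caller, which is why it needs whole-program information at link time.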
Having an additional reference to f evidently defeats this. Probably the compiler or linker sees that f has had its address taken, and does not notice that nothing is ever done with the address. So it assumes that f might be called from elsewhere in the program, and therefore it has to emit code that gives the correct result for arbitrary arguments.
As to why the results are different: the precomputation is done strictly, evaluating both a*c and b*c as float and then adding them. So its result of 122457232 is the "right" one by the rules of C, and it is also what you get when compiling with -O0 or -fp-model=strict. The runtime version has been optimized to (a+b)*c, which is actually more accurate because it avoids an extra rounding; it yields 122457224, which is closer to the exact value of 122457225.
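The rounding effect can be reproduced in isolation. The sketch below uses made-up inputs (not the values from the question) and assumes IEEE 754 single precision with the default round-to-nearest-even mode. The function names are hypothetical, and the volatile temporaries are only there to force each intermediate to be rounded to float, so the compiler cannot fuse or reassociate the operations.

```c
#include <stdio.h>

/* Strict evaluation: round a*c, round b*c, then round their sum
   (three roundings in total). */
float sum_of_products(float a, float b, float c) {
    volatile float ac = a * c;
    volatile float bc = b * c;
    volatile float r = ac + bc;
    return r;
}

/* Factored evaluation: round a+b, then round the product
   (two roundings in total). */
float factored(float a, float b, float c) {
    volatile float s = a + b;
    volatile float r = s * c;
    return r;
}

/* Reference value in double; exact for the sample inputs below. */
double reference(float a, float b, float c) {
    return (double)a * c + (double)b * c;
}

/* Prints all three results for one set of illustrative inputs. */
void show_rounding_difference(void) {
    const float a = 16777218.0f, b = 16777242.0f, c = 5.0f;
    printf("a*c + b*c = %.0f\n", sum_of_products(a, b, c)); /* 167772288 */
    printf("(a+b)*c   = %.0f\n", factored(a, b, c));        /* 167772304 */
    printf("exact     = %.0f\n", reference(a, b, c));       /* 167772300 */
}
```

With these inputs both products are rounded down before being added, so the strict result ends up 12 below the exact value, while the factored form is only 4 above it.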