5

While porting Java code to C++ I've stumbled over some weird behaviour: I can't get double addition to work, even though the compiler option /fp:strict (which means "correct" floating-point math) is set in Visual Studio 2008.

double a = 0.4;
/* a: 0.40000000000000002, correct */

double b = 0.0 + 0.4;
/* b: 0.40000000596046448, incorrect
(0 + 0.4 is the same). It's not even close to correct. */

double c = 0;  
float f = 0.4f;  
c += f;
/* c: 0.40000000596046448 too */

In a different test project I set up it works fine (/fp:strict behaves according to IEEE754).

I'm using Visual Studio 2008 (Standard) with no optimization and /fp:strict.

Any ideas? Is it really truncating to floats? This project really needs the same behaviour on both the Java and the C++ side. I got all values by reading them from the debug window in VC++.

Solution: calling _fpreset() (Barry Kelly's idea) solved it. A library was setting the FP precision too low.

user141446
  • Can you post a small complete test program along with the exact command line used to compile (see output window etc.), which demonstrates the problem? The only way I can reproduce is by using 0.0f + 0.4f instead. – Barry Kelly Jul 20 '09 at 15:30
  • Can we assume you are aware of the imprecision of floating point types? Being accurate to 7 decimal places is usually considered ok since the default printing precision is 6. – Evan Teran Jul 20 '09 at 15:49
  • @Evan, his example is off by more than would be explained by floating point imprecision. – Kevin Jul 20 '09 at 15:55
  • Well, his example is off by exactly what would be explained by single-precision floating point precision. – Steve Jessop Jul 20 '09 at 15:58

2 Answers

8

The only thing I can think of is perhaps you are linking against a library or DLL which has modified the CPU precision via the control word.

Have you tried calling _fpreset() from float.h before the problematic computation?
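A minimal sketch of how that check-and-reset could be wrapped around a library call. It assumes a 32-bit MSVC build, where the x87 control word is what matters; `_controlfp_s`, `_fpreset`, `_MCW_PC` and `_PC_24` come from float.h. The names `x87_precision_is_single`, `restore_default_fp_state`, `call_offending_library` and `call_library_then_restore` are hypothetical helpers for illustration, and on other compilers the MSVC-specific parts compile to no-ops.

```cpp
#include <cassert>
#if defined(_MSC_VER)
#include <float.h>   // _controlfp_s, _fpreset, _MCW_PC, _PC_24
#endif

// Returns true if the x87 precision-control field is set to 24-bit
// (single) precision. Only meaningful on 32-bit MSVC builds; elsewhere
// this sketch simply reports false.
bool x87_precision_is_single() {
#if defined(_MSC_VER) && defined(_M_IX86)
    unsigned int cw = 0;
    _controlfp_s(&cw, 0, 0);              // mask 0: read the control word only
    return (cw & _MCW_PC) == _PC_24;
#else
    return false;
#endif
}

// Resets the floating-point package to its default state.
void restore_default_fp_state() {
#if defined(_MSC_VER)
    _fpreset();
#endif
}

// Hypothetical stand-in for the third-party call that changes the FP mode.
void call_offending_library() {}

// Call the library, then immediately undo any change it made to the FP state.
void call_library_then_restore() {
    call_offending_library();
    restore_default_fp_state();
    assert(!x87_precision_is_single());
}
```

Calling `x87_precision_is_single()` right before and right after a suspect library call is one way to narrow down which call changes the mode, so the reset only has to go in one place.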

Barry Kelly
  • It must be something along those lines, but is 0.0 + 0.4 even a computation? Can't it be evaluated at compile time? Checking the disassembly might establish whether the runtime float mode has anything to do with it, or whether something has gone wrong at compile time. – Steve Jessop Jul 20 '09 at 16:17
  • Sure it can be, but if it were that simple, it would be easy to reproduce, no? – Barry Kelly Jul 20 '09 at 16:23
  • I dunno, maybe something else in the project is specifying /fp:stupid or equivalent. My personal favourite would be a source file isn't newline-terminated and therefore the program has undefined behaviour, although I don't hold out much hope of ever seeing that cause a bug in the wild... – Steve Jessop Jul 20 '09 at 16:44
  • Spot on! It was a library setting the FP precision to 32-bit (a DirectX DLL in my case). Thanks! :) – user141446 Jul 21 '09 at 09:14
  • @barry-kelly do you know which functions can set the floating point precision? I suspect I have the same issue, but don’t want to put _fpreset all over the place, but rather right after the offending library... – Luc Bloom Nov 15 '17 at 17:51
  • _set_controlfp, _control87, _controlfp, __control87_2, maybe others. – Luc Bloom Nov 15 '17 at 17:59
3

Yes, it's certainly truncating to floats. I get the same value printing float f = 0.4 as you do in the "inaccurate" case. Try:

double b = 0.0 + (double) 0.4;

The question then is why it's truncating to floats. There's no excuse in the standard for treating 0.0 + 0.4 as a single-precision expression, since floating point literals are double-precision unless they have a suffix to say otherwise.

So something must be interfering with your settings, but I have no idea what.
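To see that the "incorrect" value is exactly 0.4 rounded to single precision, here is a small sketch that formats both values with 17 decimals, as in the debugger output quoted in the question. The helper names `format17`, `through_float` and `check_double_addition` are made up for this example. The `volatile` operand is only there to keep the addition from being folded at compile time, so the check exercises the runtime FP mode.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Format a double with 17 digits after the decimal point.
std::string format17(double x) {
    char buf[64];
    std::snprintf(buf, sizeof buf, "%.17f", x);
    return std::string(buf);
}

// Round a double to single precision, then widen it back to double.
double through_float(double x) {
    return static_cast<double>(static_cast<float>(x));
}

// In genuine double arithmetic, adding 0.0 must leave 0.4 unchanged.
void check_double_addition() {
    volatile double zero = 0.0;   // forces the sum to be computed at run time
    double b = zero + 0.4;
    assert(b == 0.4);
}
```

With IEEE 754 doubles, `format17(0.4)` gives 0.40000000000000002 and `format17(through_float(0.4))` gives 0.40000000596046448, the two values from the question. If `check_double_addition()` fails in the real project, the sum is being rounded to less than double precision at run time.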

Steve Jessop