Double precision on Linux using fpu_control.h

Question

I am trying to port a piece of code from Solaris to Linux. In the process I found that floating-point arithmetic on Linux is done in extended precision, so it has to be set to double precision explicitly. To do this I found the fpu_control.h header and its _FPU_GETCW and _FPU_SETCW macros. But even with those, the precision is not being set properly. The code snippet:

long double power = 1.0;
#ifdef __linux
    fpu_control_t mask;
    _FPU_GETCW(mask);
    mask &= ~(_FPU_EXTENDED & _FPU_SINGLE);
    mask |= _FPU_DOUBLE;
    _FPU_SETCW(mask);

    power *= 0.1;
#endif

When I print power, the value is power = 0.1000000000000000055511151231257827

However, I was expecting power to have the value 0.1. I also used -DDouble while compiling. Can someone point out what is going wrong?

I don't understand this. If you want `power` to be a double precision variable, why are you declaring it as `long double`? I assume you're aware that 0.1 can't be represented precisely as a floating point value regardless of the level of precision. — r3mainer, Jul 14 '17 at 10:18
You are right, but this multiplication happens inside a loop, based on certain conditions, i.e. sometimes we might end up doing this multiplication 5 times or 3 times. So after each multiplication the value would change, i.e. 0.1, 0.01, 0.001, 0.0001 and so forth, based on external factors. Hence the code. — girish s, Jul 14 '17 at 10:24
@girishs The number you got is the closest possible double precision binary floating point to decimal 0.1. What's the problem? It's as exact as it can get. — Art, Jul 14 '17 at 12:27
@art But when I print the value, it is a huge number. Also, if I use this number in some arithmetic operation later, I will get a different number, right? My expectation was to get the same value as I get on Solaris. (Sorry, I may be missing something here.) — girish s, Jul 14 '17 at 13:11
Also, I would like to get the same value as on Solaris. Is there any way I can get it? — girish s, Jul 14 '17 at 13:11
@girishs You should show the code in question as a [MCVE](http://stackoverflow.com/help/mvce) along with the input, expected output, and actual output. — dbush, Jul 14 '17 at 13:33

score 0 · Answer 1 · answered Jul 14 '17 at 14:08


I was expecting power to have an value 0.1

In general, OP's expectation cannot be met.

double and long double cannot store every possible number.
A double can encode about 2⁶⁴ different values exactly, as it usually uses 64 bits.
A long double can encode maybe 2⁶⁴, 2⁸⁰ or 2¹²⁸ different values exactly, depending on the platform.

With a typical double, 0.1 is not one of those 2⁶⁴ exactly representable values. Instead, double x = 0.1 initializes x with the closest representable value:

Exact value        0.1000000000000000055511151231257827021181583404541015625
OP's printed value 0.1000000000000000055511151231257827

The next-closest alternative is

0.09999999999999999167332731531132594682276248931884765625

This is not a double vs long double issue.


chux - Reinstate Monica


OK, just for my understanding: what is the use of __SETFPUCW? I thought that would set the precision from extended (on Linux, as you also pointed out) to double. Would that not do the trick? – girish s Jul 17 '17 at 05:24
@girishs Use of [__setfpucw](http://man7.org/linux/man-pages/man3/__setfpucw.3.html). It will not do the trick, as this answer asserts that it is not a `double` vs `long double` issue. Changing precision does not help. It may make one value "work", but then others will fail. The answer to "we need to set it to double precision explicitly" is that you are already getting `double` precision. IOW, 0.1000000000000000055511151231257827 is not wrong. That is what you should get with `double` precision. – chux - Reinstate Monica Jul 17 '17 at 14:04

score -1 · Answer 2 · answered Jul 14 '17 at 13:29

You specifically request a long double, while you supposedly want plain double. If your hardware is an Intel x86/x86-64 CPU, calculations going through the x87 FPU are performed with 80-bit precision.

Otherwise, try something like the gcc flag -mfpmath=sse, which stops using the x87 FPU for float and double, so those operations are performed with 64-bit (aka double) precision.

Note:

It is quite possible that even on Solaris you were getting an inexact representation of 0.1 (there is no exact one), but the way the value was output hid this inexactness by printing only up to a specified number of decimal digits.

On 32-bit x86 the default is usually x87 with an 80-bit representation. On amd64 the default is usually SSE. — Paul Floyd, Jul 16 '17 at 12:07
