38

I use the two following C++ compilers:

  • cl.exe : Microsoft (R) C/C++ Optimizing Compiler Version 19.00.24210 for x86
  • g++ : g++ (Ubuntu 5.2.1-22ubuntu2) 5.2.1 20151010

When using the built-in sine function, I get different results. This is not critical, but sometimes results are too significants for my use. Here is an example with a 'hard-coded' value:

printf("%f\n", sin(5451939907183506432.0));

Result with cl.exe:

0.528463

Result with g++:

0.522491

I know that g++'s result is more accurate and that I could use an additional library to get this same result, but that's not my point here. I would really understand what happens here: why is cl.exe that wrong?

Funny thing, if I apply a modulo of (2 * pi) on the param, then I get the same result than g++...

[EDIT] Just because my example looks crazy for some of you: this is a part of a pseudorandom number generator. It is not important to know if the result of the sine is accurate or not: we just need it to give some result.

Ripi2
  • 7,031
  • 1
  • 17
  • 33
Nicolas
  • 1,812
  • 3
  • 19
  • 43
  • 14
    A peek at the x86 instruction set locates the FSIN and FCOS math instructions, implemented in hardware, so you'd expect the results to be compiler-independent. Maybe it is known that FSIN/FCOS is inaccurate with large values, and gcc goes the extra mile, and manully applies the modulo before executing FSIN/FCOS. – Sam Varshavchik Oct 12 '17 at 13:55
  • Out of curiosity: why do you need such a large argument? – Henrik Oct 12 '17 at 14:02
  • @Henrik This is just a little piece of an algorithm: the previous instruction can generate this kind of huge values. – Nicolas Oct 12 '17 at 14:05
  • 14
    Keep in mine that `sin(x+2pi)` is equal to `sin(x)`. In practice, this means that the argument to `sin` gets pre-scaled in the `sin` function so that it is in the range [0..2pi). The larger the argument is, the less significance there are in the low bits. Extracting the low bits by pre-scaling loses precision as the argument gets larger. Essentially, when you try to calculate the sin of an argument that big, the result is nonsense. The function is trying to calculate the sin of noise. Garbage in, garbage out. – Pete Becker Oct 12 '17 at 14:07
  • 19
    What compiler switches are you passing in each case? Do you have optimizations enabled? The precision is controllable with options. The x87 FSIN and FCOS instructions are a red herring here. They aren't used by modern compilers, having been replaced by SSE2 instructions. With GCC, even with optimizations disabled, it computes this at *compile time* and spits out the result as a literal. MSVC won't. – Cody Gray - on strike Oct 12 '17 at 14:12
  • 2
    "this is a part of a pseudorandom number generator": as a side note, unless you're exploiting some obscure implementation detail ( but you wouldn't ask this then), using floating point for an RNG seems a bad idea to me ... – Massimiliano Janes Oct 12 '17 at 14:28
  • There may be different rounding schemes for your platforms. Please check [reference](http://en.cppreference.com/w/cpp/numeric/fenv/feround) – AndyG Oct 12 '17 at 14:33
  • 1
    @SamVarshavchik they are indeed [known to be inaccurate](https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/), but not commonly used anyway – harold Oct 12 '17 at 14:35
  • 9
    Using sin this way for random generator is a very bad idea. Don't do it. – geza Oct 12 '17 at 14:57
  • Thanks for all your 'don't do that' comments... But in this particular case, I have no choice. – Nicolas Oct 12 '17 at 15:27
  • 19
    On two occasions I have been asked, "Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?" ... I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. – Charles Babbage – Arne Vogel Oct 12 '17 at 15:41
  • @CodyGray to be fair, the only instruction set he mentions it x86, and doesn't make it clear that he's not compiling in 32-bit mode. The compiler will only use the SSE2 instructions for x86_64 (where they are guaranteed to exist) or if an architecture specific flag is provided (e.g. -msse2) – Steve Cox Oct 12 '17 at 16:59
  • 3
    You use the sine function in a random number generator? And you don't even care if the result is correct? Sound *very* suspect, like [the guy who thought that multiplying two random numbers makes the result "more random"](https://stackoverflow.com/q/3956478/695132). – Szabolcs Oct 12 '17 at 19:44
  • @Nicolas: can you please tell us the circumstances that you have no choice? I'm really curious about this, why a non-reliable sin is a solution here (even IEEE 754 doesn't mandate that a sin should return the most exact result. Not just for large numbers, but for any number). – geza Oct 12 '17 at 20:10
  • @geza I've seen a botnet with a domain generator algorithm ("DGA") written in 32-bit i386 assembler for win32 that used a trig-based RNG, and that also depended on "fast" IEEE math. Attempting to reverse engineer that DGA and reconstruct that code to make it work on other systems so that the DGA domains could be pre-registered was non-trivial. – Alnitak Oct 12 '17 at 21:49
  • @Alnitak: I see. Expect this is hard to 100% reproduce on other platforms (especially if they used i387). You can easily reach 99.9999%, but the remaining 0.0001% is hard: `sin` is not standardized by IEEE. – geza Oct 12 '17 at 22:13
  • 2
    That is not correct, @Steve. Modern versions of GCC and MSVC default to targeting SSE2 even for 32-bit builds. You have to explicitly turn this off with a compiler switch that forces them back to targeting the x87 without any extended instruction set support. In other words, `-msse2` and `/arch:SSE2` (respectively) are the default/implied switches for 32-bit builds. That's why I asked about compiler options at the very beginning. You'll still see x87 instructions used when they're appropriate, thanks to the calling convention, but FSIN/FCOS are not used. – Cody Gray - on strike Oct 12 '17 at 22:56
  • 1
    [`The result may have little or no significance if the magnitude of arg is large (until C++11)`](http://en.cppreference.com/w/cpp/numeric/math/sin) – phuclv Oct 13 '17 at 02:27
  • @CodyGray I just tested this with gcc 5.3.1 (not trunk, but fairly modern) `gcc -m32 -dM -E -x c /dev/null | grep SSE` gives no results while `gcc -dM -E -x c /dev/null | grep SSE` lists all the sse macros up to sse2. Building a simple test program with -m32 (main contents were `printf("%g\n", sin(atof(v[1]) + 1));`) showed usage of x87 instructions. Also, gcc has this line in their documentation regarding -mfpmath=387 "This is the default choice for non-Darwin x86-32 targets." (Darwin actually defaults turning everything up to SSE4.2 on). – Steve Cox Oct 13 '17 at 13:50

3 Answers3

36

You have a 19-digit literal, but double usually has 15-17 digit precision. As a result, you can get a small relative error (when converting to double), but big enough (in the context of sine calculation) absolute error.

Actually, different implementations of the standard library have differences in treating such large numbers. For example, in my environment, if we execute

std::cout << std::fixed << 5451939907183506432.0;

g++ result would be 5451939907183506432.000000
cl result would be 5451939907183506400.000000

The difference is because versions of cl earlier than 19 have a formatting algorithm that uses only a limited number of digits and fills the remaining decimal places with zero.

Furthermore, let's look at this code:

double a[1000];
for (int i = 0; i < 1000; ++i) {
    a[i] = sin(5451939907183506432.0);
}
double d = sin(5451939907183506432.0);
cout << a[500] << endl;
cout << d << endl; 

When executed with my x86 VC++ compiler the output is:

0.522491
0.528463

It appears that when filling the array sin is compiled to the call of __vdecl_sin2, and when there is a single operation, it is compiled to the call of __libm_sse2_sin_precise (with /fp:precise).

In my opinion, your number is too large for sin calculation to expect the same behavior from different compilers and to expect the correct behavior in general.

DAle
  • 8,990
  • 2
  • 26
  • 45
  • 4
    I would have thought that the compilers would produce the exact same bit pattern from interpreting this literal, which then wouldn't explain the difference in results. – SirGuy Oct 12 '17 at 14:01
  • 2
    @SirGuy Probably this is unspecified by the standard, and works subject to current rounding settings and compiler flags. – lisyarus Oct 12 '17 at 14:54
  • @SirGuy, Yes, that's true. They have same binary value. – DAle Oct 12 '17 at 15:06
  • 3
    This is because libstdc++ tries to display the value as it is represented (i.e. it will be ``n.000000`` where n is rounded to some power of 2). MSVCRT, on the other hand, will just display zeroes at the end. The GNU implementation is arguably more accurate, but for practical purposes these are non-signficant digits in any case and cannot be relied on. – Arne Vogel Oct 12 '17 at 15:38
  • What effect does the `std::fixed` have on the value itself? – Ron Oct 12 '17 at 15:39
16

I think Sam's comment is closest to the mark. Whereas you're using a recentish version of GCC/glibc, which implements sin() in software (calculated at compile time for the literal in question), cl.exe for x86 likely uses the fsin instruction. The latter can be very imprecise, as described in the Random ASCII blog post, "Intel Underestimates Error Bounds by 1.3 quintillion".

Part of the problem with your example in particular is that Intel uses an imprecise approximation of pi when doing range reduction:

When doing range reduction from double-precision (53-bit mantissa) pi the results will have about 13 bits of precision (66 minus 53), for an error of up to 2^40 ULPs (53 minus 13).

SloopJon
  • 283
  • 1
  • 10
  • 3
    I also think it's very relevant that in gcc the compiler is doing this calculation and with cl, it's being done at runtime by the code. I know gcc has gone to a lot of effort to make sure that the compiler gets exactly the same answers as the code it generates. But that has forced them to think carefully about issues like this, and I suspect cl has not done the same thing. – Omnifarious Oct 12 '17 at 19:04
  • 1
    Is the precision requirement underlying that blog post even "fair"? For comparison, assume we work with floating points having three significant digits. Then *pi* is represented by "3.14E0", but so is any number between 3.135 and 3.145, so that *any* output between "-3.40E-3" and "+6.59E-3" (including "0") would be the nearest representable value of sin(x) for some x whose nearest representable value is "3.14". Demanding that the output of sin(3.14) has to be "1.59E-3" appears unpractical to me ... – Hagen von Eitzen Oct 12 '17 at 21:11
  • You can do this more accurately, and with cleaner code. The value of pi is known to millions of decimal places. A clean method is to use *two* double variables, pi1 and pi2. pi1 stores the closest double precision approximation to pi, and pi2 stores the best approximation to the "error" pi - pi1. You can then reduce the range of the argument x to sin(x) or cos(x) with *no* loss of precision, by scaling pi1 and pi2 by powers of 2, and remembering that conceptually, x can be extended with an "infinite" number of binary zeros without changing its value. – alephzero Oct 12 '17 at 21:43
  • @alephzero: In what non-contrived cases would an application want the sine/cosine of a particular angle outside the range +/-π which is known with accuracy better than 0.5ulp? Much more typically, code will have angles in units of π/n radians, where n can be precisely represented as "double", and will multiply such a value by an approximation of 2π/n. Code which asks for the sine of the nearest "double" to π is much more likely to want the sine of 180 degrees than the sine of that precise numeric value. – supercat Oct 12 '17 at 22:15
  • 7
    *"cl.exe for x86 likely uses the fsin instruction"* This seems to be a very popular theory, but it's absolutely wrong. Unless you explicitly pass the `/arch:IA32` switch to *disable* SSE2 support, the 32-bit version of MSVC *assumes* SSE2 support is available and generates SSE2 instructions. It has done this for many versions now. In particular, for `sin`, it calls its library function `___libm_sse2_sin` or `__libm_sse2_sin_precise`, depending on whether you have set `/fp:fast` or `/fp:precise` (the latter is the default). None of these use `FSIN` (nor `FCOS` nor `FSINCOS`). – Cody Gray - on strike Oct 12 '17 at 23:03
  • 1
    @Cody Gray: I just tested this on a 32-bit Windows 2003 system with Windows SDK 7.0 (cl.exe version 15.00.30729.01 for 80x86). sin() and fsin both give exactly the same answer that Nicolas posted in the original question. – SloopJon Oct 13 '17 at 18:14
  • 2
    @SloopJon Version 15 of the compiler is not what I'd call a "modern" version. That was shipped with VS 2008, nearly 10 years old by now. VS 2012 (cl.exe version 17) is the version that introduced the change of which I speak, targeting SSE2 as the minimum supported instruction set even on 32-bit builds. Note that the asker is using version 19, so he's getting `/arch:SSE2` set by default for x86-32 builds unless he's specifically disabling it with `/arch:IA32`. That's why I asked about compiler switches originally. – Cody Gray - on strike Oct 13 '17 at 18:22
  • 1
    @Cody Gray: as it turns out, I didn't have to seek out an old 32-bit Windows system to reproduce this. Visual C++ 2013 (cl.exe version 18.00.40629 for x86) on 64-bit Windows 2012 also gets exactly the same results. – SloopJon Oct 13 '17 at 20:27
  • 1
    I'm not disputing what the results are. I have no reason to believe that the asker is lying, nor you. My claim is that the theory it has something to do with the `FSIN` instruction is incorrect, because that instruction is not used by modern versions of MSVC. – Cody Gray - on strike Oct 14 '17 at 10:49
  • @Cody Gray: just for information, when disassembling the exe generated by CL, I can see some calls to fsin. – Nicolas Oct 25 '17 at 10:23
9

According to cppreference:

The result may have little or no significance if the magnitude of arg is large (until C++11)

It's possible that this is the cause of the problem, in which case you will want to manually do the modulo so that arg is not large.

SirGuy
  • 10,660
  • 2
  • 36
  • 66
  • 1
    I don't think that manually doing the modulo is going to help. – Joshua Oct 12 '17 at 20:30
  • 2
    @Joshua why not? If one compiler does the modulo and the other does not that could explain the difference in results – SirGuy Oct 12 '17 at 21:04
  • Because the x86 processor does the wrong thing for long double sine function even below the modulus. What you're really testing for is which libc is correcting this fault better. – Joshua Oct 13 '17 at 03:47
  • 1
    @joshua what's the wrong thing that it does even below the modulus? Are you referring to [SloopJohn's answer](https://stackoverflow.com/a/46715682/1277769) and the linked-to article? – SirGuy Oct 13 '17 at 12:29
  • Yup that's the problem. Boo Intel. – Joshua Oct 13 '17 at 15:10