results of sine function different between microsoft and Java / Linux

Question

I have a C++ code written for Windows (Visual Studio) which I need to port into Java. This is not very easy and currently I am stuck using the sine function in . The results given from Linux (tested to have a comparison) and Java are different as the results from the Windows source. Both results are wrong but this doesn't matter. It is important that the results are exact the same.

I will post the entire source at the bottom. For example I need to calculate the sine of 5174852443848405000.0. I know this is a very large and maybe unusual number but I can not change this. Linux and Java are returning 0.153662 and Windows something about 0.16xx. The function "random_value_genrator()" is used about 500,000 times so the differences in the result may occur later sometime.

initial_value_generator will calculate a value which is used later by the random_value_generator function. The value is generated out of a FILETIME object and three constants. Buffer overflows are occurring but not handled. The random_value_generator is modifying the DWORD64 prng_initial_value each time when used.

I was able to build the initial_value_generator function successfully.

I guess I can't complete this task but any help is appreciated.

Some global variables:

DWORD64 prng_initial_value = 0;

DWORD64 CON1_RVG = 0x4F3D859E;
double CON2_RVG = 0.946270391;

DWORD64 CON1_PRNG = 0x2682D10B7;
DWORD64 CON2_PRNG = 0x19254D38000;
DWORD64 CON3_PRNG = 0x0F1E34A09;

This function is used once at program startup. Writing a large DWORD64 into prng_initial_value, which is used later by random_value_generator(). A system time is multiplied by constant 1 (buffer overflow), divided by constant 2 and added with constant 3.

void initial_value_generator ()
{
    SYSTEMTIME systime;
    FILETIME filetime;

    // Systemzeit zu GMT-Format umwandeln
    SystemTimeToFileTime(&systime,&filetime);

    prng_initial_value = (*(DWORD64*)&filetime) * CON1_PRNG / CON2_PRNG + CON3_PRNG;
}

This function changes the DWORD64 prng_initial_value at each use.

int random_value_generator () 
{
    double sin_value;
    double copy_of_prng_initial_value;
    DWORD64 prng_con1;
    double result;

    // Initialen Wert aus dem initial_random_generator in lokaler Variable speichern
    copy_of_prng_initial_value = prng_initial_value;

    // Sinus vom initialen Wert 
    sin_value = sin(copy_of_prng_initial_value);

    // Initialen Wert mulipikation mit einem konstanten Wert (0x4F3D859E)
    prng_con1 = prng_initial_value * CON1_RVG;

Some further calculations to become insane:

    result = prng_con1 + sin_value;
    result = result * copy_of_prng_initial_value;
    result = result + CON2_RVG;
    result = result * copy_of_prng_initial_value;

    // Das Ergebnis aus der Logarithmus Rechnung addieren
    result += log(copy_of_prng_initial_value);

    // Das Ergebnis aus den Berechnungen als Pointer in die
    // Speicheradresse von prng_initial_value als double Pointer speichern.
    *(double*)&prng_initial_value = result;

    // Rueckgabe des Berechneten Wert als Integer
    return prng_initial_value;
}

For reference I post my Java code (all comments are in English). The random function looks a bit crazy because I was testing a lot of things. I am very sorry about that. But the important point is just the use of the Math.sin(double x) function which result is different that the sin function in Math.h using Microsoft C++ compiler.

private final long initialValue;
private long randomValue;
final BigInteger uint64MaxValue = new BigInteger("18446744073709551616");   //2^64

public ConfickerC() {
    this.initialValue = this.generateInitialValue();
    this.randomValue = this.initialValue;
}

private long generateInitialValue() {
    //We need the actual date without the time from GMT +0 timezone
    Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
    cal.set(Calendar.HOUR_OF_DAY, 0);
    cal.set(Calendar.MINUTE, 0);
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);

    long systemtimeAsFiletime = cal.getTimeInMillis();

    /*
     * Goal is to get the above created date into Windows FileTime format.
     * The Windows FileTime format has got 100 nano seconds per tick.
     * So one increment of the long value results further 100 nano seconds.
     * Instead of Unix the FileTime format begins with 1st January 1601 - not 1970.
     * 11644473600 is the interval between 1601 and 1970 in seconds.
     * Java has got a resolution of 1 ms per tick unix have got 1 second per
     * tick. So first devide the time by 1000. Then add the interval. 
     * After this we multiply by 10 million to get a resolution of 100
     * nano seconds per tick.
     */
    systemtimeAsFiletime /= 1000;        //divide by 1000 to get seconds instead of milliseconds
    systemtimeAsFiletime += 11644473600L; //add x seconds to add the interval between 1601 and 1970
    systemtimeAsFiletime *= 10000000L;   //Windows FileTime has a resolution of 100 nano seconds per tick; so multiply by 10M

    /*
     * The virus is calulating for getting the initial value: time * con1 / con2 + con3
     * Originaly there occurs a buffer overflow which is not handled in the C++ code.
     * The funny thing is that Java does not have a DWORD64 (unsinged _int64). So because of this bit missing (and so the overflow is different) we need BigInteger. 
     * Because BigInteger has no 2^64 limitation we need the maximul value of DWORD64.
     * This is used to "simulate" the buffer overflow by calculating ((time * con1) % 2^64) / con2 + con3
     * modulo 2^64 will result a value which is equal to the C++ calculation
     */

    final BigInteger CONSTANT_1 = new BigInteger("10337718455");                //Original: 0x2682D10B7
    final BigInteger CONSTANT_2 = new BigInteger("1728000000000");              //Original: 0x19254D38000
    final BigInteger CONSTANT_3 = new BigInteger("4058204681");                 //Original: 0x0F1E34A09

    BigInteger bigIntSystemTime = BigInteger.valueOf(systemtimeAsFiletime);

    //Return as long value: ((time * con1) % 2^64) / con2 + con3
    return bigIntSystemTime.multiply(CONSTANT_1).divideAndRemainder(uint64MaxValue)[1].divide(CONSTANT_2).add(CONSTANT_3).longValue();
}

private int generateRandomValue() {
    final long CONSTANT_1 = 1329431966L;
    final double CONSTANT_2 = 0.946270391;
    double result = 0.0;
    double copyOfInit = this.randomValue;

    System.out.println(System.getProperty("line.separator") + "Copy of init: " + copyOfInit);
    System.out.printf("Copy of init: %f\n", copyOfInit);
    double sinInit = Math.sin(copyOfInit); System.out.println("Sinus: " + sinInit);  
    System.out.printf("Sinus: %f\n", sinInit);
    System.out.println("Sinus gerundet: " + Math.round(sinInit*1000000)/1000000.0d);
    BigInteger b =        BigInteger.valueOf(this.randomValue).multiply(BigInteger.valueOf(CONSTANT_1)).divideAndRemainder(uint64MaxValue)[1];

    System.out.println("Init * Konstante 1: " + b);
    BigDecimal bd = new BigDecimal(b.toString());
    //bd.add(BigDecimal.valueOf(sinInit));

    //result = t + sinInit; System.out.println("Multi + Sinus: " + result);
    result = bd.add(BigDecimal.valueOf(sinInit)).doubleValue(); System.out.println("Multi + Sinus: " + result);
    result *= (long) this.randomValue; System.out.println("Zwischenergebnis * init: " + result);
    result += CONSTANT_2; System.out.println("Konstante 2 addiert: " + result);
    System.out.printf("BigD: %s", BigDecimal.valueOf(result).multiply(BigDecimal.valueOf(randomValue)));
    result *= this.randomValue; System.out.printf("Erneut mit init multipliziert: %f", result);
    double l = Math.log((long)this.randomValue); System.out.println("Log von init: " + l);
    result += l; System.out.printf("+= mit Log: %f\n", result);

    this.randomValue = (long)result; System.out.printf("Ende: %d\n", this.randomValue);

    this.randomValue = Double.doubleToRawLongBits(result);

    return (int)this.randomValue;   
}

This looks interesting. Can you simplify your code down to the minimum amount required to reproduce the error so that we can rule out the possibility that there's another bug somewhere in the code? — templatetypedef, Oct 28 '13 at 21:01
What @templatetypedef said, an SSCCE would be useful here. Also can you post your Java source to rule out there is not a possibility of an error during the port. Although I'm sure there is a possibility that the result of sin(x) may differ from one language to the next (due to calculating it slightly differently) it should not be that large. — Radiodef, Oct 28 '13 at 21:05
It makes little sense to take the sine of such a large number. Sine values are only interesting within the range [0,2*PI]. For all values, `sin(x) = sin(x % (2*PI))`. When you use such a large number, you've lost all significance in the range which affects the value of the sine calculation. You need to reconsider your algorithm. — pburka, Oct 28 '13 at 21:06
Okay. I will try to shorten the code as requested. But to understand how the large numbers are been created you might need the initial_value_generator function. Keep in mind: I need to port this software and it is important to get the same values. So it is not important to get the correct result of the sine. It is more important to get the exact result of the C++ program under Windows. This is a code of a virus which is creating domains out of this values. Goal is to block those domains later. So because of this I need the exact behavior. — Markus Schuhmacher, Oct 28 '13 at 21:12
In order to ensure that your results are reproducible, you'll want to make sure your Java class is marked `strict`. This will inhibit certain optimizations. The next task is to figure out why Windows gives a different answer than Java. One thing to try would be to explicitly reduce the number down: `Math.sin(x % (2 * Math.PI))`. Mathematically this is identical, but it may give you different error characteristics which more closely mimic Windows. — pburka, Oct 28 '13 at 21:28
If I try to reduce the number down using modulo and put this into sin I will get (as far it is accurate enough) the exact sine result of the number mentioned above but not close to the result of the Windows sine function. — Markus Schuhmacher, Oct 28 '13 at 22:12
If I use sin(fmod(5174852443848405000.0, (2.0 * PI))) the results are the same on Windows and on Linux. But I guess this won't help me or am I wrong? By the way. The result of sine of this value is 0.153662 on Linux and 0.160242 at Windows. — Markus Schuhmacher, Oct 28 '13 at 22:26
Are buffer overflows handled in a different way in Java and Windows regarding to double? — Markus Schuhmacher, Oct 29 '13 at 20:13
Check http://stackoverflow.com/questions/6665418/standard-for-the-sine-of-very-large-numbers especially K.C. Ng's "Good to the Last Bit". Disagree with @pburka concerning 'For all values, sin(x) = sin(x % (2*PI))". Though true mathematically, this is not true in programming languages for simply PI is not exactly representable (its transcendental). Ng's article goes over the issue in detail. — chux - Reinstate Monica, Oct 30 '13 at 20:01

score 2 · Answer 1 · answered Oct 28 '13 at 21:31

2

The trigonometric functions are library functions with a reasonably vague specification. For example, here is what the C standard has to say on this topic (7.12.4.6):

The sin functions

Synopsis

#include <math.h>
double sin(double x);
float sinf(float x);
long double sinl(long double x);

Description

The sin functions compute the sine of x (measured in radians).

Returns

The sin functions return sin x

As such, they will use different algorithms and different accuracy, i.e., using the library versions you won't get exactly the same results. For example, different libraries may make different trade-offs between accuracy and computation speed. Even if the library implementations were exactly the same, you probably won't get exactly the same results on different systems as the values may be rounded at different points throughout the calculations. To get reasonably close results between different platforms you may need to implement the same algorithms on these platforms.

Note, that sin(x) clearly provides the best results in the range [0, π/2]. Passing a huge number to sin(x) will probably create a fairly bad approximation although I would expect that most implementations would start off with mapping x into the range given above before doing any computations. Ideally, you'd avoid the large values right from the start and express them in terms of multiples of π instead. However, you'll likely get different results from different implementations even when x falls into the range above.

answered Oct 28 '13 at 21:31

Dietmar Kühl

150,225
13
225
380

1

Is it possible to get the implementation of the sine function which Microsoft uses in Visual Studio to try at least porting this? If a huge number passed in the sine function will result a bad approximation then this way might be a good choice by the author to make it harder to port this function into other languages. Does it makes sense to use Wine emulator executing this code if I must use Linux? Do you thing this might work? – Markus Schuhmacher Oct 28 '13 at 21:44
You may be able to get Microsoft's implementation from [Dinkumware](http://dinkumware.com): Dinkumware provided the standard C and C++ libraries for Microsoft as far as I know (BTW, looking at the fairly old version I got, it seems the first thing the implementation does is to scale large values into a preferred range). Alternatively, you might want to consistently use your own implementation, e.g., based on [Numerical Recipes](http://en.wikipedia.org/wiki/Numerical_Recipes). – Dietmar Kühl Oct 28 '13 at 21:55
1

If I understand the comment thread correctly, the goal isn't to calculate `sin(x)`, it's to reproduce Dinkumware's implementation of `sin(x)` down to the last bit, for all inputs. – MSalters Oct 28 '13 at 23:21
@MarkusSchuhmacher: you can use the [Taylor series for sin x](http://www.math.sc.edu/~girardi/m142/handouts/10sTaylorPolySeries.pdf) for the algo. – ChiefTwoPencils Oct 28 '13 at 23:54
1

@BobbyDigital: The Taylor series converges fairly slowly. Normally, a specifically constructed polynomial with just a few terms (5 or 6) is used in a fairly limited range, e.g., `[0, π/2]`: the result is as accurate (or, possibly, even more accurate) as using the Taylor series but it is a lot faster. To place the argument into the supported range, basic properties of `sin(x)` are used, e.g., `sin(x) = sin(x + 2 * π)`, `sin(x) = -sin(-x)`, `sin(x) = sin(x + π)`, etc. – Dietmar Kühl Oct 29 '13 at 00:02
@MSalters You are totally correct with your understanding. That is the point - the correct calculation of sin(x) is not important. I could try with the Taylor series - even if it take a bit - but I don't know if I gonna get this one specific value. For that I must know how the function is implemented. Maybe there are more efficient ways to calculate sine and the Taylor series is not used in the sin(x) function. But at least I could try. – Markus Schuhmacher Oct 29 '13 at 07:44
@MarkusSchuhmacher: Taylor would be a horrible choice given your arguments. You'd need a few million terms, most of which are _way_ outside the range of `double`. Besides, there are quite a few ways to implement `sin(x)` - I know, I did it. For instance, `sin(x)==x` for small x. But how do you determine small? I did so by checking exponent bits, but that's just one way. I don't know how Dinkumware did it (or even if it was a library function, could have been an intrinsic) – MSalters Oct 29 '13 at 08:24

results of sine function different between microsoft and Java / Linux

1 Answers1