
The surprising output of the code below shows that arithmetic on double is almost 100% faster than on long:

Test_DivOperator Float arithmetic measured time: 15974.5024 ms.

Test_DivOperator Integer arithmetic measured time: 28548.183 ms.

The build settings used are .NET 4.5, C# 5.0 (platform target: x64).

The hardware used is an Intel Core i5-2520M (running Windows 7 64-bit).

Note: the operator used (here, division) does affect the results; division maximizes this observation.

// Note: requires using System; (Random, Console) and using System.Diagnostics; (Process)
const int numOfIterations = 1; //this value takes memory access out of the game
const int numOfRepetitions = 500000000; //CPU-bound application
Random rand = new Random();
double[] Operand1 = new double[numOfIterations];
double[] Operand2 = new double[numOfIterations];
double[] Operand3 = new double[numOfIterations];

long[] Int64Operand1 = new long[numOfIterations];
long[] Int64Operand2 = new long[numOfIterations];
long[] Int64Operand3 = new long[numOfIterations];

for (int i = 0; i < numOfIterations; i++)
{
    Operand1[i] = rand.NextDouble() * 100;
    Operand2[i] = rand.NextDouble() * 80;
    Operand3[i] = rand.NextDouble() * 17;
    Int64Operand1[i] = (long)Operand1[i];
    Int64Operand2[i] = (long)Operand2[i] + 1; // +1 guards against division by zero
    Int64Operand3[i] = (long)Operand3[i] + 1;
}

double[] StdResult = new double[numOfIterations];
long[] NewResult = new long[numOfIterations];

TimeSpan begin = Process.GetCurrentProcess().TotalProcessorTime;

for (int j = 0; j < numOfRepetitions; j++)
{
    for (int i = 0; i < numOfIterations; i++)
    {
        double result = Operand1[i] / Operand2[i];
        result = result / Operand3[i];
        StdResult[i] = result;
    }

}

TimeSpan end = Process.GetCurrentProcess().TotalProcessorTime;
Console.WriteLine("Test_DivOperator Float arithmetic measured time: " + (end - begin).TotalMilliseconds + " ms.");

begin = Process.GetCurrentProcess().TotalProcessorTime;

for (int j = 0; j < numOfRepetitions; j++)
{
    for (int i = 0; i < numOfIterations; i++)
    {
        long result = Int64Operand1[i] / Int64Operand2[i];
        result = result / Int64Operand3[i];
        NewResult[i] = result;
    }

}

end = Process.GetCurrentProcess().TotalProcessorTime;
Console.WriteLine("Test_DivOperator Integer arithmetic measured time: " + (end - begin).TotalMilliseconds + " ms.");
Ahmed Khalaf
  • Storing items in an array of undefined length can also introduce overhead due to reallocation. This introduces an extra variable into your test. – GolezTrol Jul 18 '15 at 13:02
  • Related reading material: [Optimizing integer divisions](http://www.codeproject.com/Articles/17480/Optimizing-integer-divisions-with-Multiply-Shift-i). – GolezTrol Jul 18 '15 at 13:07
  • @GolezTrol allocation isn't done within the measured time; all arrays are of fixed size. Also, the two loops are identical. – Ahmed Khalaf Jul 18 '15 at 13:22

1 Answer


This isn't unexpected. 64bit integer division is just that slow.

Your processor is a Sandy Bridge. Looking at the table of latencies and throughputs, 64bit idiv has far higher latency and much worse throughput than divsd.

Other microarchitectures show a similar difference.
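
To make the latency point concrete, here is a minimal sketch (my own addition; the class name DivLatency and the iteration count are hypothetical): each division feeds the next, so the loops measure division latency rather than throughput. The divisors are derived from args.Length so the JIT sees variables and cannot strength-reduce the integer division into a multiply-shift.

using System;
using System.Diagnostics;

class DivLatency
{
    static void Main(string[] args)
    {
        const int n = 100000000;
        // Divisors taken from args.Length so they are not compile-time constants.
        long intDiv = 3 + args.Length;
        double dblDiv = 1.0000001 + args.Length;

        double d = 1e15;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++)
            d = d / dblDiv + 1.0;            // dependent chain of double divisions
        sw.Stop();
        Console.WriteLine("double: " + sw.Elapsed.TotalMilliseconds + " ms (" + d + ")");

        long q = long.MaxValue;
        sw = Stopwatch.StartNew();
        for (int i = 0; i < n; i++)
            q = q / intDiv + 1000000000000L; // dependent chain of 64bit integer divisions
        sw.Stop();
        Console.WriteLine("long:   " + sw.Elapsed.TotalMilliseconds + " ms (" + q + ")");
    }
}

On this kind of hardware one would expect the long loop to come out clearly slower per iteration than the double loop, mirroring the idiv-vs-divsd gap.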

Doing the actual math: 2.8548183E10 ns / 500000000 ≈ 57 ns per iteration; at a frequency of 3.2GHz that's about 183 cycles. There are two divisions and some additional loop overhead, so that is not weird.

For doubles, it works out to about 32 ns, or 102 cycles, which is actually more than I would have expected.
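
For reference, these back-of-the-envelope numbers can be reproduced in a few lines; the 3.2GHz figure is an assumption about the i5-2520M's turbo clock, and the measured times are taken from the question:

using System;

class CycleEstimate
{
    static void Main()
    {
        const double clockGHz = 3.2;             // assumed turbo clock of the i5-2520M
        const double repetitions = 500000000.0;  // numOfRepetitions from the question

        double[] measuredMs = { 28548.183, 15974.5024 }; // long, double (from the question)
        string[] label = { "long  ", "double" };

        for (int i = 0; i < measuredMs.Length; i++)
        {
            double nsPerIter = measuredMs[i] * 1e6 / repetitions; // ms -> ns, per iteration
            double cycles = nsPerIter * clockGHz;                 // at 1 GHz, 1 ns = 1 cycle
            Console.WriteLine(label[i] + ": " + nsPerIter.ToString("F1") + " ns/iter = "
                              + cycles.ToString("F0") + " cycles");
        }
    }
}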

harold
  • Thanks! When compiling for x86 (32-bit) I got better results for both operations, but integer division is still worse; does this also make sense? Test_DivOperator Float arithmetic measured time: 8283.6531 ms. Test_DivOperator Integer arithmetic measured time: 13384.8858 ms. It seems the common assumption that integer arithmetic is simpler than floating-point arithmetic, and hence faster, no longer holds, at least on this microarchitecture. Do you have further info about that? – Ahmed Khalaf Jul 19 '15 at 15:59
  • @AhmedKhalaf it's all in the table I linked to: integer arithmetic is generally fast, just not division. 32bit division isn't nearly as bad as 64bit division, but float division is also faster than double division, so it still wins. – harold Jul 19 '15 at 16:11
  • I meant: why would integer division be slower than floating-point division (for a comparable number of bits, 64bit long vs double)? On paper, integer division is simpler than floating-point. Additionally, experimental results for the above code using addition instead give 5740.8368 ms for double vs. 6957.6446 ms for 64bit integer (compiling for x86, not x64). – Ahmed Khalaf Jul 19 '15 at 16:21
  • @AhmedKhalaf integers have more bits to divide than a similarly sized floating-point number (64bit division really divides a 128bit number by a 64bit number, whereas a double only has 53 bits to divide; see the sketch after this comment thread), and division is largely a sequential process. Also, floating-point division seems to be favoured somehow, being directly built in with its own µops, whereas integer division uses a bunch of µops. On top of that there's also overhead that the JIT puts there. – harold Jul 19 '15 at 16:44
  • Assuming this reasoning is true for division, how does it still make sense for addition? Are you aware of a similar comparison based on C++/unmanaged code? – Ahmed Khalaf Jul 20 '15 at 12:55
  • @AhmedKhalaf 64bit addition is a bit trickier to implement in 32bit mode, it should be faster than double addition in 64bit mode. – harold Jul 20 '15 at 12:56
  • I agree 64-bit addition is trickier in 32-bit mode, but I don't see how this correlates with double arithmetic being faster than long. In addition, if I compile for 64bit, the results become slower for both: 8143.2522 ms for double vs. 8158.8523 ms for long. Wouldn't this be surprising? – Ahmed Khalaf Jul 20 '15 at 18:31
  • Okay, will try C++/Assembly and see how it goes :D – Ahmed Khalaf Jul 20 '15 at 18:52
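
As a footnote to the "53 bits" point in the comments above: a double's significand holds 53 bits, so integers beyond 2^53 silently lose precision, while a long keeps all 64 bits. A quick illustration (my own sketch, not from the thread):

using System;

class MantissaBits
{
    static void Main()
    {
        double big = 1L << 53;  // 2^53 = 9007199254740992, exactly representable as a double
        Console.WriteLine(big + 1 == big);              // True: the +1 falls off the 53-bit significand
        Console.WriteLine((long)big + 1 == (long)big);  // False: a long keeps all 64 bits
    }
}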