17

I have simple C# and C++ code that computes a sum of dot products.

The C# code is:

using System;

namespace DotPerfTestCS
{
    class Program
    {
        struct Point3D
        {
            public double X, Y, Z;

            public Point3D(double x, double y, double z)
            {
                X = x;
                Y = y;
                Z = z;
            }
        }

        static void RunTest()
        {
            unchecked
            {
                const int numPoints = 100000;
                const int numIters = 100000000;

                Point3D[] pts = new Point3D[numPoints];
                for (int i = 0; i < numPoints; i++) pts[i] = new Point3D(i, i + 1, i + 2);

                var begin = DateTime.Now;
                double sum = 0.0;
                var u = new Point3D(1, 2, 3);
                for (int i = 0; i < numIters; i++)
                {
                    var v = pts[i % numPoints];
                    sum += u.X * v.X + u.Y * v.Y + u.Z * v.Z;
                }
                var end = DateTime.Now;
                Console.WriteLine("Sum: {0} Time elapsed: {1} ms", sum, (end - begin).TotalMilliseconds);
            }
        }

        static void Main(string[] args)
        {
            for (int i = 0; i < 5; i++) RunTest();
        }
    }
}

and the C++ is

#include <iostream>
#include <vector>
#include <time.h>

using namespace std;

typedef struct point3d
{
    double x, y, z;

    point3d(double x, double y, double z)
    {
        this->x = x;
        this->y = y;
        this->z = z;
    }
} point3d_t;

double diffclock(clock_t clock1,clock_t clock2)
{
    double diffticks=clock1-clock2;
    double diffms=(diffticks*10)/CLOCKS_PER_SEC;
    return diffms;
}

void runTest()
{
    const int numPoints = 100000;
    const int numIters = 100000000;

    vector<point3d_t> pts;
    for (int i = 0; i < numPoints; i++) pts.push_back(point3d_t(i, i + 1, i + 2));

    auto begin = clock();
    double sum = 0.0;
    point3d_t u(1, 2, 3);
    for (int i = 0; i < numIters; i++) 
    {
        point3d_t v = pts[i % numPoints];
        sum += u.x * v.x + u.y * v.y + u.z * v.z;
    }
    auto end = clock();
    cout << "Sum: " << sum << " Time elapsed: " << double(diffclock(end,begin)) << " ms" << endl;

}

int main()
{
    for (int i = 0; i < 5; i++) runTest();
    return 0;
}

The C# version (Release x86 with optimization on, x64 is even slower) output is

Sum: 30000500000000 Time elapsed: 551.0299 ms 
Sum: 30000500000000 Time elapsed: 551.0315 ms 
Sum: 30000500000000 Time elapsed: 552.0294 ms
Sum: 30000500000000 Time elapsed: 551.0316 ms 
Sum: 30000500000000 Time elapsed: 550.0315 ms

while C++ (default VS2010 Release build settings) yields

Sum: 3.00005e+013 Time elapsed: 4.27 ms
Sum: 3.00005e+013 Time elapsed: 4.27 ms
Sum: 3.00005e+013 Time elapsed: 4.25 ms
Sum: 3.00005e+013 Time elapsed: 4.25 ms
Sum: 3.00005e+013 Time elapsed: 4.25 ms

Now, I would expect the C# code to be a little slower. But 130 times slower seems way too much to me. Can someone please explain to me what is going on here?

EDIT

I am not a C++ programmer and I just took the diffclock code from somewhere on the internet without really checking whether it's correct.

Using std::difftime the C++ results are

Sum: 3.00005e+013 Time elapsed: 457 ms
Sum: 3.00005e+013 Time elapsed: 452 ms
Sum: 3.00005e+013 Time elapsed: 451 ms
Sum: 3.00005e+013 Time elapsed: 451 ms
Sum: 3.00005e+013 Time elapsed: 451 ms

which seems about right.

H H
Dave
    in the C# sample your test is taking the JITer cost into account. That's typically not done. Run the method once to get the JIT out of the way then run the test again and compare numbers – JaredPar Oct 14 '11 at 18:30
  • 1
    It only uses JIT the 1st time RunTest() is called. And it is called 5 times... – Dave Oct 14 '11 at 18:31
  • 7
    You shouldn't use `DateTime.Now` for timing things like this. Look at the `Stopwatch` class instead. (Although I doubt that is the explanation of a difference as large as this) – DeCaf Oct 14 '11 at 18:33
  • @Dave: Yeah, so it still skews the numbers. If it runs for 500ms the first time and then 20ms each time after you have a huge error in the average run time. – Ed S. Oct 14 '11 at 18:35
  • 3
    @Ed S. I am not computing the average there... – Dave Oct 14 '11 at 18:37
  • 2
    Are you running it in Release Mode WITHOUT the debugger attached? (CTRL+F5 from Visual Studio or run directly from console) – xanatos Oct 14 '11 at 18:39
  • Technically in C# you should use `List<>` instead of an array. This will slow the program a little more :-) (the `vector<>` of C++ grows dynamically like the `List<>`) – xanatos Oct 14 '11 at 18:45
  • You should not use `clock ()` in C++ for measuring performance because it yields an approximation, which is no good at all... –  Oct 14 '11 at 18:47
  • @Vlad: technically all timing functions give you an approximation. :) – jalf Oct 14 '11 at 18:47
  • @Dave: So what *are* you doing? Summing them is no better. – jalf Oct 14 '11 at 18:49
  • Your C++ performance seems strange. Even if one iteration of the loop takes a single cycle one would expect 30ms for the whole loop. – CodesInChaos Oct 14 '11 at 18:49
  • 1
    I propose a new title. "Huge understandability gap between C# and C++" – agent-j Oct 14 '11 at 18:51
  • 1
    @jalf: yes, but what `clock ()` gives you is useless... –  Oct 14 '11 at 18:51
  • @HenkHolterman Though it turned out the "huge gap" was due to incorrect timing code and the title wasn't that good anyway, this is no reason to edit away the question title into a completely different meaning. – Christian Rau Oct 14 '11 at 19:40
  • @christian - Why not? The title is the main search string, which one will give the more useful hits? – H H Oct 14 '11 at 20:01
  • @HenkHolterman But completely changing the semantics of the question is not a good idea. He had the question why there is such a huge difference and not how to measure time, even if this was the answer to his question. But I agree that his title is a bit broad and informal. I won't rollback your edit again, either, which would be quite silly, just go ahead. – Christian Rau Oct 14 '11 at 20:12
  • "But completely changing the semantics of the question is not a good idea" - a) I didn't do that, I think I fixed what was broken. – H H Oct 14 '11 at 20:13
  • @Dave: I know, which makes it an even worse way to gauge performance. – Ed S. Oct 14 '11 at 21:06

3 Answers

13

Your diffclock code is wrong.

If you change your C++ code to use std::clock and std::difftime, it appears to show the actual runtime:

#include <iostream>
#include <vector>
#include <ctime>

using namespace std;

typedef struct point3d
{
    double x, y, z;

    point3d(double x, double y, double z)
    {
        this->x = x;
        this->y = y;
        this->z = z;
    }
} point3d_t;

void runTest()
{
    const int numPoints = 100000;
    const int numIters = 100000000;

    vector<point3d_t> pts;
    for (int i = 0; i < numPoints; i++) pts.push_back(point3d_t(i, i + 1, i + 2));

    auto begin = clock();
    double sum = 0.0;
    point3d_t u(1, 2, 3);
    for (int i = 0; i < numIters; i++) 
    {
        point3d_t v = pts[i % numPoints];
        sum += u.x * v.x + u.y * v.y + u.z * v.z;
    }
    auto end = clock();
    cout << "Sum: " << sum << " Time elapsed: " << double(std::difftime(end,begin)) << " ms" << endl;

}

int main()
{
    for (int i = 0; i < 5; i++) runTest();
    return 0;
}

Results:

Sum: 3.00005e+013 Time elapsed: 346 ms
Sum: 3.00005e+013 Time elapsed: 344 ms
Sum: 3.00005e+013 Time elapsed: 346 ms
Sum: 3.00005e+013 Time elapsed: 347 ms
Sum: 3.00005e+013 Time elapsed: 347 ms

That is with the application running in Release mode with default optimizations, outside of VS2010.

EDIT

As others have pointed out, in C++ using clock() is not the most accurate way to time a function (just as, in C#, Stopwatch is better than DateTime).

If you're using Windows, you can always use QueryPerformanceCounter for high-resolution timing.
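Another portable option, assuming a C++11 compiler (newer than the VS2010 setup in the question), is std::chrono. A minimal sketch of a reusable timing helper; the name `time_ms` is illustrative, not from any library:

```cpp
#include <chrono>

// Runs fn once and returns the elapsed wall-clock time in milliseconds.
// steady_clock is used because it is monotonic: unlike the system clock,
// it never jumps backwards if the OS adjusts the time mid-measurement.
template <typename F>
double time_ms(F fn)
{
    using namespace std::chrono;
    auto begin = steady_clock::now();
    fn();
    auto end = steady_clock::now();
    // duration<double, std::milli> converts the tick difference to
    // fractional milliseconds without manual CLOCKS_PER_SEC arithmetic.
    return duration<double, std::milli>(end - begin).count();
}
```

Calling it as `time_ms([&]{ runTest(); })` would replace the clock()/difftime pair without any Windows-specific APIs.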

Christopher Currens
  • 2
    Ah yes, thank you. I guess this is what happens when you copy paste some code from internet without thinking about it. – Dave Oct 14 '11 at 18:47
6

I believe you will find your diffclock implementation yields deciseconds, not milliseconds (assuming CLOCKS_PER_SEC is accurately named). Correcting this, the C# implementation runs approximately 30% slower, which seems appropriate.
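The corrected conversion, for reference (a sketch, not part of the original answer): ticks divided by CLOCKS_PER_SEC gives seconds, so the multiplier should be 1000, not 10:

```cpp
#include <ctime>

// Converts a clock_t difference to milliseconds.
// (end - begin) / CLOCKS_PER_SEC is seconds, so multiply by 1000;
// the original code multiplied by 10, yielding deciseconds.
double diffclock_ms(clock_t end, clock_t begin)
{
    return double(end - begin) * 1000.0 / CLOCKS_PER_SEC;
}
```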

J Cracknell
0

The most obvious cause would be JIT, but once it is verified not to be the cause, I have another explanation.

"new Point3D" occurs 100000 times. This is 100000 heap allocations that are then freed later. In the C++ version, vector is also heap based, meaning when it grows, there is a realloc. But when vector grows, it grows by much more than one point3d_t each time. I expect only 30 or so realloc calls in the C++ version.

VoidStar
  • That was my first guess too, but the allocation is all before the time starts (except for `var u = new Point3D(1, 2, 3);`, but one allocation shouldn't throw it off that much). – Brendan Long Oct 14 '11 at 18:39
  • 1
    Point3d is a struct, so it's not really a heap allocation. The array is in the heap, but that's a single allocation. – agent-j Oct 14 '11 at 18:39
  • But this isn't part of the timing. – DeCaf Oct 14 '11 at 18:42