
I was worried about C#'s speed for heavy calculations, the kind of work that needs raw CPU power.

I had always assumed that C++ is much faster than C# at this kind of work, so I ran some quick tests. The first test counts the prime numbers below an integer n; the second counts pandigital numbers. The idea for the second test comes from here: Pandigital Numbers

C# prime computation:

using System;
using System.Diagnostics;

class Program
{

    static int primes(int n)
    {

        uint i, j;
        int countprimes = 0;

        for (i = 1; i <= n; i++)
        {
            bool isprime = true;

            for (j = 2; j <= Math.Sqrt(i); j++)

                if ((i % j) == 0)
                {
                    isprime = false;
                    break;
                }

            if (isprime) countprimes++;
        }

        return countprimes;
    }



    static void Main(string[] args)
    {
        int n = int.Parse(Console.ReadLine());
        Stopwatch sw = new Stopwatch();

        sw.Start();
        int res = primes(n);
        sw.Stop();
        Console.WriteLine("I found {0} prime numbers between 0 and {1} in {2} msecs.", res, n, sw.ElapsedMilliseconds);
        Console.ReadKey();
    }
}

C++ variant:

#include <iostream>
#include <ctime>
#include <cmath>

using namespace std;

int primes(unsigned long n) {
    unsigned long i, j;
    int countprimes = 0;
    for (i = 1; i <= n; i++) {
        int isprime = 1;
        for (j = 2; j < sqrt((float)i); j++)
            if (!(i % j)) {
                isprime = 0;
                break;
            }
        countprimes += isprime;
    }
    return countprimes;
}

int main() {
    int n, res;
    cin >> n;
    unsigned int start = clock();

    res = primes(n);
    int tprime = clock() - start;
    cout << "\nI found " << res << " prime numbers between 1 and " << n << " in " << tprime << " msecs.";
    return 0;
}

When I ran the test for primes < 100,000, the C# variant finished in 0.409 seconds and the C++ variant in 0.614 seconds. For 1,000,000, C# finished in 6.039 seconds and C++ in about 12.987 seconds.

Pandigital test in C#:

using System;
using System.Diagnostics;

class Program
{
    static bool IsPandigital(int n)
    {
        int digits = 0; int count = 0; int tmp;

        for (; n > 0; n /= 10, ++count)
        {
            if ((tmp = digits) == (digits |= 1 << (n - ((n / 10) * 10) - 1)))
                return false;
        }

        return digits == (1 << count) - 1;
    }

    static void Main()
    {
        int pans = 0;
        Stopwatch sw = new Stopwatch();
        sw.Start();

        for (int i = 1; i <= 123456789; i++)
        {
            if (IsPandigital(i))
            {
                pans++;
            }
        }
        sw.Stop();
        Console.WriteLine("{0}pcs, {1}ms", pans, sw.ElapsedMilliseconds);
        Console.ReadKey();
    }
}

Pandigital test in C++:

#include <iostream>
#include <ctime>

using namespace std;

int IsPandigital(int n)
    {
        int digits = 0; int count = 0; int tmp;

        for (; n > 0; n /= 10, ++count)
        {
            if ((tmp = digits) == (digits |= 1 << (n - ((n / 10) * 10) - 1)))
                return 0;
        }

        return digits == (1 << count) - 1;
    }


int main() {
   int pans = 0;
   unsigned int start = clock();

   for (int i = 1; i <= 123456789; i++)
   {
      if (IsPandigital(i))
      {
        pans++;
      }
   }
   int ptime = clock() - start;
   cout<<"\nPans:"<<pans<<" time:"<<ptime;  
   return 0;
}

The C# variant runs in 29.906 seconds and the C++ one in about 36.298 seconds.

I didn't touch any compiler switches, and both the C# and C++ programs were compiled with debug options. Before running the test I was worried that C# would lag well behind C++, but now it seems there is a pretty big speed difference in C#'s favor.

Can anybody explain this? C# is jitted and C++ is compiled to native code, so I expected the C++ variant to be faster.

Thanks for the answers!

I've redone all the tests with the Release configuration.

First test (prime numbers)

C# (numbers < 100,000): 0.189 seconds C++ (numbers < 100,000): 0.036 seconds

C# (numbers < 1,000,000): 5.300 seconds C++ (numbers < 1,000,000): 1.166 seconds

Second test (pandigital numbers):

C#: 21.224 seconds C++: 4.104 seconds

So everything has changed; now C++ is much faster. My mistake was running the tests in the Debug configuration. Would I see a speed improvement if I ran the C# executables through ngen?

The reason I compared C# and C++ is that I know the basics of both and I wanted to learn an API for building GUIs. WPF looks nice, and since I'm targeting the desktop I wanted to see whether C# can deliver enough performance for computations that use sheer CPU power (file archivers, cryptography, codecs, etc.). But it seems that, sadly, C# can't keep pace with C++ when it comes to speed.

So I'm assuming I'll be forever stuck with this question: Tough question on WPF, Win32, MFC, and I'll never find a suitable API.

Mack
    "Can anybody explain this? C# is jitted and C++ is compiled native so it's normal that a C++ will be faster than a C# variant." <-- Not under debug mode. – Billy ONeal Mar 23 '10 at 00:49
    "...C++ programs were compiled with debug options." So then why are we worrying about performance? – GManNickG Mar 23 '10 at 00:50
    ` = 2; j < (i^(1/2)); j++)` This code is wrong. You're doing a bitwise xor, not an exponentiation. – Billy ONeal Mar 23 '10 at 00:50
    When you fix everything and run in release mode, post your results. – spender Mar 23 '10 at 00:55
  • Very similar to http://stackoverflow.com/questions/686483/c-vs-c-big-performance-difference – Matthew Flaschen Mar 23 '10 at 01:05
    Also, as written (once the C++ is fixed) it may well calculate `sqrt(i)` on every loop iteration, in which case you'll only really be comparing square-root implementations. – Mike Seymour Mar 23 '10 at 01:05
  • @Matthew Flaschen: Similar, but not identical. The other question was dealing primarily with floating-point operations while this one is integer-oriented. Although the basic premise *is* still the same. – Aaronaught Mar 23 '10 at 01:08
  • @BillyONeal , I'm ashamed for my mistake, I've corrected it. – Mack Mar 23 '10 at 01:10
    @Mack: No reason to be ashamed. We all make mistakes :P Did you turn optimizations on for your retimings? – Billy ONeal Mar 23 '10 at 01:14
    Redid the tests. All have changed. – Mack Mar 23 '10 at 01:36
  • If you are looking for a C++ UI library, do not discount Qt. It is very nice to work with. – Swoogan Sep 15 '14 at 19:20
    just because C++ is faster doesn't mean you can't use C# for your UI or even the entire application. Just because it is slower, does not mean it can't satisfy your requirements. I would suggest trying to do what you can in C# and if the performance is what you desire after optimizing your logic (to include multi-threading where possible) then you can write some C++ libraries and invoke the needed methods from your C# application. – xtreampb Mar 01 '17 at 13:54

10 Answers

12

You need to compile the C++ in Release mode and enable optimizations to get the performance results you are looking for.

Axel Gneiting
11

The prime generator in C++ is not correct:

i^(1/2) == i xor 0

^ is the bitwise XOR operator and / is integer division, so 1/2 == 0.

1st edit: it's correct, but inefficient. Since i xor 0 == i, the trial division doesn't stop at sqrt(i) but at i.

2nd edit:

The sieving can be done a bit more efficiently (you only need to compute sqrt(n) once). This is how I implemented the Sieve of Eratosthenes for my own use (in C99, though):

#include <math.h>
#include <string.h>

void sieve(const int n, unsigned char* primes)
{
    memset(primes, 1, (n + 1) * sizeof(unsigned char));

    // sieve of Eratosthenes
    primes[0] = primes[1] = 0;
    int m = floor(sqrt(n));
    for (int i = 2; i <= m; i++)
        if (primes[i]) // no need to remove multiples of i if it is not prime
            for (int j = i; j <= (n / i); j++)
                primes[i * j] = 0;
}
sisis
  • I wonder, though, whether or not this actually makes it run slower, because the C# version has to actually evaluate `Math.Sqrt(i)` on every iteration, which is many many times more expensive than the single-instruction XOR, so this little mistake might actually make the C++ version *faster* because both versions of the program are inefficient. – Aaronaught Mar 23 '10 at 01:11
  • 2
    @Aaronaught: Except the C# version is looping to Sqrt(i) and the C++ version is looping to i. – Billy ONeal Mar 23 '10 at 01:12
  • @BillyONeal: Completely true, for large values of `n` the C++ version will always be slower because it loops more. What I meant was that the benchmark has been made sensitive to `n` and it shouldn't be. – Aaronaught Mar 23 '10 at 01:22
  • @Aaronaught: I doubt C# will recompute it. Nothing in the loop will change it. Contrarily, it should be cached in C++ explicitly. – GManNickG Mar 23 '10 at 01:27
  • @Aaronaught: True, but I think the example data size (1,000,000) qualifies as "large N" in this case :) – Billy ONeal Mar 23 '10 at 01:27
  • @GMan: *You* may know that nothing in the loop will change it, but the C# compiler cannot make that assumption. It *will* recompute it on every iteration. If you don't believe me, try it out. – Aaronaught Mar 23 '10 at 01:52
  • @Aaronaught: I find that surprising. At least in C++, if the definition of `sqrt` were visible it would never recalculate it. – GManNickG Mar 23 '10 at 02:03
  • @GMan: That is why I brought the point up; any reliance on specific compiler optimizations may skew the results. It's important to have a fair test. – Aaronaught Mar 23 '10 at 02:06
9

Why would you assume that jitted code is slower than native code? The only speed penalty would be the actual jitting, which only happens once (generally speaking). Given a program with a 30-second running time, we are talking about a minuscule portion of the total cost.

I think you may be confusing jitted code with interpreted code, which is executed line-by-line. There's a pretty significant difference between the two.

As others have pointed out, you also need to run this in release mode; debug mode turns off most optimizations, so both versions will be slower than they should be (but by different amounts).

Edit - I should point out one other thing, which is that this line:

for (j = 2; j <= Math.Sqrt(i); j++)

is incredibly inefficient and may interfere with the benchmarking. You should calculate Math.Sqrt(i) outside the inner loop. It's possible this slows both versions down by an equivalent amount, but I'm not sure; different compilers perform different optimizations.

Aaronaught
    +1. C++ suffers more from the optimizations turned off because it relies on inlining to be performant. Also, the JIT is going to perform some optimizations on C# even if the binary was built in debug mode. – Billy ONeal Mar 23 '10 at 00:54
    Not only does it turn off most optimizations, it may add a bunch of extraneous error checking. Visual Studio, for example, adds overflow checking, heap checking, and various other exciting things in the default debug mode configuration. – kibibu Mar 23 '10 at 00:56
6

It's taking so much longer because the algorithm is wrong.

for(j = 2; j < (i^(1/2)); j++) 

is the same as

for(j = 2; j < (i^0); j++) 

is the same as

for(j = 2; j < i; j++) 

i is a lot bigger than sqrt(i). Looking at just the running time, it's an order of magnitude larger than it should be in the C++ implementation.

Also, like everybody else is saying, I don't think it makes sense to do performance testing in debug mode.

Chris
3

Recompile the C++ program with full optimizations turned on and rerun the tests. The C# JIT optimizes the code when it's jitted, so you compared optimized C#/.NET code against unoptimized C++.

Brian Ensink
2

First, never do such benchmarks in Debug mode. To get meaningful numbers, always use Release mode.

The JIT has the advantage of knowing the exact platform it runs on, while precompiled code may be suboptimal for the platform it ends up running on.

Lucero
2

It is a persistent myth that the JIT compiler in managed code generates machine code that is a lot less efficient than what a C/C++ compiler produces. Managed code usually wins on memory management and floating-point math; C/C++ usually wins when the code optimizer can spend a lot more time optimizing. In general, managed code reaches about 80% of native speed, but it completely depends on the 10% of the code where a program spends 90% of its time.

Your test won't show this; you didn't enable the optimizer, and there isn't much to optimize.

Hans Passant
  • "Managed code usually wins on memory management" <-- Because C/C++ programmers are typically lazy and just use giant buffers instead of something like a `std::vector` which is common in JIT'd languages. The speeds of both the compiled code and the JIT'd code should be about the same, all said and done, so long as you discount the time it takes to load the jitter into memory (which can be quite large in Java's case :P ) +1 – Billy ONeal Mar 23 '10 at 01:06
    The managed code can hardly win on floating point operations because the current .Net JIT doesn't emit any SSE instructions. – Jasper Bekkers Mar 23 '10 at 01:47
  • There's more than one, the x64 one does. Example: http://stackoverflow.com/questions/686483/c-vs-c-big-performance-difference/687741#687741 – Hans Passant Mar 23 '10 at 02:11
  • It seems that x64 JIT is a totally different when it comes to conception: " The 32-bit JIT and the 64-bit JIT were implemented by two different teams within Microsoft using two different code bases. The 32-bit JIT was developed by the CLR team, whereas the 64-bit JIT was developed by the Visual C++ team and is based on the Visual C++ code base. Because the 64-bit JIT was developed by the C++ team, it is more aware of C++-related issues." – Mack Mar 23 '10 at 02:43
  • Yes, the x86 JITter dates from ~1998. Was SSE2 around then? x64 dates from ~2003 and is indeed done by a different team. Not the C++ team. It never generates code from a C++ program, it generates code from IL. – Hans Passant Mar 23 '10 at 03:01
0

Both tests are invalid because you've compiled without optimizations.

The first test is meaningless even as a comparison of unoptimized behaviour, because of an error in your code: Math.Sqrt(i) returns the square root of i, while i^(1/2) returns i itself, so the C++ version is doing much more work than the C#.

More generally, this isn't a useful thing to do - you're trying to create a synthetic benchmark that has little to no relevance to real-world usage.

JoeG
0

Guys, before comparing the speed of one program to another, please take the time to read a few articles on CPU instructions, assembly, cache management, etc. And the author is just being funny, checking the performance of debug builds.

Billy O'Neal - in low-level terms, what is the difference between allocating a big buffer and using only a small part of it, versus using something dynamically allocated like a vector? Once the big buffer has been allocated, nobody cares about the unused part; no further housekeeping is needed. For a dynamic structure like vector, constant bounds checking is required so it doesn't overrun its storage. Remember, C++ programmers are not just lazy (which is quite true, I admit); they're also smart.

0

How about this:

int countprimes = 0;
for (int sqrti = 1; sqrti <= 11112; sqrti++) {
    int nexti = (sqrti + 1) * (sqrti + 1);
    for (int i = sqrti * sqrti; i < nexti; i++) {
        int isprime = 1;
        for (int j = 2; j <= sqrti; j++)  // floor(sqrt(i)) == sqrti for this whole block
            if (!(i % j)) {
                isprime = 0;
                break;
            }
        countprimes += isprime;
    }
}

Chris