0

The following is my answer to Project Euler's third problem. Here's the python implementation:

def isPrime(n):
    for i in range(n/2)[3::2]:
        if n % i == 0:
            return False
    return True

def largestPrime(n):
    for i in range(3, n/2):
        if n % i == 0 and isPrime(n/i):
            return n/i
    return -1

largestPrime(600851475143)

And here's the same implementation rewritten in C++:

#include <iostream>
using namespace std;

bool isPrime(long n)
{
    for(int i = 3; i < n/2; i+=2)
    {
        if(n % i == 0)
            return false;
    }
    return true;
}

long largestPrime(long n)
{
    for(int i = 2; i < n/2; i++)
    {
        if(n % i == 0 and isPrime(n/i))
            return n/i;
    }
    return -1;
}

int main()
{
    cout << largestPrime(600851475143) << endl;
    return 0;
}

When I run the C++ code, it computes the correct answer (6857) within seconds. But when I run the python code, it seems to go on forever. What's is it that python performance is so poor here?

hmir
  • 1,333
  • 2
  • 17
  • 24
  • `Python` is the guy who opens the door and gets shot. `C++` is the one who knocks. – tourniquet_grab Mar 24 '15 at 04:23
  • I'm not familiar with python, but does `range` work with 64 bit integers? – Matthew Mar 24 '15 at 04:23
  • What's `sizeof(long)` for your C++ program? If it's 32 bits, the number you're passing to `largestPrime` will be truncated, and if the answer's correct it's likely the the algorithm coincidentally doesn't need the the higher order bits, but it might not work reliably for all other numbers. Python - on the other hand - uses something logically akin to the C++ GMP library to handle arbitrary-length integers, albeit much slower. – Tony Delroy Mar 24 '15 at 04:23
  • Here's a hint: Suppose a*b=q, so q is not a prime. We can assume that `a<=b`. In that case `a*a<=q`. Why then do we need to check `q/2`? – rici Mar 24 '15 at 04:39
  • @rici: The question isn't "why is my algorithm slow", it is "why is the same algorithm so much slower in my Python version". (But yes, checking up to `sqrt(n)` is the general practice.) – Amadan Mar 24 '15 at 04:54
  • @amadan: that's why the above is a comment and not an answer. – rici Mar 24 '15 at 05:00
  • 3
    You have to use `xrange` instead of `range` in the Python code. Otherwise, you will get a huge runtime only due to huge memory allocations for those large ranges. The `xrange` generates the numbers as you go, as does the C++ for-loop. With that change, when I run the code, I consistently see the Python code being **5 times slower** than C++, which is expected and matches other simple benchmark tests like these that I have seen before. – Mikael Persson Mar 24 '15 at 05:25
  • @MikaelPersson: Wow, that did it. While the C++ code is still faster, the python code now finishes in 8 seconds. Good call. – hmir Mar 24 '15 at 14:46

3 Answers3

2

Its because Python is an interpreted language while C++ is a compiled language.

See this stack overflow question for a more in-depth description of the differences between the two. Note that this touches the surface of it.

Compiled vs. Interpreted Languages

Also refer to this page on the Python site for brief descriptions of Python compared to other programming languages.

https://www.python.org/doc/essays/comparisons/

Community
  • 1
  • 1
Cuban Pete
  • 111
  • 1
  • 8
2

A) Python is interpreted, C is compiled. The latter class is almost always faster than the former.

B) isPrime is executed a number of times. In it, you have range(n/2)[3::2], which will construct (a rather large) array many times. In contrast, your C code only has a simple loop, without memory allocation and initialisation.

C) Tony D's question comment might well have merit. Check your long size.

Amadan
  • 191,408
  • 23
  • 240
  • 301
  • 1
    `cout`ing the `sizeof(long)` prints out `8` to the console. I also ran the code using `long long` and it took about the same amount of time. – hmir Mar 24 '15 at 04:31
  • Then it's not C). Try to replace `range(n/2)[3::2]` with `range(3, n/2, 2)`, which should give you an iterator without constructing a list to eliminate B). There's nothing that can be done about A)... except changing the algorithm (e.g. memoizing `isPrime` results as you go so you don't have to recalculate them all the time). – Amadan Mar 24 '15 at 04:36
  • Nope - same result. I even tried using `while i < n/2`. I assumed C++ was faster but I didn't think the fact that it was a compiled language could affect performance so drastically. – hmir Mar 24 '15 at 04:41
  • @hmir: you might develop a better feel for this by looking at online comparisons - e.g. http://benchmarksgame.alioth.debian.org/u64q/which-programs-are-fastest.html – Tony Delroy Mar 24 '15 at 04:58
0

Python is, as previously mentioned, a interpreted language, but another drawback is you are more than likely running it through an IDE such as IDLE or IDLE3. These are made With Python and can take up more cpu usage than a compiled language, such as C++. I hope this helped in any way in your favor. ~Th3T3ch13