2

The running time of the following program(which is inside function prime)is 110.726383227 seconds

If I run the same program without wrapping it in a function(prime) it's running time is 222.006502634 seconds

I made a significant performance improvement by wrapping it in a function.

still is there any possibility increase the efficiency of this program ?

# This is a program to find sum of prime numbers till 2 billion

def prime():
import time
start_time = time.clock()

num = 2
j=1
tot_sum = 0

for num in xrange(2,2000000): 
    count = 0
    for j in xrange(1,int(num**0.5)+1): # checking from 1 to sqrt(n)+1
        if(num % j == 0):
            count = count+1

    if(count == 1):
        tot_sum = tot_sum + num

print "total sum is %d" % tot_sum

print time.clock() - start_time, "seconds"
Mark Hildreth
  • 42,023
  • 11
  • 120
  • 109
AnV
  • 2,794
  • 3
  • 32
  • 43
  • Use `timeit` module to test program's timings not `time.clock()`. – Ashwini Chaudhary Oct 13 '13 at 19:01
  • @hcwhsa `timeit` is great, but it's impractical when a single run takes several minutes. It's designed for small snippets. And using `timeit` to time a single run is of no use (it uses a better stopwatch, but that's hardly useful at this time scale). –  Oct 13 '13 at 19:06
  • I think it's because it only has to deal with the local namespace (STORE_FAST instruction) rather than both the local and global namespace (STORE_NAME instruction). The local namespace uses registers while the global namespace stores its names and objects in RAM. I could be wrong though. – Shashank Oct 13 '13 at 19:08
  • 1
    You can tweak a few more percent out of this program's performance with tricks like that, but the real culprit here is the fundamental algorithm. A well-written prime sieve should produce the primes up to 2000000 almost instantly (certainly less than a second). Look at http://stackoverflow.com/a/18049610/2192494 – Lee Daniel Crocker Oct 13 '13 at 19:29

2 Answers2

1

If you want to solve it without external libs, there are some obvious improvements you can make:

def prime():
    import time
    start_time = time.clock()

    tot_sum = 2

    for num in xrange( 3, 2000000, 2 ): 
            isPrime = True
            for j in xrange(3, int( num ** 0.5 ) + 1, 2 ):
                if( num % j == 0 ):
                    isPrime = False
                    break

            if isPrime:
                tot_sum = tot_sum + num

    print "total sum is %d" % tot_sum

    print time.clock() - start_time, "seconds"

prime()

Not checking greater than 2 even numbers, not checking all the divisors if one found. Your original code runs on my machine in 172.924809 seconds while mine runs in 8.492169 seconds.

If using external libs is allowed, I'd suggest gmpy2:

def prime():
    from gmpy2 import is_prime
    import time
    start_time = time.clock()

    print "total sum is %d" % (sum(filter(is_prime, xrange(3,2000000,2)))+2)

    print time.clock() - start_time, "seconds"

prime()

This runs in 1.691812 seconds

Mkoch
  • 1,994
  • 2
  • 13
  • 21
0

This may have to do with how python resolves variables. Roughly, when you enter a function, python creates a new namespace, which essentially maps to all variables local to that function. Python can then use a namespace to determine the values of all variables that a programmer is accessing. The order of variable name resolution is as follows:

  • Lookup variable name in local namespace
  • Lookup variable name in global namespace
  • Lookup name in python's built-ins.

Performing lookups can be expensive, and a general performance tip in Python is to use local variables for just this reason: at minimum, it will avoid doing two lookups instead of one. Additionally, newer python compilers also seem to be doing additional optimizations to remove even the single lookup into the local namespace, but just treats the variable as an immediate value.

A good way of testing whether this optimization happens just because of namespace lookups might be (barring other optimizations that I don't know about) to wrap all your logic in a function, but to make all the variables that you're using global. If everything slows down, you might guess that it's the namespace lookup that's taking so much time.

Behram Mistree
  • 4,088
  • 3
  • 19
  • 21