2

I solved Project Euler #14 using the following code:

import time
start_time = time.time()
def collatz_problem(n):
    count = 0
    while n!=1:
        if n%2==0:
            n = n/2
            count = count+1
        elif n%2!=0:
            n = 3*n+1
            count = count +1
    return count+1


def longest_chain():
    max_len,num = 1,1
    for i in xrange(13,1000000):
        chain_length = collatz_problem(i)
        if chain_length > max_len:
            max_len = chain_length
            num = i

    return num

print longest_chain()
print time.time() - start_time, "seconds"

The above solution took around ~35 seconds to run. Now, I tried another solution from here.

Solution:

import time
start_time = time.time()
cache = { 1: 1 }
def chain(cache, n):
     if not cache.get(n,0):
         if n % 2: cache[n] = 1 + chain(cache, 3*n + 1)
         else: cache[n] = 1 + chain(cache, n/2)
     return cache[n]
m,n = 0,0
for i in xrange(1, 1000000):
    c = chain(cache, i)
    if c > m: m,n = c,i

print n
print time.time() - start_time, "seconds"

Now, this solution only took ~3.5 seconds.

First Question:

Now, as I am a python beginner I cannot see why there is so much of a difference in these two approaches and how can I modify my code to make it more effecient.

Second Question:

While solving project euler questions is there any time constraints one should keep in mind and is my code really that ineffecient.

Cœur
  • 37,241
  • 25
  • 195
  • 267
ronnie
  • 1,799
  • 3
  • 17
  • 22

1 Answers1

8

In the first version you may calculate the length of some chains several times because they are subchains in other chains.

In the second solution you are only calculating the length of each chain once because of the caching. This optimization technique is called memoization.


A more dramatic example of memoization is calculation of the Fibonacci numbers. Here's the simple recursive solution:

def fib(n):
    if n < 2:
        return n
    else:
        return fib(n-1) + fib(n-2)

It takes exponential time because fib(n) evaluates fib(n-1) and fib(n-2), but fib(n-1) also evaluates fib(n-2), so you end up doing the exact same calculation again. Try calculating fib(35) or higher with this algorithm.

By caching the results of fib(x) for each x you can avoid recalculating the same result, improving the performance to linear time.

def fib2(n):
    if n < 2:
        return n
    elif n in cache:
        return cache[n]
    else:
        result = fib2(n-1) + fib2(n-2)
        cache[n] = result
        return result

Related

Community
  • 1
  • 1
Mark Byers
  • 811,555
  • 193
  • 1,581
  • 1,452
  • 1
    Mmm.. you've already posted an example... do you want a different example? I can show Fibonacci with memoization. – Mark Byers Jul 28 '12 at 08:22