1

first question here. I am trying to learn python by stepping through project euler, and I've run into a roadblock. The following method (returns a list of prime factors) works fine for a single call:

def findPrimeFactors(num, primeFactors = []):
    '''Find the prime factors of an arbitrary positive integer

        input: num to factorize
        returns: a list containing the prime factors of the number
    '''
    pIndex = 2

    while (num >= pIndex):
        if num % pIndex == 0:
            num /= pIndex
            primeFactors.append(pIndex)
            return FindPrimes.findPrimeFactors(num, primeFactors)

        else:
            pIndex += 1

    return primeFactors

However when I use it in a loop, like so (this method may not be complete yet, currently results in infinite loop since more primes cannot be found):

def countPrimes(n = 1001):
    '''find n amount of unique primes ascending

        input: number of primes to find
        returns: list of n primes starting from 2   '''

    primes = []
    i = 2

    while len(primes) < n:
        primeFactors = FindPrimes.findPrimeFactors(i)
        print(primeFactors) #verify method behavior

        if len(primeFactors) is 1:
            primes.append(primeFactors[0])   
        i += 1

    return primes

The result is that the first loop returns [2], the next returns [2, 3], and so on, appending the new results to the list that I Wanted to have been empty on the first recursive call. It seems that my list is persisting, but I'm not sure exactly why? I read Python Class scope & lists as well which gives me some clues but the recursion complicates it more.

Recursive also means I cannot simply assign an empty set to it either. Coming from a C++ background, my expectation was that the primeFactors variable should be reinitialized each time the function is called from my program. Still a baby snake here.

EDIT: This is the iterative version of findPrimeFactors I wrote. I know it is not optimal - but I would like to at least make it efficient enough to meet Project Euler's 1 minute rule. Any suggestions for improvement or clarity are appreciated.

PRIMES = [2,3,5,7,11,13,17,19]
import math

class FindPrimes():

    '''V2 iterative'''
    def findPrimeFactors(n, primeFactors = None):
        '''Find the prime factors of an arbitrary positive integer

            input: num to factorize
            returns: a list containing the prime factors of the number
        '''

        if primeFactors is None:
            primeFactors = []

        num = n
        ceil = math.sqrt(n) #currently unused

        global PRIMES
        knownPrimes = PRIMES

        #check known primes for divisors first, then continue searching for primes by brute force
        while True:

            factorFound = False
            for prime in knownPrimes:   

                if num % prime == 0:
                    primeFactors.append(prime)
                    num /= prime
                    factorFound = True
                    break       #ensure that the list returned has ascending primes

            if not factorFound:
                break

        #once attempts have been made to reduce using known primes
        #search for new primes if the number is not fully reduced

        i = knownPrimes[-1] + 2

        while num != 1:

            if num % i == 0:
                knownPrimes.append(i)
                primeFactors.append(i)
                num /= i

            i += 2          

        return primeFactors


    def countPrimes(n = 10001):
        '''find n amount of unique primes ascending

            input: number of primes to find
            returns: list of n primes starting from 2   '''

        primes = []
        i = 2

        while len(primes) < n:

            primeFactors = FindPrimes.findPrimeFactors(i)

            if len(primeFactors) == 1:
                primes.append(primeFactors[0])
                #print(primeFactors[-1])

            i += 1

        print(len(primes))
        return primes

nth = 10001
print(FindPrimes.countPrimes(nth)[nth-1])   #print the largest prime found
Community
  • 1
  • 1
A---
  • 2,373
  • 2
  • 19
  • 25
  • 1
    Aside: regarding 'len(primeFactors) is 1', you don't want to write that. "is" is for object identity, and Python makes no guarantees that there's only one integer object corresponding to a given number. For example, try 'len(range(257)) is 257'. Simply write len(primeFactors) == 1 instead. – DSM Jun 13 '11 at 06:12
  • BTW, see http://stackoverflow.com/questions/1651154/why-are-default-arguments-evaluated-at-definition-time-in-python for an explanation of why it works like this. – DSM Jun 13 '11 at 06:15
  • That thread makes the reasoning for default value behavior much clearer. And I have modified my code per your first suggestion. Do you mean to imply that range(257) is a number with multiple integer objects corresponding to it? – A--- Jun 13 '11 at 22:42

3 Answers3

3

See "Least Astonishment" and the Mutable Default Argument

Community
  • 1
  • 1
Gilad Naor
  • 20,752
  • 14
  • 46
  • 53
1

The default value of primeFactors is being shared between calls, so when you change it, it stays changed for future calls.

Example:

def foo(bar = []):
    bar.append(1)
    return bar

print foo()
print foo()

Output:

[1]
[1, 1]

You should return a new list instead of changing the default:

def foo(bar = []):
    return bar + [1]

print foo()
print foo()

Output:

[1]
[1]
hammar
  • 138,522
  • 17
  • 304
  • 385
1

As mentioned by hammar, the default value is only created once, when the function is defined, and shared between calls.

The usual way around that is to use a marker value as the default:

def findPrimeFactors(num, primeFactors=None):
    if primeFactors is None:
        primeFactors = []
    ...

Off-topic, but your function findPrimeFactor() will recurse once for every prime factor found. Python doesn't do tail call removal, so you should probably rewrite this using iteration instead of recursion.

Remy Blank
  • 4,236
  • 2
  • 23
  • 24
  • Thanks, I rewrote my method using iteration and incorporated your suggestion using the marker value. Actually I had no choice since my script crashed when I tried to find 10001 primes. However my method does not meet the Euler Project "1 minute rule" – A--- Jun 13 '11 at 23:45
  • Presumably it failed because it reached python's recursion depth limit? Also, could you elaborate more on tail calls? From Wikipedia: **Tail calls are significant because they can be implemented without adding a new stack frame to the call stack. Most of the frame of the current procedure is not needed any more, and it can be replaced by the frame of the tail call, modified as appropriate. The program can then jump to the called subroutine. Producing such code instead of a standard call sequence is called tail call elimination, or tail call optimization.** – A--- Jun 14 '11 at 00:22
  • Does this mean that Python creates a new stack frame for every recursive call in my method? – A--- Jun 14 '11 at 00:24
  • Yes, that is exactly what it means, and Python (by default) only lets you have 1000 stack frames. You can change the maximum stack size, but it's better to redesign your algorithm. – kindall Jun 14 '11 at 00:37
  • @kindall thanks for the clarification @RemyBlank I included the iterative version of findPrimeFactors in my code - I am trying to improve it – A--- Jun 14 '11 at 00:41