I am learning Python (I have some background in other languages, notably C, etc.), and I am attempting to write some simple functions as a way to reinforce what I have read in the Python tutorial. In particular, here are my attempts at a function which determines whether a number is composite:

import math

def isComposite(n):
    """Straight loop."""
    for x in range(2, int(math.sqrt(n)) + 1):
        if n % x == 0:
            return x
    return False

_PrimeList = [2]

def isCompPL(n):
    """Recursive version."""
    for x in _PrimeList:
        if n % x == 0:
            if n == x:
                return False
            return x
    for x in range(_PrimeList[-1], int(math.sqrt(n)) + 1):
        if not isCompPL(x):
            if x > _PrimeList[-1]:
                _PrimeList.append(x)
            if n % x == 0:
                return x
    return False

def isCompSR(n):
    """Serialized recursive version."""
    l = [n]
    while (math.sqrt(l[0]) > _PrimeList[-1]):
        l.insert(0, int(math.sqrt(l[0])))
    l.insert(0, _PrimeList[-1] + 1)
    while (len(l) > 2):
        q = l.pop(0)
        for x in range(q, l[0]):
            for y in _PrimeList:
                if x % y == 0:
                    break
            else:
                _PrimeList.append(x)
    return isCompPL(n)
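For concreteness, the straight-loop version returns the smallest divisor it finds, or False when n is prime (repeated here self-contained with its import):

```python
import math

def isComposite(n):
    """Straight loop: return the smallest divisor > 1, or False if n is prime."""
    for x in range(2, int(math.sqrt(n)) + 1):
        if n % x == 0:
            return x
    return False

print(isComposite(9))   # -> 3 (smallest divisor found)
print(isComposite(49))  # -> 7
print(isComposite(7))   # -> False (7 is prime)
```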

isComposite(n) is orders of magnitude faster than either isCompPL(n) or isCompSR(n) when _PrimeList starts as [2]. When _PrimeList already contains all the primes up to the square root of n, both isCompPL(n) and isCompSR(n) catch up and may edge out isComposite(n), but not significantly. More striking to me: after calling isCompSR(511111111111), a subsequent call isCompSR(1111111111111) is still much slower than calling isComposite(1111111111111), even though _PrimeList was not cleared between the two calls.

Is there something hidden about the list operations in Python that makes these attempts not successful in terms of optimization, or is it just that I'm adding a lot of extra prime testing and I need to reduce that portion somehow?

abiessu
  • If you've already realized you're adding a lot of prime testing, I think you know the answer to your question... – alexis Jul 16 '18 at 14:21
  • @alexis: not exactly. The "same" code in Lisp does not have this disparity in runtimes... – abiessu Jul 16 '18 at 14:22
  • I see... Well, I think the next step should be to try [line profiling](https://stackoverflow.com/questions/3927628/how-can-i-profile-python-code-line-by-line) on your code. – alexis Jul 16 '18 at 14:24
  • @alexis: thank you, I will start there. – abiessu Jul 16 '18 at 14:25
  • FWIW, one certain difference between Python and Lisp is that Python does not do tail call optimisation (of which tail recursion optimisation is the special case). – alexis Jul 16 '18 at 14:25
  • I'll add just one additional point that if `n` is prime, these same timing issues persist from my testing thus far. The examples listed are not prime and thus demonstrate that many unnecessary primes get tested in the latter two functions, but these functions remain slow in the important cases as well. – abiessu Jul 16 '18 at 15:46
  • You are `.insert`ing and `.pop`ing from the beginning of the list. These are linear-time operations in Python; in Lisp they are constant time. The lists in Python and Lisp are totally different. – juanpa.arrivillaga Jul 16 '18 at 15:55
  • @juanpa.arrivillaga: that sounds like my main problem in this case then. Is there a better way to do the "serialized recursion" approach I'm attempting, or is there some other method that is preferred in Python? – abiessu Jul 16 '18 at 16:10
  • @abiessu use a different data structure. Try it with a `collections.deque` and its `.appendleft` and `.popleft` methods, which are constant time. `deque`s do have their own trade-offs, of course. – juanpa.arrivillaga Jul 16 '18 at 18:09
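A minimal sketch of the deque-based approach suggested in that comment (illustrative values only, not the prime-testing code itself):

```python
from collections import deque

# deque gives O(1) appends/pops at BOTH ends, unlike list,
# where insert(0, x) and pop(0) each shift every element (O(n)).
d = deque([5])
d.appendleft(3)      # constant-time analogue of l.insert(0, 3)
d.appendleft(2)
first = d.popleft()  # constant-time analogue of l.pop(0)
print(first, list(d))  # -> 2 [3, 5]
```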

1 Answer


Both primary commenters (@alexis, @juanpa.arrivillaga) had correct assessments of the code I wrote (evaluating too many numbers, and using the list data type in an inefficient manner), and with improvements on both fronts the updated "serialized recursion" function is much faster overall, and significantly faster than the "straight loop" version once _PrimeList has been initialized. The current version of isCompSR(n) looks like this:

def isCompSR(n):
    """Serialized recursive version."""
    for x in _PrimeList:
        if n % x == 0:
            if n == x:
                return False  # n is itself a known prime
            return x
    l = [n]
    while math.sqrt(l[-1]) > _PrimeList[-1]:
        l.append(int(math.sqrt(l[-1])))
    l.append(_PrimeList[-1] + 1)
    while len(l) > 2:
        q = l.pop()
        for x in range(q, l[-1]):
            if n % x == 0:
                return x
            for y in _PrimeList:
                if x % y == 0:
                    break
            else:
                _PrimeList.append(x)
    # range() excludes its upper bound, so the last candidate
    # x = int(math.sqrt(n)) must still be tested (e.g. n = 49)
    x = int(math.sqrt(n))
    if x > 1 and n % x == 0:
        return x
    return False

Note that this code now "drops out" at the first prime which divides n rather than continuing to process all the numbers up to math.sqrt(n) as before. Also, the list l is now used more appropriately, with .append and .pop() (constant time at the end of the list) instead of inserts and pops at the beginning (linear time).
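A rough illustration of why the end-of-list operations matter (the list size and absolute timings are arbitrary and machine-dependent, but the ordering is stable):

```python
import timeit

n = 20000

def front_ops():
    l = []
    for i in range(n):
        l.insert(0, i)   # shifts all existing elements: O(n) each
    while l:
        l.pop(0)         # shifts all remaining elements: O(n) each

def back_ops():
    l = []
    for i in range(n):
        l.append(i)      # amortized O(1)
    while l:
        l.pop()          # O(1)

t_front = timeit.timeit(front_ops, number=1)
t_back = timeit.timeit(back_ops, number=1)
print(t_front > t_back)  # -> True: the front-of-list version is markedly slower
```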

abiessu