Options to make pypy run faster

Question

I wrote a test code to see how pypy can optimize python code well and run faster. It is a non-in-place quick sort and supposed to run slow enough to make the difference. By simply replacing python with pypy, the result is actually slower from 16 seconds to 25 seconds. I searched a bit and found the opt option, but I can't find a way to apply it to pypy. I'm quite new to python, so help me a bit.

import sys

def sqsort(xxs):
    if len(xxs) == 1 or len(xxs) == 0:
        return xxs
    x = xxs[0]
    xs = xxs[1 :]
    l = []
    g = []
    for x2 in xs:
        if x2 < x:
            l.append(x2)
        if x2 >= x:
            g.append(x2)
    return sqsort(l) + [x] + sqsort(g)

sys.setrecursionlimit(30000)
l = list(reversed(range(15000)))
print(l)
print(sqsort(l))

There is nothing like a magical code optimisation flag. To reduce running time you have to optimise the code. In your case you have change the algorithm in a way to prevent recursion. Python and Pypy are not good at it and that's OK since there it always an iterative way to do it. — Klaus D., Aug 02 '15 at 03:20
@KlausD. The purpose of this code is simply to compare between CPython and PyPy. I wonder why PyPy runs slower. — , Aug 02 '15 at 03:21
@KlausD. I know no compiler exists than can magically optimize the algorithm itself. Still, in this case the jit compiler can see that the operations are only applied to integers and integer lists, so it's possible to remove some of the dynamic checking of types. — , Aug 02 '15 at 03:24
As said above I suspected the recursion. To test that you would need to try an iterative variation of the algorithm. — Klaus D., Aug 02 '15 at 03:25
@KlausD. What is the problem of recursion? Both interpreters deal with the same recursive code, so it's a fair test. — , Aug 02 '15 at 03:27
The problem is overhead. It's well explained in http://stackoverflow.com/questions/3021/what-is-recursion-and-when-should-i-use-it — Klaus D., Aug 02 '15 at 03:31
@KlausD. I know what recursion is. Again, it is a fair test. — , Aug 02 '15 at 03:33
You are testing both interpreters with work they are not good at. And you are getting a result you don't seem to expect. That's an essential lesson in science: the quality of the result depends on the quality of the experiment. — Klaus D., Aug 02 '15 at 03:42
@KlausD. Recursion isn’t the problem here. Indeed, quicksort is inherently recursive, no two ways about it. Also, there’s no reason to think pypy is any worse at recursion than CPython, and neither are particularly *bad* at it (they just don’t optimise it, but this couldn’t be done here anyway). Rather, the problem here is the constant copying and allocating of memory. — Konrad Rudolph, Aug 02 '15 at 11:13

score 1 · Accepted Answer · answered Aug 02 '15 at 11:05

Not a full answer, but indeed the problem is recursion, which PyPy is not good at. Here's the same algorithm rewritten to use recursion only for the shorter of the sublists (either l or g), and iteration for the longer one. This version is still recursive, but the recursion is guaranteed to be limited to O(log(n)) times instead of O(n). It is now 4-5x faster in PyPy.

Note that we can't say that the overall time of this algorithm (in either version) is really O(n log(n)), because it is full of list concatenations which take time too. You can't treat Python's lists like you would treat Haskell's or Lisp's "cons" chained lists; in Python, lists are variable-sized array.

def sqsort(xxs):
    left, right = [], []
    while True:
        if len(xxs) == 1 or len(xxs) == 0:
            return left + xxs + right
        x = xxs[0]
        xs = xxs[1 :]
        l = []
        g = []
        for x2 in xs:
            if x2 < x:
                l.append(x2)
            if x2 >= x:
                g.append(x2)
        if len(l) <= len(g):
            left += sqsort(l) + [x]
            xxs = g
        else:
            right = [x] + sqsort(g) + right
            xxs = l

l = list(reversed(range(15000)))
print(l)
print(sqsort(l))

Why do you think Pypy is worse at recursion than CPython? In fact, while neither specifically optimise (tail) recursion, they handle normal recursion just fine. The problem in the code is more due to the unnecessary copying and allocation, which takes orders of magnitude more time than a recursive call. You’re however also wrong about the asymptotic runtime: while list concatenation is slow, it’s still O(n) and doesn’t change the asymptotic runtime of the overall algorithm. — Konrad Rudolph, Aug 02 '15 at 11:16
@KonradRudolph: maybe I'm wrong, as I didn't look at the produced assembler, but unless proven otherwise I'll stand on my point: excessive recursion is the most likely explanation. It is the main difference between the two versions of the code. The 2nd version is only slightly faster than the 1st in CPython, but the ratio is 10x or 15x in PyPy. — Armin Rigo, Aug 02 '15 at 20:03

Options to make pypy run faster

1 Answers1