Why does the PyPy implementation run much slower than python?

Question

I'm writing a BFS-based code where the closest distance from the boxes marked 1 have to be calculated. The distance of the boxes marked 1 are 0.

This is a problem from Google kickstart 2019 round A: Parcels.

Link: https://codingcompetitions.withgoogle.com/kickstart/round/0000000000050e01/000000000006987d

import time 
def multibfs (office, r, c):

    #office is a 2-D list with 0's and 1's
    visited = [[False]*c for x in range (r)] 
    values = [[1e9]*c for x in range (r)] #closest distance

    queue = [] #Bfs queue

    for i in range (r): 
        for j in range (c): 
            if office[i][j]==1: 
                visited[i][j]=True 
                values[i][j]=0 
                queue.append([i,j]) 

    while queue: 
        s = queue.pop(0)
        iPos = s[0]
        jPos = s[1]

        for i in range (iPos-1, iPos+2):
            if i in range (0, r): 
                if visited[i][jPos] == False:
                    queue.append([i, jPos])
                    visited[i][jPos] = True
                    values[i][jPos] = values[iPos][jPos] + 1

        for j in range (jPos-1, jPos+2):
            if j in range (0, c):
                if visited[iPos][j] == False:
                    queue.append([iPos, j])
                    visited[iPos][j] = True
                    values[iPos][j] = values[iPos][jPos] + 1

    return values 

tic=time.time() 
o = [[1]*250 for x in range (250)]
val = multibfs(o, 250, 250) 
print(time.time()-tic)

for r and c equal to 250, and office having only 1's, the execution time for python3.6 is roughly 0.7s. I expected the PyPy implementation to be much faster, but it takes much longer

Does it return a result a eventually? If it doesn't, then I would say it is simply a bug in PyPy. Also, keep in mind that in order to reap the benefits of dynamic adaptive optimizations such as employed in PyPy, the code usually needs to run at least a couple of seconds, so even if PyPy did not have this problem, the results still wouldn't tell you much. — Jörg W Mittag, Jun 16 '19 at 06:32
Note that writing benchmarks is really hard. Writing benchmarks is a discipline of its own, and the people who write benchmarks usually do nothing else. And even they occasionally get it wrong. You should make sure you read and understand, for example https://stackoverflow.com/a/513259/2988 and https://groups.google.com/forum/#!msg/mechanical-sympathy/icNZJejUHfE/BfDekfBEs_sJ before writing a benchmark, also some basic knowledge of statistics helps. (Note, those links are for HotSpot, but most of it applies to any modern high-performance VM with dynamic adaptive optimizations.) — Jörg W Mittag, Jun 16 '19 at 06:43
Can you double-check the code you posted? For me, running ``multibfs(None, 250, 250)`` (the first argument is not used) finishes too fast to measure. Running ``multibfs(None, 9000, 9000)`` runs in roughly 2 seconds on CPython and slightly more on PyPy. — Armin Rigo, Jun 16 '19 at 07:08
@ArminRigo, thank you for pointing out. PS: None is not going to work for the first argument because it's meant to be a 2d list, hence is not subscriptable — hououin kyouma, Jun 16 '19 at 09:08
Why did you change the code? And why is the indentation wrong? Post only code that you have actually tested yourself. There is no point for us to look at code which might or might not exhibit the problem. — interjay, Jun 16 '19 at 09:13
@JörgWMittag this code returns a result in 0.6s on Python3 but takes roughly 11s to return the result in PyPy. — hououin kyouma, Jun 16 '19 at 09:26
@hououinkyouma: Huh? Are you aware that 11s is less than a minute? How can it return the result in 11s and at the same time run more than a minute without returning a result? — Jörg W Mittag, Jun 16 '19 at 10:11
@JörgWMittag, I changed the code a bit after posting the question — hououin kyouma, Jun 16 '19 at 13:01
Note that the bottleneck of this code is almost certainly `s = queue.pop(0)`... — Imperishable Night, Jun 16 '19 at 14:32
Thank you! Now I can run the code and verify that it's an order of magnitude slower on PyPy. I'll investigate. — Armin Rigo, Jun 17 '19 at 10:08
@ArminRigo instead of using pop(0), you can import deque from collections and use popleft(). That worked for me — hououin kyouma, Jun 18 '19 at 05:23
Yes, that's clearly the solution for *you* :-) But for me as implementer of PyPy I'm still wondering why pop(0) is ten times slower than on CPython. — Armin Rigo, Jun 19 '19 at 10:03
@ArminRigo I think pop(0) is an O(n) implementation in PyPy as opposed to the O(1) implementation of pop() and popleft() — hououin kyouma, Jun 19 '19 at 14:51
No, you were doing ``pop(0)`` on a list (not a deque) object. This is O(n) in CPython and PyPy. Of course ``popleft()`` on a deque is O(1) in CPython and PyPy. (The fact that your code becomes hundred of times faster, on CPython or PyPy, if we replace the list with a deque shows that clearly.) — Armin Rigo, Jun 20 '19 at 15:32

Why does the PyPy implementation run much slower than python?

0 Answers0