I'm using heapq for some calculations, but for the problem I'm working on it runs slowly because the heap gets very large.

I thought I had a way to speed it up: rather than keeping one giant heap, make it a heap of heaps. Surprisingly, though, the "more efficient" code is significantly slower. There is a bit more overhead in that version, but I really thought it would win by a lot. Having stripped the problem down, I have two functions that perform the same net calculation: f1 is the naive (and faster) version, and f2 is the "improved" (but slower) version. Both do some random number generation, but they use the same seed, so they really do compute the same thing.
import random
import heapq

def f1():
    random.seed(1)
    # One flat heap of plain float values.
    Q = [0]
    while Q:
        value = heapq.heappop(Q)
        # print(value)
        if value < 0.5:
            # Replace the popped value with 16 slightly larger children.
            for counter in range(16):
                heapq.heappush(Q, value + 0.1 + random.random() / 1000)
    print(value)
def f2():
    random.seed(1)
    # Q is a heap of heaps: each element is itself a list kept in heap order,
    # and Q orders those lists by their smallest (first) element.
    Q = [[0]]
    while Q:
        subQ = heapq.heappop(Q)
        value = heapq.heappop(subQ)
        # print(value)
        if subQ:
            heapq.heappush(Q, subQ)
        if value < 0.5:
            # Collect the 16 children in their own small heap and push it onto Q.
            newQ = []
            for counter in range(16):
                newQ.append(value + 0.1 + random.random() / 1000)
            heapq.heapify(newQ)
            heapq.heappush(Q, newQ)
    print(value)
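
For anyone who wants to reproduce the comparison, here is a minimal timing sketch (the time_it helper is just for illustration, not part of the code above; it discards stdout so the print calls don't skew the measurement):

import io
import time
from contextlib import redirect_stdout

def time_it(fn):
    # Run fn once with its output discarded and return the elapsed time.
    with redirect_stdout(io.StringIO()):
        start = time.perf_counter()
        fn()
        return time.perf_counter() - start

print("f1:", time_it(f1))
print("f2:", time_it(f2))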
Why does the heap of heaps (f2) run significantly slower? It should call heappush the same number of times and heappop twice as many times, but each individual heap should be much smaller, so I expected it to run faster.
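
To double-check that claim about call counts, a quick instrumentation sketch can count the pushes and pops (the counted wrapper here is hypothetical, just for measurement; reset or re-patch before timing the other function):

import heapq

def counted(fn):
    # Wrap fn so that every call increments a counter on the wrapper.
    def wrapper(*args):
        wrapper.calls += 1
        return fn(*args)
    wrapper.calls = 0
    return wrapper

# Monkeypatch the module-level functions before running one of the tests.
heapq.heappush = counted(heapq.heappush)
heapq.heappop = counted(heapq.heappop)
f1()
print("pushes:", heapq.heappush.calls, "pops:", heapq.heappop.calls)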