Why is my (newbie) code so slow?

Question

I'm learning python and came across this example of a simulation of a model I've seen before. One of the functions looked unnecessarily long, so I thought it would be good practice to try to make it more efficient. My attempt, while requiring less code, is about 1/60th as fast. Yes, I made it 60 times worse.

My question is, where have I gone wrong? I've tried timing individual parts of the function and don't see where the bottleneck is.

Here's the original function. It's for a model where people live on a grid and their happiness depends on whether they're the same race as most of their neighbors. (It's Schelling's segregation model.) So we give it an x,y coordinate for a person and determine their happiness by checking the race of each of their neighbors.

def is_unhappy(self, x, y):

    race = self.agents[(x,y)]
    count_similar = 0
    count_different = 0

    if x > 0 and y > 0 and (x-1, y-1) not in self.empty_houses:
        if self.agents[(x-1, y-1)] == race:
            count_similar += 1
        else:
            count_different += 1
    if y > 0 and (x,y-1) not in self.empty_houses:
        if self.agents[(x,y-1)] == race:
            count_similar += 1
        else:
            count_different += 1
    if x < (self.width-1) and y > 0 and (x+1,y-1) not in self.empty_houses:
        if self.agents[(x+1,y-1)] == race:
            count_similar += 1
        else:
            count_different += 1
    if x > 0 and (x-1,y) not in self.empty_houses:
        if self.agents[(x-1,y)] == race:
            count_similar += 1
        else:
            count_different += 1        
    if x < (self.width-1) and (x+1,y) not in self.empty_houses:
        if self.agents[(x+1,y)] == race:
            count_similar += 1
        else:
            count_different += 1
    if x > 0 and y < (self.height-1) and (x-1,y+1) not in self.empty_houses:
        if self.agents[(x-1,y+1)] == race:
            count_similar += 1
        else:
            count_different += 1        
    if x > 0 and y < (self.height-1) and (x,y+1) not in self.empty_houses:
        if self.agents[(x,y+1)] == race:
            count_similar += 1
        else:
            count_different += 1        
    if x < (self.width-1) and y < (self.height-1) and (x+1,y+1) not in self.empty_houses:
        if self.agents[(x+1,y+1)] == race:
            count_similar += 1
        else:
            count_different += 1

    if (count_similar+count_different) == 0:
        return False
    else:
        return float(count_similar)/(count_similar+count_different) < self.similarity_threshold

And here is my code, which, as I said, is MUCH slower. I wanted to avoid having all the if statements above by just creating a list of "offsets" to add to each person's coordinates to determine the locations of possible neighbors, check if that is a valid position, and then check the neighbor's race.

def is_unhappy2(self, x, y):
    thisRace = self.agents[(x,y)]
    count_same = 0
    count_other = 0

    for xo, yo in list(itertools.product([-1,0,1],[-1,0,1])):
        if xo==0 and yo==0:
            # do nothing for case of no offset
            next
        else:
            # check if there's a neighbor at the offset of (xo, yo)
            neighbor = tuple(np.add( (x,y), (xo,yo) ))
            if neighbor in self.agents.keys():
                if self.agents[neighbor] == thisRace:
                    count_same += 1
                else:
                    count_other += 1
    if count_same+count_other == 0:
        return False
    else:
        return float(count_same) / (count_same + count_other) < self.similarity threshold

(The rest of the code that creates the class is on the site where the example comes from.)

Here are the timing results:

%timeit s.is_unhappy2(49,42)
100 loops, best of 3: 5.99 ms per loop

%timeit s.is_unhappy(49,42)
10000 loops, best of 3: 103 µs per loop

I'm hoping someone with python knowledge can see immediately what I'm doing wrong without having to get into the nitty-gritty of the rest of the code. Can you see why my code is so much worse than the original?

Is yours slower? 100 loops at 5.99 ms per loop is 599 ms total, 10000 loops at 103 microseconds is 1030 ms total. Perhaps I'm confused by the timing results? — ryan, May 06 '15 at 20:30
@ryan The time per loop is what matters, not the overall time spent running the benchmark. The number of iterations just gives you an idea of a confidence interval. — John Kugelman, May 06 '15 at 21:04
@TylerH With 9 upvotes and no close votes until now, it seems like this is an acceptable question on Stack Overflow. Some questions can be on-topic on both sites. It is also not a very good Code Review question as it includes a bit about understanding some other code. — Simon Forsberg, May 07 '15 at 20:32
*Sigh*, then please use an actual close reason. Being on topic somewhere else doesn't make it off topic here. Saying it belongs somewhere else is not a close reason @TylerH. — RubberDuck, May 07 '15 at 20:36
@RubberDuck Unfortunately we are not capable of changing our CV reasons even in light of new information, so I'm unable to do that. — TylerH, May 07 '15 at 20:37
You can depend on people to say, from eyeballing the code, that "the problem is Foo". They might be right. The probability they are right is something, perhaps as much as 20%. Can you depend on that? The way to find out what takes time is not eyeballing, but [*this method*](http://stackoverflow.com/a/4299378/23771), and it's never wrong. — Mike Dunlavey, May 13 '15 at 18:22

Stefan Pochmann · Accepted Answer · 2015-05-07T14:59:58.073

Don't use np.add, just use neighbor = (x+xo, y+yo). That should make it much faster (10x faster in my little test).

You can also...

ask if neighbor in self.agents: without the .keys()
leave out the list
check xo or yo and not have an empty if-block
avoid the double-lookup of neighbor in self-agents

Result:

for xo, yo in itertools.product([-1,0,1],[-1,0,1]):
    if xo or yo:
        neighbor = self.agents.get((x+xo, y+yo))
        if neighbor is not None:
            if neighbor == thisRace:
                count_same += 1
            else:
                count_other += 1

And you can add

self.neighbor_deltas = tuple(set(itertools.product([-1,0,1],[-1,0,1])) - {(0, 0)})

to the class initializer and then your function can just use those pre-computed deltas:

for xo, yo in self.neighbor_deltas:
    neighbor = self.agents.get((x+xo, y+yo))
    if neighbor is not None:
        if neighbor == thisRace:
            count_same += 1
        else:
            count_other += 1

Congrats for deciding to improve that author's ridiculously repetitive code, btw.

score 3 · Answer 2 · answered May 06 '15 at 21:03

3

The culprit looks to be this line:

neighbor = tuple(np.add( (x,y), (xo,yo) ))

Changing it to this shows a huge speedup:

neighbor = (x + xo, y + yo)

answered May 06 '15 at 21:03

John Kugelman

349,597
67
533
578

Why is my (newbie) code so slow?

2 Answers2