How to speed up python loop

Question

I had a look at several dicussions on several sites and none of them gave me a solution. This piece of code takes more than 5 seconds to run :

for i in xrange(100000000):
  pass

I'm working on an integer optimization problem and I have to use an O(n log n) algorithm edit : an O(n²/4) algorithm, where n stands for all the matrix' items, ie, in the following code, n * m = 10000. So, for a matrix 100 * 100 with 10000 elements, it will result in nearly 25000000 iterations .... Its code can be summed up like this :

m = 100
n = 100
for i in xrange(m):
  for j in xrange(n):
    for i2 in xrange(i + 1, m):
      for j2 in xrange(j + 1, n):
        if myarray[i][j] == myarray[i2][j2] and myarray[i2][j] == myarray[i][j2]:
          return [i, j], [i2, j2]

Should I give up with Python and go back to Java or to C ?

I work with Python 2.7 and Psyco isn't available. PyPy doesn't support Tkinter out of the box and I'm using Tkinter.

So, would they improve the looping speed ? Are there other solutions ?

did you try to implement this with numpy arrays if that is possible ? — joaquin, Jan 07 '12 at 15:38
You can speed up your loops all you want, the code above is not O(n log n). — John Percival Hackworth, Jan 07 '12 at 15:44
And pypy would speed things up, but by a factor of 4-5. Such a loop should take less than 0.5 sec on a decent computer when written in c. — s_xavier, Jan 07 '12 at 16:42
It looks like this algorithm is n^2*m^2, and there's not a lot of optimization you can do to speed it up in a particular language. You would benefit *far more* from reexamining your algorithm and seeing if there are any unnecessary iterations. — Makoto, Jan 07 '12 at 18:27
Yes but as Paul McGuire figured out, I'm looking for rectangles and I can't do less iteration... This algorithm would run faster in C or in Java because a loop iteration doesn't need so many CPU cycles like in Python ... — s_xavier, Jan 07 '12 at 18:35
Just to be clear what your asking: Your algorithm is fix and you won't change it, and want to know which language suites your task best? Or do you want a optimized version of your algorithm? — Don Question, Jan 07 '12 at 22:19
It is fixed and I want to know how to make it faster in Python. I starded using cython this afternoon. It is part of a bigger program but the rest of the code doesn't need optimization. — s_xavier, Jan 08 '12 at 00:12

PaulMcG · Answer 1 · 2018-04-08T19:06:46.823

Not the prettiest coding style, but desperate times call for desperate coding. Try turning your nested nested loops into one big generator expression:

try:
    i,j,i2,j2 = ((i,j,i2,j2)
        for i in xrange(m)
          for j in xrange(n)
            for i2 in xrange(i + 1, m)
              for j2 in xrange(j + 1, n)
                if myarray[i][j] == myarray[i2][j2] and 
                    myarray[i2][j] == myarray[i][j2]).next()
    return [i,j],[i2,j2]
except StopIteration:
    return None

Updated to use builtins next and product, and Py3 range instead of xrange:

from itertools import product
match = next(((i,j,i2,j2)
    for i, j in product(range(m), range(n))
        for i2, j2 in product(range(i+1, m), range(j+1, n))
            if myarray[i][j] == myarray[i2][j2] and 
                myarray[i2][j] == myarray[i][j2]), None)
if match is not None:
    i,j,i2,j2 = match
    return [i,j],[i2,j2]
else:
    return None

Can't upvote at the moment, need 5 more reputation points : / — s_xavier, Jan 08 '12 at 15:33

score 13 · Answer 2 · answered Jan 07 '12 at 16:08

13

You can still use the python notation, and have the speed of C, using the Cython project. A first step would be to convert the loop in C loop: it's automatically done by typing all the variables used in the loop:

cdef int m = 100
cdef int n = 100
cdef int i, j, i2, j2
for i in xrange(m):
  for j in xrange(n):
    for i2 in xrange(i + 1, m):
      for j2 in xrange(j + 1, n):

For The next part, it would be even better if that myarray would be pure C array, then no rich python comparaison nor array access. With your python array, you can remove the rich comparaison by doing native comparaison (if you have double in your array):

        cdef double a, b, c, d
        a = myarray[i][j]
        b = myarray[i2][j2]
        c = myarray[i2][j]
        d = myarray[i][j2]

        if a == b and c == d:
          return [i, j], [i2, j2]

You can see how optimization are going by running cython -a yourfile.pyx, then open the yourfile.html generate. You'll see how Cython have optimized your code, and removed Python overhead.

answered Jan 07 '12 at 16:08

tito

12,990
1
55
75

Ok, thanks, I will try cython cause I need a big boost of speed. – s_xavier Jan 07 '12 at 16:43
This page detail the benefits from cython with a problem similar to the one I'm dealing with : http://www.korokithakis.net/posts/optimizing-python-with-cython/ – s_xavier Jan 07 '12 at 18:31
I started using cython and numpy for the arrays. Loops are really faster. But I got a problem : I have function calls that slow down the loops, so I've got a lot of work to do with cython. – s_xavier Jan 08 '12 at 09:35
function call in python are expensive. Learn Cython more, you'll see how awesome it is :) – tito Jan 08 '12 at 10:20

score 2 · Answer 3 · edited May 23 '17 at 12:25

2

Sorry to tell you, but it's not a matter of optimization. No matter which language or implementation you choose, this algorithm is not O(n*log n) in the worst- and average case. You might want to read up on how the Big-O notation works and develop a better algorithm.

edited May 23 '17 at 12:25

Community

1
1

answered Jan 07 '12 at 15:48

Niklas B.

92,950
18
194
224

Ok it is O(n²/4), not O(n log n), but there is no other algorithms and this is an optimisation problem (see edit in the main post for details). – s_xavier Jan 07 '12 at 17:16

PaulMcG · Answer 4 · 2012-01-07T17:52:30.747

1

Sorry the generator expr didn't work. Here is a different scheme that first tallies all like values, and then looks for rectangular groups:

from collections import defaultdict
from itertools import permutations
def f3(myarray):
    tally = defaultdict(list)
    for i,row in enumerate(myarray):
        for j,n in enumerate(row):
            tally[n].append((i,j))

    # look for rectangular quads in each list
    for k,v in tally.items():
        for quad in permutations(v,4):
            # sort quad so that we can get rectangle corners 
            quad = sorted(quad)
            (i1,j1),(i2,j2) = quad[0], quad[-1]

            # slice out opposite corners to see if we have a rectangle
            others = quad[1:3]

            if [(i1,j2),(i2,j1)] == others:
                return [i1,j1],[i2,j2]

edited Jan 07 '12 at 17:52

answered Jan 07 '12 at 17:41

PaulMcG

62,419
16
94
130

thanks, I tryed it and learned some new things about Python syntax, but it didn't run faster than the other things. – s_xavier Jan 07 '12 at 18:00
If you have a sparse array, and you only want matching values that are nonzero, then you can skip all of the permutation testing of the zero cells by preceding the `for quad in permutations(v,4):` statement with `if k == 0: continue`. – PaulMcG Jan 07 '12 at 19:53

score 1 · Answer 5 · edited May 23 '17 at 12:25

I looked several times across your code and if i get it right, your marking your rectangles with two different corner-markers. I'm sorry if my answer is more a comment to clarify your position then a really answer.

Answer-Part:: If your looking for an appropriate algorithm, consider a scan-line algorithm. I found an example here @SO for a "biggest rectangle solution".

My Question to you is, what are you really looking for?

best way to solve the for-loop dilemma
best language for your algorithm
a faster algorithm to find rectangles

I must also point out, that Paul's and your solution produce different results, because Pauls assumes corners are marked by same values and you assume, that corners are marked through two different values!

I took the time and liberty to illustrate it with a ugly c&p script: It compares both algorithm by creating a 2D-field and fills it with string.lowercase chars and turns the corners to uppercase-chars.

from random import choice
from string import lowercase
from collections import defaultdict
from itertools import permutations
from copy import deepcopy

m,n = 20, 20
myarray = [[choice(lowercase) for x in xrange(m)] for y in xrange(n)]

def markcorners(i,j,i2,j2):
    myarray[i][j] = myarray[i][j].upper()
    myarray[i2][j] = myarray[i2][j].upper()
    myarray[i2][j2] = myarray[i2][j2].upper()
    myarray[i][j2] = myarray[i][j2].upper()

def printrect():
    for row in myarray:
        print ''.join(row)
    print '*'*m

def paul_algo():
    tally = defaultdict(list)
    for i,row in enumerate(myarray):
        for j,n in enumerate(row):
            tally[n].append((i,j))

    # look for rectangular quads in each list
    for k,v in tally.items():
        for quad in permutations(v,4):
            # sort quad so that we can get rectangle corners 
            quad = sorted(quad)
            (i1,j1),(i2,j2) = quad[0], quad[-1]

            # slice out opposite corners to see if we have a rectangle
            others = quad[1:3]

            if [(i1,j2),(i2,j1)] == others:
                markcorners(i1,j1,i2,j2)

def xavier_algo():   
    for i in xrange(m):
        for j in xrange(n):
            for i2 in xrange(i + 1, m):
                for j2 in xrange(j + 1, n):
                    if myarray[i][j] == myarray[i2][j2] and myarray[i2][j] == myarray[i][j2]:
                        markcorners(i,j,i2,j2)

savearray = deepcopy(myarray)
printrect()

xavier_algo()
printrect()

myarray = deepcopy(savearray)
paul_algo()
printrect()

Hi. Thanks for your answer. I need the opposite corners values to be different. A better algorithm would be nice. I will have a look at the one you point out tommorow. Got to sleep now cause it's kinda late here. — s_xavier, Jan 08 '12 at 00:32
I had a look at the algo you suggested. Well thought, but it doesn't fit my needs which are particular. I don't need to find the biggest rectangle but I have to find all of them with opposite corners that got the same value. It is used in a neighbourhood function in a tabu search for a discrete optimization problem. — s_xavier, Jan 08 '12 at 09:32

How to speed up python loop

5 Answers5

Linked