
I need to find an optimal selection of media based on certain constraints. I am doing it with FOUR nested for loops, and since that takes about O(n^4) iterations it is slow. I have been trying to make it faster, but it is still damn slow. My variables can go as high as a couple of thousand.

Here is a small example of what I am trying to do:

    max_disks = 5
    max_ssds = 5
    max_tapes = 1
    max_BR    = 1
    allocations = []
    for i in range(max_disks):
        for j in range(max_ssds):
            for k in range(max_tapes):
                for l in range(max_BR):
                    allocations.append((i, j, k, l))  # just for the example; in the actual program I do processing here, like checking bandwidth and cost constraints, and choosing the allocation based on that

It wasn't slow for up to hundreds of each media type but would slow down for thousands.

The other way I tried is:

    max_disks = 5
    max_ssds = 5
    max_tapes = 1
    max_BR    = 1

    allocations = [(i,j,k,l) for i in range(max_disks) for j in range(max_ssds) for k in range(max_tapes) for l in range(max_BR)]

This version is slow even for such small numbers.

Two questions:

  1. Why is the second one slow for small numbers?
  2. How can I make my program work for big numbers (in thousands)?

Here is the version with `itertools.product`:

    import itertools

    max_disks = 500
    max_ssds = 100
    max_tapes = 100
    max_BR    = 100
    for i, j, k, l in itertools.product(range(max_disks), range(max_ssds), range(max_tapes), range(max_BR)):
        pass

It takes 19.8 seconds to finish with these numbers.

  • The example with a list comprehension is *faster* than the one with nested loops and `append`. They are otherwise equivalent, but the `allocations.append` attribute lookup and subsequent method call slow down the nested loop. You probably want to look at `itertools.product()` here instead and avoid creating a huge list object with all possible combinations (process the items one by one instead). – Martijn Pieters Sep 19 '16 at 08:58
  • I tried itertools.product() too. But that also did not work for thousands. –  Sep 19 '16 at 09:01
  • Do you insist on building a list of allocations? You already know the general structure of the list you're building, so can't you process the allocations individually? – Nelewout Sep 19 '16 at 09:08
  • Adding to the list was just for the example. Actually, in the innermost loop I do operations like checking the bandwidth the allocation provides, and deciding whether to keep the allocation or not. –  Sep 19 '16 at 09:13
  • @Pretty: 'did not work' tells us nothing however. `itertools.product()` is faster still than either method you posted, and unbeatable if you don't materialise everything into a list first. Perhaps you want to show *what else* you are doing with the combinations you produce? Why do you need to produce this large a number of tuples in a list in the first place? – Martijn Pieters Sep 19 '16 at 09:33

2 Answers


From the comments, I gather that you're working on a problem that can be rewritten as an ILP (integer linear program). You have several constraints and need to find a (near-)optimal solution.

Now, ILPs are quite difficult to solve, and brute-forcing them quickly becomes intractable (as you've already witnessed). This is why industrial solvers rely on some very clever algorithms (branch and bound, cutting planes) rather than enumeration.

For Python, there are quite a few interfaces that hook up to modern solvers; for more details, see e.g. this SO post. You could also consider a general-purpose optimizer like `scipy.optimize`, but those routines generally don't do integer programming.
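
As a rough illustration only (the cost and bandwidth coefficients below are invented, since the real model isn't shown in the question), this is approximately what the allocation problem could look like in PuLP, one such solver interface; its bundled CBC solver handles bounds in the thousands without enumerating combinations:

    # pip install pulp
    from pulp import LpProblem, LpMinimize, LpVariable, LpStatus, value

    prob = LpProblem("media_allocation", LpMinimize)

    disks = LpVariable("disks", lowBound=0, upBound=2000, cat="Integer")
    ssds  = LpVariable("ssds", lowBound=0, upBound=2000, cat="Integer")
    tapes = LpVariable("tapes", lowBound=0, upBound=100, cat="Integer")
    br    = LpVariable("blu_rays", lowBound=0, upBound=100, cat="Integer")

    # Objective: minimise total cost (placeholder coefficients).
    prob += 50 * disks + 120 * ssds + 25 * tapes + 5 * br

    # Placeholder bandwidth and capacity requirements.
    prob += 150 * disks + 500 * ssds + 20 * tapes + 10 * br >= 100000
    prob += 4 * disks + 1 * ssds + 8 * tapes + 1 * br >= 500

    prob.solve()
    print(LpStatus[prob.status], value(disks), value(ssds), value(tapes), value(br))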

Nelewout

Doing any operation in Python a trillion times is going to be slow. However, that's not all you're doing. By attempting to store all the trillion items in a single list you are storing lots of data in memory and manipulating it in a way that creates a lot of work for the computer to swap memory in and out once it no longer fits in RAM.

The way Python lists work is that they allocate some amount of memory to store the items in the list. When that space fills up and the list needs more, Python over-allocates: it reserves a larger block, proportionally bigger than the current size, and copies all the old entries into the new storage. This is fine so long as everything fits in memory: even though it has to copy the contents of the list each time it expands the storage, it does so less and less frequently because the allocation keeps growing proportionally. The problem comes when the list no longer fits in RAM and the operating system has to swap memory out to disk. The next time the list is resized, all the entries that were swapped out have to be reloaded from disk, then swapped back out again to make room for the new entries. This creates lots of slow disk operations that get in the way of your task and slow it down even more.
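
The over-allocation is easy to observe directly (a small demo, not part of the original answer):

    import sys

    lst = []
    last = sys.getsizeof(lst)
    for n in range(64):
        lst.append(n)
        size = sys.getsizeof(lst)
        if size != last:
            # The size jumps in steps: CPython reserved room for several future appends.
            print(f"len={len(lst):2d}  bytes={size}")
            last = size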

Do you really need to store every item in a list? What are you going to do with them when you're done? You could perhaps write them out to disk as you're going instead of accumulating them in a giant list, though if you have a trillion of them, that's still a very large amount of data! Or perhaps you're filtering most of them out? That will help.
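
If the allocations that survive the constraint checks do need to be kept, here is a sketch of writing them straight to a file as they are produced, rather than holding them all in a list (the constraint check is a stand-in, not the question's real logic):

    import csv
    import itertools

    def satisfies_constraints(i, j, k, l):
        return 150 * i + 500 * j >= 2000   # stand-in for the real bandwidth/cost checks

    with open("allocations.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["disks", "ssds", "tapes", "blu_rays"])
        for combo in itertools.product(range(50), range(20), range(5), range(5)):
            if satisfies_constraints(*combo):
                writer.writerow(combo)  # written immediately; nothing accumulates in RAM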

All that said, without seeing the actual program it's hard to know whether you can hope to complete this work by exhaustive search. Can all the variables be on the scale of thousands at once? Do you really need to consider every combination of these variables? When max_disks == 2000, do you really need to distinguish the result for i=1731 from the one for i=1732? For example, perhaps you could consider only values of i in 1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 100, 200, 300, 500, 1000, 2000? Or perhaps there's a mathematical solution instead; are you just counting items?
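
A sketch of that coarse-grid idea (the candidate lists and the cost/bandwidth numbers are illustrative, not taken from the question):

    import itertools

    # Search only a handful of candidate values per medium instead of every integer.
    candidate_disks = [1, 2, 3, 4, 5, 10, 20, 30, 40, 50, 100, 200, 300, 500, 1000, 2000]
    candidate_ssds  = [1, 2, 3, 4, 5, 10, 20, 50, 100, 200, 500, 1000, 2000]
    candidate_tapes = [0, 1, 2, 5, 10]
    candidate_br    = [0, 1, 2, 5, 10]

    best, best_cost = None, float("inf")
    for i, j, k, l in itertools.product(candidate_disks, candidate_ssds,
                                        candidate_tapes, candidate_br):
        if 150 * i + 500 * j + 20 * k + 10 * l < 100000:   # illustrative bandwidth requirement
            continue
        cost = 50 * i + 120 * j + 25 * k + 5 * l           # illustrative cost model
        if cost < best_cost:
            best, best_cost = (i, j, k, l), cost

    print(best, best_cost)

That is a few thousand combinations instead of trillions, and the best coarse solution can then be refined locally if needed.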

Weeble