1

I have 6 variables with different ranges. I want to create possibilities pool with my code. In this example i gave 10 range for every variable but i have to give them about 200 range. But whenever i'm trying to exceed 20 range (for example 30 range) Python kills itself, and sometimes it freezes computer. Is there anyway to make it faster and stable?

Thanks.

import itertools

a = [x for x in range(400,411)]
b = [x for x in range(400,411)]
c = [x for x in range(400,411)]
d = [x for x in range(400,411)]
e = [x for x in range(400,411)]
f = [x for x in range(400,411)]

fl = lambda x: x

it = filter(fl, itertools.product(a,b,c,d,e,f))

posslist = [x for x in it]

print(len(posslist))
PM 2Ring
  • 54,345
  • 6
  • 82
  • 182
  • What is `fl = lambda x: x` supposed to be doing? Also you do realise just how many products you are making? – Padraic Cunningham Oct 16 '16 at 12:34
  • 3
    `200**6` is 64 trillion. Of course it's going to take a long time, even if you get rid of the obvious inefficiencies in your current code. – PM 2Ring Oct 16 '16 at 12:36
  • 1
    I don't think you can expect it to work well with 200 elements in each of the six lists. If you leave the `it` as a generator, not a list, you will at least be able to continue. – Bendik Oct 16 '16 at 12:36
  • *“Python kills itself, and sometimes it freezes computer”* – It doesn’t kill itself. It is *working*. You have to let it do its job—which obviously can take a long while. Unless your filter `fl` is (a lot) more complex than what you shown, you could probably calculate the number of combinations that match your condition. – poke Oct 16 '16 at 12:38
  • FWIW, running `print(sum(1 for t in product(range(20), repeat=6)))` on Python 3.6 takes about 44 seconds to print 64000000 on my old 2GHz 32 bit machine. – PM 2Ring Oct 16 '16 at 12:44
  • @PM2Ring And 44 seconds should be multiplied with 1 million for range(200), right? – ayhan Oct 16 '16 at 12:48
  • PadraicCunningham i left it there because i would add something further. PM2Ring i know it's gonna take days, maybe weeks. i want to make it faster also. poke actually it kills itself 'Process finished with exit code 137 (interrupted by signal 9: SIGKILL)'. Bendik i'll try generator thanks. – Korhan Yüzbaş Oct 16 '16 at 12:48
  • @PM2Ring it takes 14 seconds in my computer. but still freezes with 30 range – Korhan Yüzbaş Oct 16 '16 at 12:51
  • 1
    @ayhan It will be slower than that, because most of the sums will be over the small integer limit, so the arithmetic will have to use Python's arbitrary precision integers. But otherwise, you are correct. And 44 million seconds is is a little over 509 days. – PM 2Ring Oct 16 '16 at 12:53
  • What version of Python are you using? Some of the functions you're calling behave differently in Python 2 vs Python 3. – PM 2Ring Oct 16 '16 at 12:54
  • @PM2Ring i'm using Python3.5 – Korhan Yüzbaş Oct 16 '16 at 12:56
  • does my approach sound good? If not please suggest how can I improve my answer – kmario23 Oct 17 '16 at 00:33

3 Answers3

5

There are 6 lists of 11 elements each: [400, 401, 402, 403, 404, 405, 406, 407, 408, 409, 410].

A cartesian product of 6 such lists is a list of 116 tuples of 6 integers each (the first tuple is: (400, 400, 400, 400, 400, 400)).

The size of each tuple is 6*8 bytes in 64-bit Python*.

So the total size of posslist is 6 * 8 * 116 = 81 GB!

Do you have enough RAM for that? Probably not, so the OS is going to start swapping RAM, which is extremely slow. Therefore, in addition to calculating 81 GB of data, thecomputer will have to constantly swap data from RAM to HDD and back, so it will do the job even slower.


* Note that while it is half that size in a 32-bit Python, a 32-bit Python cannot address enough memory at all

zvone
  • 18,045
  • 3
  • 49
  • 77
0

Use the cartesian_product function from here  for an almost 60x speed up from itertools.product()

def cartesian_product2(arrays):

    la = len(arrays)
    arr = np.empty([len(a) for a in arrays] + [la])
    for i, a in enumerate(np.ix_(*arrays)):
        arr[...,i] = a

    return arr.reshape(-1, la)
Community
  • 1
  • 1
R. S. Nikhil Krishna
  • 3,962
  • 1
  • 13
  • 27
0

Maybe instead of doing all of it one-go, you can try doing the product in successive steps. For example,

import itertools as itt

a = [1, 2, 3]
b = [4, 5, 6]
c = [7, 8, 9]

fl = lambda x: x

s1 = filter(fl, itt.product(a, b))
s2 = filter(fl, itt.product(s1, c))

print(list(s2))

But then the result would look like the following. You just have to unpack the tuple of tuple to a single tuple.

[((1, 4), 7),
 ((1, 4), 8),
 ((1, 4), 9),
 ((1, 5), 7),
 ((1, 5), 8),
 ((1, 5), 9),
 ((1, 6), 7),
 ((1, 6), 8),
 ((1, 6), 9),
 ((2, 4), 7),
 ((2, 4), 8),
 ((2, 4), 9),
 ((2, 5), 7),
 ((2, 5), 8),
 ((2, 5), 9),
 ((2, 6), 7),
 ((2, 6), 8),
 ((2, 6), 9),
 ((3, 4), 7),
 ((3, 4), 8),
 ((3, 4), 9),
 ((3, 5), 7),
 ((3, 5), 8),
 ((3, 5), 9),
 ((3, 6), 7),
 ((3, 6), 8),
 ((3, 6), 9)]

I also think that this can be made parallelizable where you can do the product of the lists in parallel.

For lists L1, L2, L3, L4, do something like:

th1_res = itertools.product(L1, L2)
th2_res = itertools.product(L3, L4)

thread_final = itertools.product(th1_res, th2_res)
kmario23
  • 57,311
  • 13
  • 161
  • 150
  • Thanks for your answer, but best solution i found yet is using generator instead of list. Your answer will cause RAM problem as well as my code. – Korhan Yüzbaş Oct 17 '16 at 11:51
  • Note that everything is generator in Python 3.5. I'm just casting it to a list just for printing. – kmario23 Oct 17 '16 at 18:06
  • If you check itertools.product on Python 3, you will see that product returns as list. Documentation say this too, that's why they gave an example. You can check my answer. – Korhan Yüzbaş Oct 17 '16 at 22:17