itertools product speed up

Question

I use itertools.product to generate all possible variations of 4 elements of length 13. The 4 and 13 can be arbitrary, but as it is, I get 4^13 results, which is a lot. I need the result as a Numpy array and currently do the following:

  c = it.product([1,-1,np.complex(0,1), np.complex(0,-1)], repeat=length)
  sendbuf = np.array(list(c))

With some simple profiling code shoved in between, it looks like the first line is pretty much instantaneous, whereas the conversion to a list and then Numpy array takes about 3 hours. Is there a way to make this quicker? It's probably something really obvious that I am overlooking.

Thanks!

You probably would gain a lot by implementing product for Numpy arrays... isn't there such a thing? Edit: see http://stackoverflow.com/questions/1208118 for a start. — TryPyPy, Jan 17 '11 at 02:29
How about using [`fromiter()`](http://docs.scipy.org/doc/numpy/reference/generated/numpy.fromiter.html#numpy.fromiter)? `sendbuf = np.fromiter(c, np.complex)` — Jeff Mercado, Jan 17 '11 at 02:32
the first line is instantaneous because it's just creating an iterator and doesn't actually generate the sequences. The sequences are created in the second line and generating 4^13 element will be slow. — kefeizhou, Jan 17 '11 at 02:36
What is the actual problem you're trying to solve? There's most likely solutions that won't require you to generating all 4^13 sequences. — kefeizhou, Jan 17 '11 at 02:43
Do you require a 13 by 4^13 2D array, or a 1D array of python tuple objects? — Paul, Jan 17 '11 at 05:35
I will look at fromiter, thanks! I also get the generator part and figured that's why it was so fast. As far as the actual problem, I do need all those combinations. Thanks though! And lastly, yes, I need a 4^13 by 13 array, though I reshape it along the way due to the way MPI works. — Christoph, Jan 17 '11 at 14:42
@kefeizhou just to clarify, yes, I need all those combinations. They are at the heart of my computation. Thanks though! — Christoph, Jan 17 '11 at 18:46

Sven Marnach · Accepted Answer · 2011-01-17T17:04:26.523

21

The NumPy equivalent of itertools.product() is numpy.indices(), but it will only get you the product of ranges of the form 0,...,k-1:

numpy.rollaxis(numpy.indices((2, 3, 3)), 0, 4)
array([[[[0, 0, 0],
         [0, 0, 1],
         [0, 0, 2]],

        [[0, 1, 0],
         [0, 1, 1],
         [0, 1, 2]],

        [[0, 2, 0],
         [0, 2, 1],
         [0, 2, 2]]],


       [[[1, 0, 0],
         [1, 0, 1],
         [1, 0, 2]],

        [[1, 1, 0],
         [1, 1, 1],
         [1, 1, 2]],

        [[1, 2, 0],
         [1, 2, 1],
         [1, 2, 2]]]])

For your special case, you can use

a = numpy.indices((4,)*13)
b = 1j ** numpy.rollaxis(a, 0, 14)

(This won't run on a 32 bit system, because the array is to large. Extrapolating from the size I can test, it should run in less than a minute though.)

EIDT: Just to mention it: the call to numpy.rollaxis() is more or less cosmetical, to get the same output as itertools.product(). If you don't care about the order of the indices, you can just omit it (but it is cheap anyway as long as you don't have any follow-up operations that would transform your array into a contiguous array.)

EDIT2: To get the exact analogue of

numpy.array(list(itertools.product(some_list, repeat=some_length)))

you can use

numpy.array(some_list)[numpy.rollaxis(
    numpy.indices((len(some_list),) * some_length), 0, some_length + 1)
    .reshape(-1, some_length)]

This got completely unreadable -- just tell me whether I should explain it any further :)

edited Jan 17 '11 at 17:04

answered Jan 17 '11 at 15:29

Sven Marnach

574,206
118
941
841

Hmm. I'd have to reshape though. But that's probably still faster. Thanks! Let me play with that a bit. – Christoph Jan 17 '11 at 15:50
2

@Christoph: Reshaping is cheap, as it doesn't change the data at all. – Sven Marnach Jan 17 '11 at 15:52
Just to clarify, the call to rollaxis is needed, isn't it? Otherwise I end up with a bunch of arrays that are all the same, 0..k. To get all the different combinations, I have to roll the axes. – Christoph Jan 17 '11 at 16:30
I have tried this and end up with the result after 3 minutes. That's almost 2 orders of magnitude faster! Thanks! I trust the results are the same as itertools, though I haven't formally tested it. In principle, they should be though. Thanks again! – Christoph Jan 17 '11 at 16:40
1

@Christoph: No, the call is not strictly necessary, it only brings the array in exactly the same shape as your original version. I don't know what you are going to do with the permutations, so I can't give advice how to do it. But I personally wouldn't bother using `rollaxis()` or `reshape()` at all, but rather use the `indices()` output as is. For example, you can iterate over all permutations using `for perm in izip(*(a.ravel() for a in numpy.indices(...)))`. Most probably you can also make do without reshaping for your application. – Sven Marnach Jan 17 '11 at 16:42
1

@Christoph - They are exactly the same results, even in the same order, for what it's worth. – Joe Kington Jan 17 '11 at 16:43
Hm. I get duplicates then or something. If I do e= np.indices((4,)*4), and then print e, I get [0,1,2,3] repeated many times over. Of course 4^4=256, and the result e has 1024 elements, so I get too many back. – Christoph Jan 17 '11 at 16:52
I should clarify: I get [0,1,2,3] repeated many times, but there are other elements. Just lots of duplicate [0,1,2,3]'s. – Christoph Jan 17 '11 at 16:55
1

@Christoph: If `a = numpy.indices((4,)*4)`, then `a[0]` will contain all the values for the first "index", `a[1]` will be the values for the second "index", and so on. For every `n` in the range 0 to 4^4-1, `a[0].flat[n], a[1].flat[n], a[2].flat[n], a[3].flat[n]` will be a valid permutation, with no duplicates. If you get confused by the ordering of the results, just use `rollaxis()`, but you don't have to. – Sven Marnach Jan 17 '11 at 17:01
Alright, my comment was quite nonsensical. 4^4=256 ELEMENTS, with each element being another array of size 4. My apologies, it works out just fine, and I get it now. You guys rock! – Christoph Jan 17 '11 at 17:21
@Sven Marnach: I have normal list**(contains strings not numbers)** not having numpy. And I am using itertools. Is it good to use itertools.product() instead of numpy.indices()? Or I have to convert my list into numpy to make cartesian product faster? – iNikkz Feb 26 '15 at 07:23
@iNikkz: It's not clear to me what you are trying to achieve. In general, Numpy won't buy you anything when dealing with lists of strings. – Sven Marnach Mar 02 '15 at 14:19
You can also use `numpy.stack(numpy.indices((2, 3, 3)), axis=-1)` instead of `rollaxis`. (It's a bit cleaner, and you won't need to change the axis if the dimensions change.) – Chris Nolet Aug 08 '17 at 21:43

score 6 · Answer 2 · edited May 23 '17 at 12:00

The first line seems instantaneous because no actual operation is taking place. A generator object is just constructed and only when you iterate through it as the operating taking place. As you said, you get 4^13 = 67108864 numbers, all these are computed and made available during your list call. I see that np.array takes only list or a tuple, so you could try creating a tuple out of your iterator and pass it to np.array to see if there is any performance difference and it does not affect the overall performance of your program. This can be determined only by trying for your usecase though there are some points which say tuple is slightly faster.

To try with a tuple, instead of list just do

sendbuf = np.array(tuple(c))

Thanks Senthil. If nothing else works, I will try with tuples. — Christoph, Jan 17 '11 at 16:35

Eric O. Lebigot · Answer 3 · 2011-01-26T22:06:24.333

6

You could speed things up by skipping the conversion to a list:

numpy.fromiter(c, count=…)  # Using count also speeds things up, but it's optional

With this function, the NumPy array is first allocated and then initialized element by element, without having to go through the additional step of a list construction.

PS: fromiter() does not handle the tuples returned by product(), so this might not be a solution, for now. If fromiter() did handle dtype=object, this should work, though.

PPS: As Joe Kington pointed out, this can be made to work by putting the tuples in a structured array. However, this does not appear to always give a speed up.

edited Jan 26 '11 at 22:06

answered Jan 17 '11 at 13:36

Eric O. Lebigot

91,433
48
218
260

Haven't tested it yet, but that was precisely what I was envisioning, thanks! – Christoph Jan 17 '11 at 14:41
1

@Christoph: I think that NumPy complains because `fromiter()` tries to create a *1D array* of *tuples* (this function only creates 1D arrays). So, a good choice of type would seem to be `dtype=object`… but NumPy refuses to do this. I'm not sure how to solve this problem (other than lobbying for having the possibility of using `dtype=object` :). – Eric O. Lebigot Jan 17 '11 at 15:19
& @Christoph - You can work around this by using a structured array and then viewing it as a "regular" array. Here's how: http://pastebin.com/tBLkzL64 Unfortunately, it's not any faster than just using a list or a tuple to do it, as you originally were. – Joe Kington Jan 17 '11 at 15:59

score 3 · Answer 4 · answered Dec 05 '15 at 00:00

3

Let numpy.meshgrid do all the job:

length = 13
x = [1, -1, 1j, -1j]
mesh = numpy.meshgrid(*([x] * length))
result = numpy.vstack([y.flat for y in mesh]).T

on my notebook it takes ~2 minutes

answered Dec 05 '15 at 00:00

Alleo

7,891
2
40
30

score 2 · Answer 5 · answered Jan 17 '11 at 15:43

2

You might want to try a completely different approach: first create an empty array of the desired size:

result = np.empty((4**length, length), dtype=complex)

then use NumPy's slicing abilities to fill out the array yourself:

# Set up of the last "digit":
result[::4, length-1] = 1
result[1::4, length-1] = -1
result[2::4, length-1] = 1j
result[3::4, length-1] = -1j

You can do similar things for the other "digits" (i.e. the elements of result[:, 2], result[:, 1], and result[:, 0]). The whole thing could certainly be put in a loop that iterates over each digit.

Transposing the whole operation (np.empty((length, 4**length)…)) is worth trying, as it might bring a speed gain (through a better use of the memory cache).

answered Jan 17 '11 at 15:43

Eric O. Lebigot

91,433
48
218
260

This doesn't appear to give the correct answer (even when you use a loop to do it for each "digit")... If it did, it would be the fastest version so far... Did I understand what you meant correctly? Here's how I implemented this version: http://pastebin.com/VpwXVkQs – Joe Kington Jan 17 '11 at 16:36
@Joe: I meant something different from your implementation, namely that the before-last digits `result[:, length-1]` should be set to [1, 1, 1, 1, -1, -1, -1, -1, 1j, 1j, etc.]. And similarly for the subsequent digits (with repeated sequences of increasing size). Is this clearer? :) – Eric O. Lebigot Jan 17 '11 at 17:25
@Joe: What I was suggesting is essentially what `numpy.indices()` does; the difference is that I was setting coordinates manually, but directly to the values requested in the question. – Eric O. Lebigot Jan 21 '11 at 09:48

score 1 · Answer 6 · answered Jan 17 '11 at 06:53

Probably not optimized but much less reliant on python type conversions:

ints = [1,2,3,4]
repeat = 3

def prod(ints, repeat):
    w = repeat
    l = len(ints)
    h = l**repeat
    ints = np.array(ints)
    A = np.empty((h,w), dtype=int)
    rng = np.arange(h)
    for i in range(w):
        x = l**i
        idx = np.mod(rng,l*x)/x
        A[:,i] = ints[idx]
    return A

itertools product speed up

6 Answers6

Let numpy.meshgrid do all the job:

Linked

Related