1

I'm trying to generate a list of number sequences using python, like below.

[0,0,0,0] [0,0,0,1] [0,0,0,2] [0,0,1,1] [0,0,1,2] [0,0,2,2] [0,1,1,1]
[0,1,1,2] [0,1,2,2] [0,2,2,2] [1,1,1,1] [1,1,1,2] ... [2,2,2,2]

Right now, I can do this using pure python with a recursive call, but it takes a lot of time for a single run (a few hours). I wonder if it's possible to do this with numpy and save a huge amount of time, and if yes, how?

Praveen
  • 6,872
  • 3
  • 43
  • 62
X.Z
  • 1,018
  • 2
  • 11
  • 16
  • Take time to read this post on how to compose a good SO question: http://stackoverflow.com/help/how-to-ask Hint: Include code you have tried. – garfbradaz Dec 29 '16 at 13:48

2 Answers2

2

do you mean this (or how is the sequence you mean defined)?

from itertools import product

for item in product((0, 1, 2), repeat=4):
    print(item)

this prints:

(0, 0, 0, 0)
(0, 0, 0, 1)
(0, 0, 0, 2)
(0, 0, 1, 0)
(0, 0, 1, 1)
(0, 0, 1, 2)
...
(2, 2, 1, 2)
(2, 2, 2, 0)
(2, 2, 2, 1)
(2, 2, 2, 2)

not sure if that is what you are looking for, but product is included in python.

this should be fast and memory-efficient. the lists are created on demand.

...on second thought: this is probably what you mean, right?

for a, b, c, d in product((0, 1, 2), repeat=4):
    if not a <= b <= c <= d:
        continue
    print(a,b,c,d)

with output:

0 0 0 0, 0 0 0 1, 0 0 0 2, 0 0 1 1, 0 0 1 2, 0 0 2 2, 0 1 1 1, 
0 1 1 2, 0 1 2 2, 0 2 2 2, 1 1 1 1, 1 1 1 2, 1 1 2 2, 1 2 2 2, 
2 2 2 2, 

and now i see how you would want that more efficient...

looks like Praveen's answer provides just that.

Community
  • 1
  • 1
hiro protagonist
  • 44,693
  • 14
  • 86
  • 111
2

What you're looking for is itertools.combinations_with_replacement. From the docs:

combinations_with_replacement('ABCD', 2) AA AB AC AD BB BC BD CC CD DD

Hence:

>>> import itertools as it
>>> list(it.combinations_with_replacement((0, 1, 2), 4))
[(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2),
 (0, 0, 1, 1), (0, 0, 1, 2), (0, 0, 2, 2),
 (0, 1, 1, 1), (0, 1, 1, 2), (0, 1, 2, 2),
 (0, 2, 2, 2), (1, 1, 1, 1), (1, 1, 1, 2),
 (1, 1, 2, 2), (1, 2, 2, 2), (2, 2, 2, 2)]

The best part of this method is that since it returns a generator, you can iterate over it without storing it. This is a huge plus, since it'll save you a lot of memory.

Other implementations and timing

Here are some more implementations, including a numpy implementation. combinations_with_replacement (the try2 function) appears to be the fastest:

import itertools as it
import timeit

import numpy as np

def try1(n, m):
    return [t for t in it.product(range(n), repeat=m) if all(a <= b for a, b in zip(t[:-1], t[1:]))]

def try2(n, m):
    return list(it.combinations_with_replacement(range(n), m))

def try3(n, m):
    a = np.mgrid[(slice(0, n),) * m] # All points in a 3D grid within the given ranges
    a = np.rollaxis(a, 0, m + 1)     # Make the 0th axis into the last axis
    a = a.reshape((-1, m))           # Now you can safely reshape while preserving order
    return a[np.all(a[:, :-1] <= a[:, 1:], axis=1)]

>>> %timeit b = try1(3, 4)
10000 loops, best of 3: 78.1 µs per loop
>>> %timeit b = try2(3, 4)
The slowest run took 8.04 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.66 µs per loop
>>> %timeit b = try3(3, 4)
10000 loops, best of 3: 97.8 µs per loop

This is true even for bigger numbers:

>>> %timeit b = try1(3, 6)
1000 loops, best of 3: 654 µs per loop
>>> %timeit b = try2(3, 6)
The slowest run took 7.20 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 2.33 µs per loop
>>> %timeit b = try3(3, 6)
10000 loops, best of 3: 166 µs per loop

Notes:

  • I was using python3
  • I used this answer for the implementation of try1.
  • I used this answer for the implementation of try3.
Community
  • 1
  • 1
Praveen
  • 6,872
  • 3
  • 43
  • 62