0

For example, say that Dimension = 3, and N = 4, I'm trying to generate the following array:

[ [0,0,0], [0,0,1], [0,0,2], [0,0,3], [0,1,0], [0,1,1], [0,1,2], [0,1,3], [1,0,0], ... ... ..., [3,3,3] ].

(basically counting on base N)

Another example, if Dimension = 2 and N = 5, the output would be:

[ [0,0], [0,1], [0,2], [0,3], [0,4], [1,0], [1,1], [1,2], [1,3], [1,4], [2,0], [2,1], ... ... ..., [4,4] ].

I'm currently doing this successfully with the following code:

ParameterSpace = [ [int(i/N**j)%N for j in range(Dimension)][::-1] for i in range(N**Dimension) ]

The problem, however, is that it consumes lot of time and memory when I try for 'big' N and Dimension (in particular I can't get an answer when N = 16 and Dimension = 7 because it blows my memory of 16Gb)

So I was wondering if there is a more efficient way of doing it.

Thanks in advance.

L. B.
  • 430
  • 3
  • 14

3 Answers3

1

You can use product from itertools package to create a generator object

from itertools import product
x = [0, 1, 2, 3, 4]
a = product(x, repeat=2)
#next(a) will print (0, 0) and so on until it's exhausted
Osman Mamun
  • 2,864
  • 1
  • 16
  • 22
1

If for some reason you need the entire table you can use uint8 dtype and a simplified version of this cartesian product code.

import numpy as np
from itertools import chain, repeat, accumulate

def cartesian_power(N, D):
    dtype = f'u{2**(((N-1).bit_length() + 7) // 8 - 1).bit_length()}'
    arr = np.empty((*repeat(N, D), D), dtype=dtype)
    arrs = *accumulate(chain((arr,), repeat(0, D)), np.ndarray.__getitem__),
    rng = np.arange(N, dtype=dtype)
    idx = slice(None), *repeat(None, D-1)
    for i in range(D-1, 0, -1):
        arrs[i][..., i] = rng[idx[:D-i]]
        arrs[i-1][1:] = arrs[i]
    arr[..., 0] = rng[idx]
    return arr.reshape(-1, D)

16^7 is no prob with this function on my 8GB laptop:

>>> cartesian_power(16, 7)
array([[ 0,  0,  0, ...,  0,  0,  0],
       [ 0,  0,  0, ...,  0,  0,  1],
       [ 0,  0,  0, ...,  0,  0,  2],
       ...,
       [15, 15, 15, ..., 15, 15, 13],
       [15, 15, 15, ..., 15, 15, 14],
       [15, 15, 15, ..., 15, 15, 15]], dtype=uint8)
Paul Panzer
  • 51,835
  • 3
  • 54
  • 99
0

As long as you want to save all the numbers in a list it is memory inefficient. Look at the calculation to get intuition about it's memory consumption.

Dimension = 7, N = 16
Total numbers generated = 16*16*...*16 (7 times) = 268435456
No. of elements used to represent a number = 7
Total numbers used = 268435456*7 = 268435456
Size used to represent an int in Python2 = 24 bytes
Total memory consumption = 268435456*24 = 6442450944 bytes = 6GB + Extra Overhead of each lists and lists of lists.

link

Instead of generating the entire sequence, you can create an generator as suggested by @mamun, but iterating over it would be very time consuming.

Community
  • 1
  • 1
Raj
  • 664
  • 4
  • 16