
I have a function that produces an array like this:

my_array = np.array([list(str(i).zfill(4)) for i in range(10000)], dtype=int)

Which outputs:

array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 2],
       ...,
       [9, 9, 9, 7],
       [9, 9, 9, 8],
       [9, 9, 9, 9]])

As you can see, converting ints to strings and lists and then back to int is highly inefficient, and my real need is for a much larger array (a larger range). I looked into numpy for a more efficient way to generate this array, but could not find one. The best I've got so far is arange, which gives a range 0...9999 but not separated into lists of digits.

Any ideas?

Ofer Sadan

5 Answers


Here's one based on cartesian_product_broadcasted -

import numpy as np
import functools

def cartesian_product_ranges(shape, out_dtype='int'):
    arrays = [np.arange(s, dtype=out_dtype) for s in shape]
    broadcastable = np.ix_(*arrays)
    broadcasted = np.broadcast_arrays(*broadcastable)
    rows = functools.reduce(np.multiply, broadcasted[0].shape)
    cols = len(broadcasted)
    out = np.empty(rows * cols, dtype=out_dtype)
    start, end = 0, rows
    for a in broadcasted:
        out[start:end] = a.reshape(-1)
        start, end = end, end + rows
    N = len(shape)
    return np.moveaxis(out.reshape((-1,) + tuple(shape)),0,-1).reshape(-1,N)

Sample run -

In [116]: cartesian_product_ranges([3,2,4])
Out[116]: 
array([[0, 0, 0],
       [0, 0, 1],
       [0, 0, 2],
       [0, 0, 3],
       [0, 1, 0],
       [0, 1, 1],
       [0, 1, 2],
       [0, 1, 3],
       [1, 0, 0],
       [1, 0, 1],
       [1, 0, 2],
       [1, 0, 3],
       [1, 1, 0],
       [1, 1, 1],
       [1, 1, 2],
       [1, 1, 3],
       [2, 0, 0],
       [2, 0, 1],
       [2, 0, 2],
       [2, 0, 3],
       [2, 1, 0],
       [2, 1, 1],
       [2, 1, 2],
       [2, 1, 3]])

Run and timings on 10-ranged array with 4 cols -

In [119]: cartesian_product_ranges([10]*4)
Out[119]: 
array([[0, 0, 0, 0],
       [0, 0, 0, 1],
       [0, 0, 0, 2],
       ...,
       [9, 9, 9, 7],
       [9, 9, 9, 8],
       [9, 9, 9, 9]])

In [120]: cartesian_product_ranges([10]*4).shape
Out[120]: (10000, 4)

In [121]: %timeit cartesian_product_ranges([10]*4)
10000 loops, best of 3: 105 µs per loop

In [122]: %timeit np.array([list(str(i).zfill(4)) for i in range(10000)], dtype=int)
100 loops, best of 3: 16.7 ms per loop

In [123]: 16700.0/105
Out[123]: 159.04761904761904

Around 160x speedup!

For a 10-ranged array with 9 columns, we can use the lower-precision uint8 dtype -

In [7]: %timeit cartesian_product_ranges([10]*9, out_dtype=np.uint8)
1 loop, best of 3: 3.36 s per loop
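A note on why the dtype matters here (my arithmetic, not part of the original answer): with 9 columns of range 10 there are 10**9 rows, so the output is about 9 GB at one byte per element with uint8, versus roughly 72 GB with the default int64:

```python
rows, cols = 10 ** 9, 9
uint8_bytes = rows * cols       # 1 byte per element with uint8
int64_bytes = rows * cols * 8   # 8 bytes per element with the default int64
print(uint8_bytes / 1e9, int64_bytes / 1e9)  # 9.0 GB vs 72.0 GB
```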
Divakar
  • +1 for insane speedup. Also give an immediate `MemoryError` when running with too many columns, so it does not overflow like my solution. – dennlinger Jul 20 '18 at 07:38
  • 1
    @dennlinger Check out the edits and for 9 cols, edits at the end of the post. – Divakar Jul 20 '18 at 08:00
  • 2
    Wow that really is insane speedup, although the code itself isn't quite understood for me... but that's for me to learn more :) Thanks for the help! Selecting this answer purely for the insane speed, although I loved all the different answers and methods other people came up with. – Ofer Sadan Jul 20 '18 at 08:23

I would solve this with a combination of np.tile and np.repeat to assemble the columns, then np.column_stack them.

This pure Numpy solution becomes nearly a one-liner then:

n = 10000

x = np.arange(10)

a = [np.tile(np.repeat(x, 10 ** k), n // (10 ** (k + 1))) for k in range(int(np.log10(n)))]

y = np.column_stack(a[::-1]) # flip the list, first entry is rightmost row

A more verbose version, to see what happens, can be written like this:

n = 10000

x = np.arange(10)

x0 = np.tile(np.repeat(x, 1), n // 10)
x1 = np.tile(np.repeat(x, 10), n // 100)
x2 = np.tile(np.repeat(x, 100), n // 1000)
x3 = np.tile(np.repeat(x, 1000), n // 10000)

Now replace the hard-coded numbers with powers of 10 and derive the number of columns using log10, and you arrive at the one-liner above.
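As a quick sanity check (my addition, not in the original answer), the generalized one-liner reproduces the array from the question exactly:

```python
import numpy as np

n = 10000
x = np.arange(10)
a = [np.tile(np.repeat(x, 10 ** k), n // (10 ** (k + 1))) for k in range(int(np.log10(n)))]
y = np.column_stack(a[::-1])  # flip so the fastest-varying digit ends up rightmost

# compare against the construction from the question
reference = np.array([list(str(i).zfill(4)) for i in range(n)], dtype=int)
assert np.array_equal(y, reference)
```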

Speed test:

import timeit

s = """
n = 10000
x = np.arange(10)
a = [np.tile(np.repeat(x, 10 ** k), n // (10 ** (k + 1))) for k in range(int(np.log10(n)))]
y = np.column_stack(a[::-1])
"""
n_runs = 100000
t = timeit.timeit(s,
                  "import numpy as np",
                  number=n_runs)

print(t, t/n_runs)

About 260 µs on my slow machine (7 years old).

Joe
  • Yours is definitely one of the faster ones in here, I've selected one of the others simply for the insane speeds, but this also offers some readability that I and others using the code can better understand – Ofer Sadan Jul 20 '18 at 08:27
  • Did you compare the speed of the answers on your machine? – Joe Jul 20 '18 at 08:42
  • I did yes, an old machine too btw, yours was a close second – Ofer Sadan Jul 20 '18 at 10:32
  • Argh :) Let's see if there is potential for optimization – Joe Jul 20 '18 at 12:50

You can use itertools.product for this. Simply provide range(10) as the iterable, and the number of digits you want as the repeat argument.

Conveniently, the itertools iterator returns the elements in sorted order, so you do not have to perform a secondary sorting step yourself.
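To illustrate that ordering claim (a small check of my own, not from the original answer): with a sorted input iterable, product yields tuples in lexicographic order, which is exactly the digit order the question wants:

```python
import itertools

pairs = list(itertools.product(range(10), repeat=2))
# the tuples come out already in lexicographic (sorted) order
print(pairs[:3])  # [(0, 0), (0, 1), (0, 2)]
```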

Below is an evaluation of my code:

import timeit


if __name__ == "__main__":
    # time run: 14.20635
    print(timeit.timeit("np.array([list(str(i).zfill(4)) for i in range(10000)], dtype=int)",
                  "import numpy as np",
                  number=1000))

    # time run: 5.00319
    print(timeit.timeit("np.array(list(itertools.product(range(10), repeat=4)))",
                        "import itertools; import numpy as np",
                        number=1000))
dennlinger
  • Very nice solution, and about 3x faster :) – Ofer Sadan Jul 20 '18 at 06:47
  • Thanks. Generally, if you look for performance, itertools is the way to go! You generally sacrifice a bit of readability for the sake of performance, but in your case even that is not too much of an issue! Also note that this will scale slightly better than your provided method (try it with, e.g., 10k elements instead). – dennlinger Jul 20 '18 at 06:48
  • That's what i'm testing now, because the solution i'm looking for has to be in 9 digits (instead of 4) so i'm trying out the difference now – Ofer Sadan Jul 20 '18 at 06:49
  • sadly trying that with 9 digits causes a `MemoryError`, which makes sense... i'll try to test it in other ways – Ofer Sadan Jul 20 '18 at 06:54
  • You are correct. I just tried it on a 48GB RAM machine, and it still wasn't enough to complete the run. Doing the same with 8 digits runs in ~100 seconds. – dennlinger Jul 20 '18 at 07:01

A fast solution is to use np.meshgrid to create all the columns, then sort the columns (on element 123, for instance, for four digits) so that they are in the right order, and finally make an array out of them.

import numpy as np

n_digits = 4
digits = np.arange(10)
columns = [c.ravel() for c in np.meshgrid(*[digits] * n_digits)]
# sort in place: element 123 (the index pattern 0123) orders the columns correctly
columns.sort(key=lambda x: x[int("".join(str(d) for d in range(n_digits)))])
out_array = np.array(columns).T
np.all(out_array == my_array)  # True, with my_array from the question
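As an aside (my addition, not part of the answer): passing indexing='ij' to np.meshgrid yields the columns already in the required order, so the sort step can be skipped entirely:

```python
import numpy as np

n_digits = 4
digits = np.arange(10)
# with matrix ('ij') indexing, grid i varies along axis i, so the raveled
# columns already come out slowest-varying first
cols = [c.ravel() for c in np.meshgrid(*[digits] * n_digits, indexing='ij')]
out = np.array(cols).T

reference = np.array([list(str(i).zfill(n_digits)) for i in range(10 ** n_digits)], dtype=int)
assert np.array_equal(out, reference)
```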
kuppern87

There are other one-liners to solve this

import numpy as np
y = np.array([index for index in np.ndindex(10, 10, 10, 10)])

This seems to be much slower.

Or

import numpy as np
from sklearn.utils.extmath import cartesian

x = np.arange(10)
y = cartesian((x, x, x, x))

This seems to be slightly slower than the accepted answer.

Joe