84

Is there a function to get an iterator over an arbitrary dimension of a numpy array?

Iterating over the first dimension is easy...

In [63]: c = numpy.arange(24).reshape(2,3,4)

In [64]: for r in c :
   ....:     print(r)
   ....: 
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]

But iterating over other dimensions is harder. For example, the last dimension:

In [73]: for r in c.swapaxes(2,0).swapaxes(1,2) :
   ....:     print(r)
   ....: 
[[ 0  4  8]
 [12 16 20]]
[[ 1  5  9]
 [13 17 21]]
[[ 2  6 10]
 [14 18 22]]
[[ 3  7 11]
 [15 19 23]]

I'm making a generator to do this myself, but I'm surprised there isn't a function named something like numpy.ndarray.iterdim(axis=0) to do this automatically.

AFoglia

6 Answers

74

What you propose is quite fast, but the legibility can be improved with the clearer forms:

for i in range(c.shape[-1]):
    print(c[:,:,i])

or, better (faster, more general and more explicit):

for i in range(c.shape[-1]):
    print(c[...,i])

However, in the timings below, the first approach above (c[:,:,i]) takes about 1.6 times as long as the swapaxes() approach:

python -m timeit -s 'import numpy; c = numpy.arange(24).reshape(2,3,4)' \
    'for r in c.swapaxes(2,0).swapaxes(1,2): u = r'
100000 loops, best of 3: 3.69 usec per loop

python -m timeit -s 'import numpy; c = numpy.arange(24).reshape(2,3,4)' \
    'for i in range(c.shape[-1]): u = c[:,:,i]'
100000 loops, best of 3: 6.08 usec per loop

python -m timeit -s 'import numpy; c = numpy.arange(24).reshape(2,3,4)' \
    'for r in numpy.rollaxis(c, 2): u = r'
100000 loops, best of 3: 6.46 usec per loop

I would guess that this is because swapaxes() does not copy any data, and because the handling of c[:,:,i] might be done through general code (that handles the case where : is replaced by a more complicated slice).
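The guess above can be probed with a small check. This is only a sketch, not part of the original timings: it shows that both the swapaxes() result and the c[:,:,i] slice are views that share memory with c, so copying does not seem to be what separates the two approaches, which leaves per-iteration indexing overhead as the more plausible cause.

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

swapped = c.swapaxes(2, 0).swapaxes(1, 2)   # shape (4, 2, 3)
sliced = c[:, :, 0]                         # shape (2, 3)

# Both results are views on c: no data is copied in either case.
views_share_memory = np.shares_memory(c, swapped) and np.shares_memory(c, sliced)

# Writing through one view is visible in c and in the other view.
sliced[0, 0] = -1
```

After the assignment, c[0, 0, 0] and swapped[0, 0, 0] both read -1, which is what one would expect from views on the same buffer.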

Note however that the more explicit second solution c[...,i] is both quite legible and quite fast:

python -m timeit -s 'import numpy; c = numpy.arange(24).reshape(2,3,4)' \
    'for i in range(c.shape[-1]): u = c[...,i]'
100000 loops, best of 3: 4.74 usec per loop
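To illustrate why c[...,i] is the more general form: the Ellipsis stands for as many full slices as the array needs, so the same loop works whatever the number of dimensions. A minimal sketch, where last_axis_slices is a hypothetical helper name used only for this example:

```python
import numpy as np

def last_axis_slices(a):
    """Hypothetical helper: list the sub-arrays along the last axis of `a`."""
    return [a[..., i] for i in range(a.shape[-1])]

c2 = np.arange(6).reshape(2, 3)            # 2-D input
c4 = np.arange(120).reshape(2, 3, 4, 5)    # 4-D input

slices2 = last_axis_slices(c2)   # 3 arrays of shape (2,)
slices4 = last_axis_slices(c4)   # 5 arrays of shape (2, 3, 4)
```

With c[:,:,i] the number of colons would have to be adapted to each array.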
Eric O. Lebigot
  • 1
    If the goal is to iterate the last dimension, why not use `for r in c.T` or more generally [`c.transpose`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.transpose.html)? Also, since numpy 1.11, it should be possible to use [`np.moveaxis(c, dim, 0)`](https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.moveaxis.html). – Mr Tsjolder from codidact Aug 02 '18 at 10:15
  • Transpose does not generally give the desired behavior, because all the axes are reversed (the next-to-last axis becomes the second one, etc.), so you would need another transpose on each result: the double transposition is an unnecessary hop. `moveaxis` sounds fine, but the transform adds one code line: it's not clear that it's worth moving the last axis to the front, since NumPy directly gives you access to it (through the code in this answer). – Eric O. Lebigot Aug 02 '18 at 10:25
  • "moveaxis sounds fine, but the transform adds one code line: it's not clear that it's worth moving the last axis in front" How so? It would be just `for u in np.moveaxis(c, -1, 0)`. In fact, I tested `python -m timeit -s 'import numpy; c = numpy.zeros((32, 32, 1024))' 'for i in range(c.shape[-1]): u = c[...,i]'` against `python -m timeit -s 'import numpy; c = numpy.zeros((32, 32, 1024))' 'for u in numpy.moveaxis(c, -1, 0): pass'` and the latter was twice as fast on a Ryzen 3900 CPU. – Hyperplane Jun 14 '21 at 20:38
34

I'd use the following:

c = numpy.arange(2 * 3 * 4)
c.shape = (2, 3, 4)

for r in numpy.rollaxis(c, 2):
    print(r)

The function rollaxis creates a new view on the array. In this case it's moving axis 2 to the front, equivalent to the operation c.transpose(2, 0, 1).
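The equivalence stated above can be checked directly. A small sketch (the comparison with numpy.moveaxis is an addition for illustration, not something this answer relies on):

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)

rolled = np.rollaxis(c, 2)            # axis 2 moved to the front
transposed = c.transpose(2, 0, 1)     # the explicit permutation
moved = np.moveaxis(c, 2, 0)          # same result via moveaxis

same = np.array_equal(rolled, transposed) and np.array_equal(rolled, moved)
is_view = np.shares_memory(c, rolled)  # rollaxis returns a view, not a copy
```

All three results have shape (4, 2, 3), and iterating over any of them yields c[:, :, 0], c[:, :, 1], and so on.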

Eryk Sun
9

So, one can iterate over the first dimension easily, as you've shown. Another way to do this for an arbitrary dimension is to use numpy.rollaxis() to bring the given dimension to the front (its default destination), and then iterate over the returned array (which is a view, so this is fast).

In [1]: array = numpy.arange(24).reshape(2,3,4)

In [2]: for array_slice in numpy.rollaxis(array, 1):
   ....:     print(array_slice.shape)
   ....:
(2, 4)
(2, 4)
(2, 4)

EDIT: I'll comment that I submitted a PR to numpy to address this here: https://github.com/numpy/numpy/pull/3262. The consensus was that this wasn't enough to add to the numpy codebase. I think using np.rollaxis is the best way to do this, and if you want an iterator, wrap it in iter().
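As a sketch of the iter() suggestion (variable names here are only for illustration): wrapping the rolled view gives an explicit iterator that can be advanced with next() or consumed piecemeal.

```python
import numpy as np

array = np.arange(24).reshape(2, 3, 4)

slices = iter(np.rollaxis(array, 1))   # explicit iterator over axis 1

first = next(slices)                   # same values as array[:, 0, :]
remaining = list(slices)               # the two slices that are left
```

A plain for loop over np.rollaxis(array, 1) does the same thing implicitly, so iter() only matters when the iterator object itself is needed.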

giessel
5

I guess there is no function. When I wrote my function, I ended up taking the iteration EOL also suggested. For future readers, here it is:

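import numpy

def iterdim(a, axis=0):
    """Yield the sub-arrays of `a` along `axis` (assumes axis >= 0)."""
    a = numpy.asarray(a)
    # One full slice (:) for every axis that precedes `axis`.
    leading_indices = (slice(None),) * axis
    for i in range(a.shape[axis]):
        yield a[leading_indices + (i,)]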
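To make the index-tuple trick concrete, here is a small check (a sketch with illustrative variable names) that the tuple built from leading slices selects the same data as writing the colons by hand, and the same data as numpy.moveaxis:

```python
import numpy as np

c = np.arange(24).reshape(2, 3, 4)
axis = 1

# For axis=1 this is (slice(None),), i.e. a single leading ":".
leading = (slice(None),) * axis

by_tuple = [c[leading + (i,)] for i in range(c.shape[axis])]
by_hand = [c[:, i, :] for i in range(c.shape[axis])]
by_moveaxis = list(np.moveaxis(c, axis, 0))

all_equal = all(
    np.array_equal(t, h) and np.array_equal(t, m)
    for t, h, m in zip(by_tuple, by_hand, by_moveaxis)
)
```

Trailing axes need no explicit slice, since NumPy treats missing trailing indices as full slices.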
AFoglia
  • The standard NumPy syntax `a[..., i]` would be lighter and would remove the need for `leading_indices`. – Eric O. Lebigot Jul 07 '17 at 17:40
  • 2
    @EOL but that would work only for the last axis, with leading_indices it's more general... – lukas Nov 14 '17 at 11:05
  • Good point @lukas: the initial question indeed mentions iterating "over an arbitrary dimension", while I had in mind iterating over the last dimension. – Eric O. Lebigot Nov 14 '17 at 21:05
5

The following is exactly what you are looking for:

for y in np.moveaxis(x, axis, 0):
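For a runnable version, here is a minimal sketch that wraps the loop in a helper. The name iterdim is borrowed from the function the question wishes for; it is hypothetical and not part of NumPy.

```python
import numpy as np

def iterdim(a, axis=0):
    """Hypothetical helper: iterate over the sub-arrays of `a` along `axis`."""
    # moveaxis returns a view with `axis` moved to the front; iterating over
    # an array walks its first axis.
    return iter(np.moveaxis(np.asarray(a), axis, 0))

x = np.arange(24).reshape(2, 3, 4)

shapes = [y.shape for y in iterdim(x, axis=1)]   # three slices of shape (2, 4)
last_first = next(iterdim(x, axis=-1))           # negative axes work too
```

Because moveaxis accepts negative axes, axis=-1 iterates over the last dimension without knowing the number of dimensions in advance.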
Hyperplane
3

You can use numpy.shape to get dimensions, and then range to iterate over them.

n0, n1, n2 = numpy.shape(c)

for r in range(n0):
    print(c[r,:,:])
  • This misses the point of the OP. They want to be able to iterate over arbitrary dimensions; this code only allows iterating over the first one (`n0`). – Brett May 30 '23 at 15:15
  • Yeah, but you can adapt the concept to iterate over the other dimensions, as: for r in range(n1): print(c[:,r,:]) ; for r in range(n2): print(c[:,:,r]) – barbedorafael Jun 30 '23 at 10:28