1

I have a list pts containing N points (Python floats). I wish to construct a NumPy array of dimension N*N*N*3 such that the array is equivalent to:

for i in xrange(0, N):
    for j in xrange(0, N):
        for k in xrange(0, N):
            arr[i,j,k,0] = pts[i]
            arr[i,j,k,1] = pts[j]
            arr[i,j,k,2] = pts[k]

I am wondering how I can exploit the array broadcasting rules of NumPy and functions such as tile to simplify this.

Freddie Witherden
  • 2,369
  • 1
  • 26
  • 41

3 Answers3

3

I think that the following should work:

pts = np.array(pts)  #Skip if pts is a numpy array already
lp = len(pts)
arr = np.zeros((lp,lp,lp,3))
arr[:,:,:,0] = pts[:,None,None]  #None is the same as np.newaxis
arr[:,:,:,1] = pts[None,:,None]
arr[:,:,:,2] = pts[None,None,:]

A quick test:

import numpy as np
import timeit

def meth1(pts):
   pts = np.array(pts)  #Skip if pts is a numpy array already
   lp = len(pts)
   arr = np.zeros((lp,lp,lp,3))
   arr[:,:,:,0] = pts[:,None,None]  #None is the same as np.newaxis
   arr[:,:,:,1] = pts[None,:,None]
   arr[:,:,:,2] = pts[None,None,:]
   return arr

def meth2(pts):
   lp = len(pts)
   N = lp
   arr = np.zeros((lp,lp,lp,3))
   for i in xrange(0, N):
      for j in xrange(0, N):
         for k in xrange(0, N):
            arr[i,j,k,0] = pts[i]
            arr[i,j,k,1] = pts[j]
            arr[i,j,k,2] = pts[k]

   return arr

pts = range(10)
a1 = meth1(pts)
a2 = meth2(pts)

print np.all(a1 == a2)

NREPEAT = 10000
print timeit.timeit('meth1(pts)','from __main__ import meth1,pts',number=NREPEAT)
print timeit.timeit('meth2(pts)','from __main__ import meth2,pts',number=NREPEAT)

results in:

True
0.873255968094   #my way
11.4249279499    #original

So this new method is an order of magnitude faster as well.

mgilson
  • 300,191
  • 65
  • 633
  • 696
1
import numpy as np
N = 10
pts = xrange(0,N)
l = [ [ [ [ pts[i],pts[j],pts[k] ]  for k in xrange(0,N) ] for j in xrange(0,N) ] for i in xrange(0,N) ]
x = np.array(l, np.int32)
print x.shape # (10,10,10,3)
lucasg
  • 10,734
  • 4
  • 35
  • 57
1

This can be done in two lines:

def meth3(pts):
    arrs = np.broadcast_arrays(*np.ix_(pts, pts, pts))
    return np.concatenate([a[...,None] for a in arrs], axis=3)

However, this method is not as fast as mgilson's answer, because concatenate is annoyingly slow. A generalized version of his answer performs roughly as well, though, and can generate the result you want (i.e. an n-dimensional cartesian product contained within an n-dimensional grid) for any set of arrays.

def meth4(arrs):     # or meth4(*arrs) for a simplified interface
    arr = np.empty([len(a) for a in arrs] + [len(arrs)])
    for i, a in enumerate(np.ix_(*arrs)):
        arr[...,i] = a
    return arr

This accepts any sequence of sequences, as long as it can be converted into a sequence of numpy arrays:

>>> meth4([[0, 1], [2, 3]])
array([[[ 0.,  2.],
        [ 0.,  3.]],

       [[ 1.,  2.],
        [ 1.,  3.]]])

And the cost of this generality isn't too high -- it's only twice as slow for small pts arrays:

>>> (meth4([pts, pts, pts]) == meth1(pts)).all()
True
>>> %timeit meth4([pts, pts, pts])
10000 loops, best of 3: 27.4 us per loop
>>> %timeit meth1(pts)
100000 loops, best of 3: 13.1 us per loop

And it's actually a bit faster for larger ones (although the speed gain is probably due to my use of empty instead of zeros):

>>> pts = np.linspace(0, 1, 100)
>>> %timeit meth4([pts, pts, pts])
100 loops, best of 3: 13.4 ms per loop
>>> %timeit meth1(pts)
100 loops, best of 3: 16.7 ms per loop
Community
  • 1
  • 1
senderle
  • 145,869
  • 36
  • 209
  • 233
  • very nice. I always find that `numpy.ix_` is a little too dense for me to really grok (though I haven't tried to wrap my head around it too hard). To generalize, I would find it easier to construct tuples of length N, filled with `None` at every element except "i" (which would be `slice()`). I would use that to index `pts` and then use ellipsis on the LHS as you did. e.g. `idx = tuple(None if i==j else slice() for j in xrange(N)); arr[...,i] = pts[idx]`. – mgilson Oct 18 '12 at 15:40
  • It took me a while to grok `numpy.ix_` as well, until one day I accidentally reinvented it. But it's basically exactly what you do above; `ix_(pts, pts, pts)` just returns the tuple `(pts[:,None,None], pts[None,:,None], pts[None,None,:])`. – senderle Oct 18 '12 at 15:52
  • 1
    Oh -- Well then, I suppose that makes it easier for me to remember as well since I basically reinvented it as well ... Now I understand your solution better (Thanks). From an API standpoint, I'd probably define `meth4` as `def meth4(*arrs)` though. Then you call it as `meth4(pts,pts,pts)` instead of `meth4((pts,pts,pts))`. It just seems a little cleaner to me. – mgilson Oct 18 '12 at 16:01
  • What prevents arr[...,:] = np.ix_(*arrs) from working in place of an explicit for loop? (I've tried it and get an error regarding assigning an array element with a sequence.) – Freddie Witherden Oct 19 '12 at 23:00
  • @FreddieWitherden, the basic answer is that the value returned by `np.ix_(*arrs)` is not a `numpy` array. It's a tuple of arrays, each of which has a different shape, so `numpy` can't work with it directly. Furthermore, given `ix = np.ix_(*arrs)`, `arr[...,i] = ix[0]` does something quite different from `arr[...,i] = ix[1]`, because `ix[0]` and `ix[1]` have different shapes, and so the broadcasting rules are applied differently. Finally, `arr[...,:]` and `arr[:,:,:,:]` are the same; `numpy` doesn't distinguish strongly between ellipses and slices, so it can't tell which axis to run over. – senderle Oct 20 '12 at 01:45