1
import numpy as np
# The 3D arrays have the axis: Z, X, Y
arr_keys = np.random.rand(20, 5, 5)
arr_vals = np.random.rand(20, 5, 5)
arr_idx = np.random.rand(5, 5)

For each grid cell in arr_idx, I want to look up the Z-position of the value closest to it in arr_keys (but with the same X, Y location) and return the value at the corresponding position in arr_vals array. Is there a way to do this without using nested for loops?

So, if the value at X=0, Y=0 for arr_idx is 0.5, I want to find the number closest to it at X=0, Y=0, Z ranges from 0 to 10 in arr_keys, and then I want to use the Z position of that number (lets call it Z_prime) to find the value in arr_vals (Z_prime, X=0, Y=0)

user308827
  • 21,227
  • 87
  • 254
  • 417

4 Answers4

2

This is the type of problem for which np.take_along_axis was created:

# shape (20, 5, 5)
diff = np.abs(arr_idx - arr_keys)

# argmin(..., keepdims=True) doesn't exist yet - this emulates it
# shape (1, 5, 5)
inds = np.expand_dims(np.argmin(diff, axis=0), axis=0)

# shape (1, 5, 5)
res = np.take_along_axis(arr_vals, inds, axis=0)

# shape (5, 5)
res = res.squeeze(axis=0)
Eric
  • 95,302
  • 53
  • 242
  • 374
1

I think this might work: roll the axes into the correct orientation, find the index of the value of the (absolute) minimum for each of the 5x5 X,Y values and take the corresponding Z-values from arr_vals:

idx = np.argmin(np.abs(np.rollaxis(arr_keys,0,3) - arr_idx[:,:,None]), axis=2)
i,j = np.ogrid[:5,:5]
arr_vals[idx[i,j],i,j]

To test this, try the (3,2,2) case:

In [15]: arr_keys
Out[15]: 
array([[[ 0.19681533,  0.26897784],
        [ 0.60469711,  0.09273087]],

       [[ 0.04961604,  0.3460404 ],
        [ 0.88406912,  0.41284309]],

       [[ 0.46298201,  0.33809574],
        [ 0.99604152,  0.4836324 ]]])

In [16]: arr_vals
Out[16]: 
array([[[ 0.88865681,  0.88287688],
        [ 0.3128103 ,  0.24188022]],

       [[ 0.23947227,  0.57913325],
        [ 0.85768064,  0.91701097]],

       [[ 0.78105669,  0.84144339],
        [ 0.81071981,  0.69217687]]])

In [17]: arr_idx
Out[17]: 
array([[[ 0.31352609],
        [ 0.75462329]],

       [[ 0.44445286],
        [ 0.97086161]]])

gives:

array([[ 0.88865681,  0.57913325],
       [ 0.3128103 ,  0.69217687]])
xnx
  • 24,509
  • 11
  • 70
  • 109
  • thanks! what is `np.rollaxis(arr_keys,0,3)` doing conceptually? Does the `3` change if my array shape becomes (200, 100, 50)? – user308827 Aug 31 '18 at 19:56
  • 1
    It's rolling the first axis "back" until it ends up as the last one, so (20,5,5) becomes (5,5,20): see https://docs.scipy.org/doc/numpy/reference/generated/numpy.rollaxis.html – apparently this function is to be replaced with moveaxis in NumPy version 1.11 – xnx Aug 31 '18 at 20:00
  • ah ok, and are you rolling it so that it can be compared to `arr_idx`? – user308827 Aug 31 '18 at 20:02
  • Ah, @user308827 you see this is what I meant when I said that `np.reshape` wouldn't "preserve the pattern" - reshaping the array wouldn't be the same as pushing the first axis back. Also, I didn't know about `rollaxis`, so I used the inelegant method of splitting and stacking along the last axis. – Rohan Saxena Aug 31 '18 at 20:03
  • @xnx, one more query. `idx` has the shape 5, 5. How does it look into the Z-axis in `arr_vals` when it lacks a 3rd axis? – user308827 Aug 31 '18 at 20:07
  • @user308827 I've made a correction and tried a test on a smaller case. You need NumPy's "fancy indexing" to use an array of indices to index into an array, but not in the way I had it originally. – xnx Sep 01 '18 at 05:45
1

I think @xnx's answer is pretty good. Mine is longer but I'll post it anyway ;).

Also, a note: NumPy is made to handle large multi-dimensional arrays efficiently by vectorizing the operations. So I'd suggest avoiding for loops as much as possible. Whatever the task you're looking for, there is (usually) a way to do it while avoiding loops.

arr_keys = np.split(arr_keys, 20)
arr_keys = np.stack(arr_keys, axis=-1)[0]
arr_vals = np.split(arr_vals, 20)
arr_vals = np.stack(arr_vals, axis=-1)[0]

arr_idx = np.expand_dims(arr_idx, axis=-1)

difference = np.abs(arr_keys - arr_idx)
minimum = np.argmin(difference, axis=-1)

result = np.take_along_axis(arr_vals, np.expand_dims(minimum, axis=-1), axis=-1)
result = np.squeeze(result, axis=-1)
Rohan Saxena
  • 3,133
  • 2
  • 16
  • 34
0

A little verbose than the already posted solution but easier to understand.

import numpy as np
# The 3D arrays have the axis: Z, X, Y
arr_keys = np.random.rand(20, 5, 5)
arr_vals = np.random.rand(20, 5, 5)
arr_idx = np.random.rand(5, 5)

arr_idx = arr_idx[np.newaxis, :, :]
dist = np.abs(arr_idx - arr_keys)
dist_ind = np.argmin(dist, axis=0)
x = np.arange(0, 5, 1)
y = np.arange(0, 5, 1)
xx, yy = np.meshgrid(x, y)

res = arr_vals[dist_ind, yy, xx]
Autonomous
  • 8,935
  • 1
  • 38
  • 77