6

I have a 3D array and I need to "squeeze" it over the last axis, so that I get a 2D array. I need to do it in the following way: for each pair of indices along the first two dimensions, I know the index along the third dimension from which the value should be taken.

For example, I know that if i1 == 2 and i2 == 7 then i3 == 11. It means that out[2,7] = inp[2,7,11]. This mapping from the first two dimensions to the third one is given in another 2D array. In other words, I have an array that holds the value 11 at position 2,7.

So, my question is how to combine these two arrays (3D and 2D) to get the output array (2D).

Saullo G. P. Castro
Roman

4 Answers

2
In [635]: arr = np.arange(24).reshape(2,3,4)
In [636]: idx = np.array([[1,2,3],[0,1,2]])


In [637]: I,J = np.ogrid[:2,:3]
In [638]: arr[I,J,idx]
Out[638]: 
array([[ 1,  6, 11],
       [12, 17, 22]])
In [639]: arr
Out[639]: 
array([[[ 0,  1,  2,  3],   # 1
        [ 4,  5,  6,  7],   # 6
        [ 8,  9, 10, 11]],  # 11

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

I,J broadcast together to select a (2,3) set of values, matching idx:

In [640]: I
Out[640]: 
array([[0],
       [1]])
In [641]: J
Out[641]: array([[0, 1, 2]])
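To make the broadcasting step concrete, here is a minimal sketch that expands I, J and idx to their common shape and compares the indexed result with an explicit loop. The names I_full, J_full, K_full and expected are introduced only for this illustration.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
idx = np.array([[1, 2, 3], [0, 1, 2]])
I, J = np.ogrid[:2, :3]              # shapes (2, 1) and (1, 3)

# Broadcasting expands all three index arrays to the common shape (2, 3).
I_full, J_full, K_full = np.broadcast_arrays(I, J, idx)

# Element by element, arr[I, J, idx] reads
# arr[I_full[i, j], J_full[i, j], K_full[i, j]].
expected = np.empty(idx.shape, dtype=arr.dtype)
for i in range(idx.shape[0]):
    for j in range(idx.shape[1]):
        expected[i, j] = arr[I_full[i, j], J_full[i, j], K_full[i, j]]

result = arr[I, J, idx]
print(np.array_equal(result, expected))
```

The loop is only there to spell out what the single indexing expression does; it is not meant as a replacement for it.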

This is a generalization to 3d of the easier 2d problem - selecting one item from each row:

In [649]: idx
Out[649]: 
array([[1, 2, 3],
       [0, 1, 2]])
In [650]: idx[np.arange(2), [0,1]]
Out[650]: array([1, 1])

In fact we could convert the 3d problem into a 2d one:

In [655]: arr.reshape(6,4)[np.arange(6), idx.ravel()]
Out[655]: array([ 1,  6, 11, 12, 17, 22])
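The flattened result can be brought back to the shape of idx with one more reshape. A small sketch of the whole round trip, where n_rows, flat and out are names chosen for this example:

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)
idx = np.array([[1, 2, 3], [0, 1, 2]])

n_rows = idx.size                          # 6 flattened (i, j) pairs
# Treat every (i, j) pair as one row and pick one item per row.
flat = arr.reshape(n_rows, arr.shape[-1])[np.arange(n_rows), idx.ravel()]
out = flat.reshape(idx.shape)              # back to shape (2, 3)
print(out)
```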

Generalizing the original case:

In [55]: arr = np.arange(24).reshape(2,3,4)                                     
In [56]: idx = np.array([[1,2,3],[0,1,2]])                                      
In [57]: IJ = np.ogrid[[slice(i) for i in idx.shape]]                           
In [58]: IJ                                                                     
Out[58]: 
[array([[0],
        [1]]), array([[0, 1, 2]])]
In [59]: (*IJ,idx)                                                              
Out[59]: 
(array([[0],
        [1]]), array([[0, 1, 2]]), array([[1, 2, 3],
        [0, 1, 2]]))
In [60]: arr[_]                                                                 
Out[60]: 
array([[ 1,  6, 11],
       [12, 17, 22]])

The key is in combining the IJ list of arrays with the idx to make a new indexing tuple. Constructing the tuple is a little messier if idx isn't the last index, but it's still possible. E.g.

In [61]: (*IJ[:-1],idx,IJ[-1])                                                  
Out[61]: 
(array([[0],
        [1]]), array([[1, 2, 3],
        [0, 1, 2]]), array([[0, 1, 2]]))
In [62]: arr.transpose(0,2,1)[_]                                                
Out[62]: 
array([[ 1,  6, 11],
       [12, 17, 22]])

Or, if it's easier, transpose arr so that the idx dimension is last. The key is that the index operation takes a tuple of index arrays, arrays which broadcast against each other to select specific items. That's what ogrid is doing: creating the arrays that work with idx.
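The generalization can be wrapped in a small helper. This is a sketch only: the function name pick_along_last_axis is hypothetical, and it uses np.indices(..., sparse=True) (available in NumPy 1.17 and later) in place of np.ogrid to build the open grids. For comparison, the same selection is also written with np.take_along_axis, which is a different technique from the one described above.

```python
import numpy as np

def pick_along_last_axis(arr, idx):
    """Return out[...] = arr[..., idx[...]]; idx must have shape arr.shape[:-1]."""
    idx = np.asarray(idx)
    if arr.shape[:-1] != idx.shape:
        raise ValueError("idx.shape must equal arr.shape[:-1]")
    # Open (broadcastable) index grids for the leading axes, like np.ogrid.
    grids = np.indices(idx.shape, sparse=True)
    return arr[(*grids, idx)]

arr = np.arange(24).reshape(2, 3, 4)
idx = np.array([[1, 2, 3], [0, 1, 2]])
out = pick_along_last_axis(arr, idx)

# Alternative: np.take_along_axis needs idx to have the same ndim as arr.
alt = np.take_along_axis(arr, idx[..., np.newaxis], axis=-1)[..., 0]
print(np.array_equal(out, alt))
```

Because the grids are built from idx.shape, the helper should work unchanged for an idx of shape (3, 3) or (2, 3, 4), as long as arr has exactly one more trailing dimension.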

hpaulj
  • @hpaulf How would I use this solution for an 'idx' of arbitrary dimension? That is, this works when idx is 2x3. What if I have a function that I need to accommodate the case 'idx' is 3x3 or 2x3x4. Note: in my example 'arr' always has one more dimension, and the values of 'idx' are valid indices. I know I can get the dimensions using idx.shape, but I guess I need to then translate that resulting tuple into the form ' :d1,:d2,:d3, ... ' where 'd1' is the first dimension size, 'd2' is the second, and so on. – profj Feb 27 '19 at 22:25
  • I added a generalization. – hpaulj Feb 27 '19 at 23:17
1
import numpy as np

inp = np.random.random((20, 10, 5)) # simulate some input
i1, i2 = np.indices(inp.shape[:2])
# i3 must have the same shape as i1 and i2, i.e. inp.shape[:2]
i3 = np.random.randint(0, 5, size=inp.shape[:2]) # or implement whatever mapping
                                                 # you want between (i1,i2) and i3
out = inp[(i1, i2, i3)]

See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details

AGN Gazer
0

Using numpy.einsum

This can be achieved by a combination of array indexing and usage of numpy.einsum:

>>> numpy.einsum('ijij->ij', inp[:, :, indices])

inp[:, :, indices] creates a four-dimensional array: for every pair of the first two indices, all entries of the index array are applied to the third dimension. Because the index array is two-dimensional, the result is 4D. However, you only want the entry of the index array that corresponds to the same position in the first two dimensions. This is achieved with the string ijij->ij, which tells einsum to select only those elements where the indices of the 1st and 3rd axes are equal and the indices of the 2nd and 4th axes are equal. Because the last two dimensions (3rd and 4th) were added by the index array, this amounts to selecting only the index indices[i, j] for the third dimension of inp.

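Note that this method can really blow up the memory consumption: inp[:, :, indices].size is (inp.shape[0] * inp.shape[1]) ** 2, so the intermediate array grows quadratically with the number of output elements. Especially if inp.shape[:2] is much greater than inp.shape[2], this is far larger than inp.size.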
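A short sketch that makes the size of the intermediate array visible and checks the einsum result against plain integer-array indexing. The shapes and the random data are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
inp = rng.random((20, 10, 5))
indices = rng.integers(0, inp.shape[2], size=inp.shape[:2])

intermediate = inp[:, :, indices]            # shape (20, 10, 20, 10)
out = np.einsum('ijij->ij', intermediate)    # keep only the "diagonal" entries

# Reference result without the 4D intermediate.
I, J = np.ogrid[:inp.shape[0], :inp.shape[1]]
direct = inp[I, J, indices]

print(inp.size, intermediate.size)           # 1000 vs 40000 elements
print(np.allclose(out, direct))
```

Here the input has 1000 elements while the intermediate has (20 * 10) ** 2 = 40000, which is why this variant is probably best kept for small leading dimensions.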

Building the indices manually

First we prepare the new index array:

>>> idx = numpy.array(list(
...     numpy.ndindex(*inp.shape[:2], 1)  # Python 3 syntax
... ))

Then we update the column which corresponds to the third axis:

>>> idx[:, 2] = indices[idx[:, 0], idx[:, 1]]

Now we can select the elements and simply reshape the result:

>>> inp[tuple(idx.T)].reshape(*inp.shape[:2])
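Put together, the three steps can be run end to end as follows. This is a sketch with made-up shapes and random data; the explicit loop at the end only serves as a cross-check.

```python
import numpy as np

rng = np.random.default_rng(0)
inp = rng.random((4, 3, 5))
indices = rng.integers(0, inp.shape[2], size=inp.shape[:2])

# Step 1: one row (i, j, 0) per output element, in C order.
idx = np.array(list(np.ndindex(*inp.shape[:2], 1)))
# Step 2: fill in the third-axis index for every (i, j).
idx[:, 2] = indices[idx[:, 0], idx[:, 1]]
# Step 3: select and restore the 2D shape.
out = inp[tuple(idx.T)].reshape(inp.shape[:2])

# Cross-check against the definition out[i, j] = inp[i, j, indices[i, j]].
ok = all(out[i, j] == inp[i, j, indices[i, j]]
         for i in range(inp.shape[0]) for j in range(inp.shape[1]))
print(ok)
```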

Using numpy.choose

Note: numpy.choose allows a maximum size of 32 for the axis which is chosen from.


According to this answer and the documentation of numpy.choose we can also use the following:

# First we need to bring the last axis to the front because
# `numpy.choose` chooses from the first axis.
>>> new_inp = numpy.moveaxis(inp, -1, 0)
# Now we can select the elements.
>>> numpy.choose(indices, new_inp)

Although the documentation discourages the use of a single array for the 2nd argument (the choices)

To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.

this seems to be the case only for preventing misunderstandings:

choices : sequence of arrays

Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”.

So from my point of view there's nothing wrong with using numpy.choose that way, as long as one is aware of what they're doing.
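If you prefer to follow the documentation's recommendation, the array can be split into a list of choice arrays first. A minimal sketch, assuming the last axis has at most 32 entries as noted above; the shapes and data are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
inp = rng.random((4, 3, 5))
indices = rng.integers(0, inp.shape[2], size=inp.shape[:2])

# Iterating over the moved array yields one (4, 3) array per third-axis index.
choices = list(np.moveaxis(inp, -1, 0))
out = np.choose(indices, choices)

# Same selection with integer-array indexing, for comparison.
I, J = np.ogrid[:inp.shape[0], :inp.shape[1]]
print(np.array_equal(out, inp[I, J, indices]))
```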

a_guest
-1

I believe this should do it:

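# Assumes n and m are the sizes of the first two dimensions of input_3d,
# and that out_2d is a preallocated n-by-m container (e.g. a list of lists).
for i in range(n):
    for j in range(m):
        k = index_mapper[i][j]          # third-axis index for (i, j)
        out_2d[i][j] = input_3d[i][j][k]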
havanagrawal
  • I didn't vote on this. But you are just wrapping the OPs `out[i,j] = inp[i,j,k]` in 2 loops. It's the right approach for nested lists, But the indexing style and tag indicate that the OP wants a numpy array solution. – hpaulj Jul 02 '17 at 17:12