How to get a 2D array containing indices of another 2D array

Question

Problem

import numpy as np

I have an array and no prior information about its contents. For example:

ourarray = \
np.array([[0,1],
          [2,3],
          [4,5]])

I want to get the pairs of numbers that can be used to index ourarray, i.e. I want to get:

array([[0, 0, 1, 1, 2, 2],
       [0, 1, 0, 1, 0, 1]])

(Each column is one index pair: (0,0), (0,1), (1,0), etc. Together they cover every possible index of ourarray.)

Attempt 1 (Successful but inefficient)

I can get this array by:

np.array(np.where(np.ones(ourarray.shape)))

This gives the desired result, but it requires creating np.ones(ourarray.shape), which does not seem efficient.

Attempt 2 (Failed)

I also tried:

np.array(np.where(ourarray))

This does not work because no indices are returned for the 0 entry of ourarray.

Question

Attempt 1 works, but I am looking for a more efficient way. How can I do this more efficiently?

I'mahdi · Answer 1 · 2021-10-06T12:28:49.223

2

You can use numpy.argwhere and then transpose the result with .T to get what you want.

try this:

>>> ourarray = np.array([[0,1],[2,3], [4,5]])
>>> np.argwhere(ourarray>=0).T
array([[0, 0, 1, 1, 2, 2],
       [0, 1, 0, 1, 0, 1]])

If the array may contain arbitrary values (negative numbers, NaN, inf), you can use this instead:

ourarray = np.array([[np.nan,1],[2,np.inf], [-4,-5]])
np.argwhere(np.ones(ourarray.shape)==1).T
# array([[0, 0, 1, 1, 2, 2],
#        [0, 1, 0, 1, 0, 1]])
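The difference between the two masks only shows up once the array holds negative values or NaN. A small check, using a made-up example array `arr` (the names `partial` and `full` are illustrative only):

```python
import numpy as np

# Hypothetical example array with NaN, inf and negative entries
arr = np.array([[np.nan, 1], [2, np.inf], [-4, -5]])

# Comparison mask: NaN >= 0 and negative >= 0 are False, so those positions are dropped
partial = np.argwhere(arr >= 0).T

# All-True mask: independent of the values, so every position is returned
full = np.argwhere(np.ones(arr.shape, dtype=bool)).T

print(partial.shape)  # (2, 3)
print(full.shape)     # (2, 6)
```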

edited Oct 06 '21 at 12:28

answered Oct 06 '21 at 12:16


Thank you. What would you use if I don't know anything about the array (I don't), so if some values could be negative? – zabop Oct 06 '21 at 12:17
Or not even negative, but NaN, for example, anything. – zabop Oct 06 '21 at 12:17
1

Thank you, awesome! I'll leave the question open for a while in case someone has a solution without creating an array with np.ones (like this post does or my question, see Attempt 1). But in any case, definitely worth my +1, and if there isn't a more attractive solution, I'll accept this answer. – zabop Oct 06 '21 at 12:34
1

`argwhere` is `transpose(where(...))`, turning the tuple into an array. This doesn't address the OP's efficiency worry, since you still create the full `ones` or equivalent. – hpaulj Oct 06 '21 at 14:49
@hpaulj thanks for your comment. I'm here to learn, and I agree with your point. If you can help me find a better approach, or post your own answer, I'd appreciate it. – I'mahdi Oct 06 '21 at 15:00

hpaulj · Accepted Answer · 2021-10-06T16:49:26.983

How do you intend to use this index?

The tuple produced by nonzero (where) is designed for convenient indexing:

In [54]: idx = np.nonzero(np.ones_like(ourarray))
In [55]: idx
Out[55]: (array([0, 0, 1, 1, 2, 2]), array([0, 1, 0, 1, 0, 1]))
In [56]: ourarray[idx]
Out[56]: array([0, 1, 2, 3, 4, 5])

or equivalently using the 2 arrays explicitly:

In [57]: ourarray[idx[0], idx[1]]
Out[57]: array([0, 1, 2, 3, 4, 5])

Your np.array(idx) can be used as in [57] but not as in [56]. The use of a tuple in [56] is important.
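To illustrate why the tuple matters, here is a short sketch comparing the two forms. The variable names `idx_arr`, `by_tuple`, `by_array` and `restored` are introduced here only for illustration:

```python
import numpy as np

ourarray = np.array([[0, 1], [2, 3], [4, 5]])
idx = np.nonzero(np.ones_like(ourarray))   # tuple of two 1-D index arrays
idx_arr = np.array(idx)                    # a single (2, 6) array

# Tuple: one index array per axis, so each (row, col) pair picks one element
by_tuple = ourarray[idx]

# Single array: treated as indices into axis 0 only, so whole rows are picked
by_array = ourarray[idx_arr]

# Converting back to a tuple restores per-axis indexing
restored = ourarray[tuple(idx_arr)]

print(by_tuple.shape, by_array.shape, restored.shape)  # (6,) (2, 6, 2) (6,)
```

So if the indices are stored as an array, wrapping them in `tuple(...)` at the point of use gives the same behaviour as the original `nonzero` result.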

If we apply transpose to this we get an array.

In [58]: tidx = np.transpose(idx)
In [59]: tidx
Out[59]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1],
       [2, 0],
       [2, 1]])

To use that for indexing we have to iterate:

In [60]: [ourarray[i,j] for i,j in tidx]
Out[60]: [0, 1, 2, 3, 4, 5]
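If the Python-level loop is a concern, the pair-per-row layout can also be split back into one index array per axis. A minimal sketch, assuming the same `ourarray` as above (`values` and `values_nd` are names chosen here):

```python
import numpy as np

ourarray = np.array([[0, 1], [2, 3], [4, 5]])
tidx = np.argwhere(np.ones_like(ourarray))  # (6, 2) array of (row, col) pairs

# Pass each column as a separate index array and index in a single step
values = ourarray[tidx[:, 0], tidx[:, 1]]

# Equivalent form that works for any number of dimensions
values_nd = ourarray[tuple(tidx.T)]

print(values)  # [0 1 2 3 4 5]
```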

argwhere, as proposed in the other answer, is just the transpose. Using ourarray>=0 is really no different from the np.ones expression: for this example, both make an array that is True/1 for all elements.

In [61]: np.argwhere(np.ones_like(ourarray))
Out[61]: 
array([[0, 0],
       [0, 1],
       [1, 0],
       [1, 1],
       [2, 0],
       [2, 1]])

There are other ways of generating indices (np.indices, np.meshgrid, np.mgrid, np.ndindex), but they require some sort of reshaping and/or transposing to get exactly what you want:

In [71]: np.indices(ourarray.shape)
Out[71]: 
array([[[0, 0],
        [1, 1],
        [2, 2]],

       [[0, 1],
        [0, 1],
        [0, 1]]])
In [72]: np.indices(ourarray.shape).reshape(2,6)
Out[72]: 
array([[0, 0, 1, 1, 2, 2],
       [0, 1, 0, 1, 0, 1]])
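The reshape can be written so that it does not hard-code the shape. The sketch below wraps it in a hypothetical helper, `all_indices` (not a NumPy function), and compares it with two other routes that only need the shape, `np.unravel_index` and `np.ndindex`:

```python
import numpy as np

def all_indices(shape):
    """Return an (ndim, size) array listing every index of an array with this shape."""
    return np.indices(shape).reshape(len(shape), -1)

shape = (3, 2)
from_indices = all_indices(shape)

# np.unravel_index converts flat positions 0..size-1 into per-axis indices
from_unravel = np.array(np.unravel_index(np.arange(np.prod(shape)), shape))

# np.ndindex iterates over the index tuples in Python; slower, but explicit
from_ndindex = np.array(list(np.ndindex(*shape))).T

print(from_indices)
# [[0 0 1 1 2 2]
#  [0 1 0 1 0 1]]
```

None of these reads the array's values, so they behave the same regardless of what the array contains.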

Timings

If ourarray>=0 works, it is faster than np.ones:

In [79]: timeit np.ones_like(ourarray)
6.22 µs ± 11.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [80]: timeit ourarray>=0
1.43 µs ± 15 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

np.where/nonzero adds non-trivial time to that:

In [81]: timeit np.nonzero(ourarray>=0)
6.43 µs ± 8.15 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

and a bit more time to convert the tuple to an array:

In [82]: timeit np.array(np.nonzero(ourarray>=0))
10.4 µs ± 35.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The transpose round trip of argwhere adds more time:

In [83]: timeit np.argwhere(ourarray>=0).T
16.9 µs ± 35.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

np.indices is about the same as [82], though it may scale differently.

In [84]: timeit np.indices(ourarray.shape).reshape(2,-1)
10.9 µs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
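The timings above are for a 3x2 array, where fixed call overhead dominates. To see how the two approaches scale, something like the following could be run on a larger array. The size `(1000, 1000)` and the function names are assumptions made for this sketch, and no results are claimed here since they depend on the machine and NumPy version:

```python
import timeit
import numpy as np

big = np.zeros((1000, 1000))  # assumed size, chosen only for illustration

def via_nonzero(a):
    # Build an all-True mask, then collect the positions of its True entries
    return np.array(np.nonzero(np.ones(a.shape, dtype=bool)))

def via_indices(a):
    # Generate the index grids directly from the shape and flatten them
    return np.indices(a.shape).reshape(a.ndim, -1)

# Check that both approaches produce the same (ndim, size) array
same = np.array_equal(via_nonzero(big), via_indices(big))
print("same result:", same)

for fn in (via_nonzero, via_indices):
    seconds = timeit.timeit(lambda: fn(big), number=5) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```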