too many indices for array when using np.where

Question

I have the code:

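import numpy as np

a = b = np.arange(9).reshape(3, 3)
c = np.zeros(3)

# Average the elements of b where a is below each threshold (3, 4, 5)
for x in range(3):
    c[x] = np.average(b[np.where(a < x + 3)])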

The output of c is

>>>array([ 1. , 1.5, 2. ])

Instead of the for loop, I want to vectorize this with arrays, so I wrote the following code:

a=b=np.arange(9).reshape(3,3)
c=np.zeros(3)
i=np.arange(3)
c[i]=np.average(b[np.where(a<i[:,None,None]+3)])

But it raises IndexError: too many indices for array.

The mask a<i[:,None,None]+3 on its own evaluates correctly:

array([[[ True,  True,  True],
        [False, False, False],
        [False, False, False]],

       [[ True,  True,  True],
        [ True, False, False],
        [False, False, False]],

       [[ True,  True,  True],
        [ True,  True, False],
        [False, False, False]]], dtype=bool)

But b[np.where(a<i[:,None,None]+3)] again raises IndexError: too many indices for array, so I cannot get the correct output for c.

Divakar · Answer 1 · 2017-10-13T18:37:05.540


It looks like you are trying to vectorize the loop. I don't think you can index like that in a vectorized manner: np.where on the 3D mask returns three index arrays, while b has only two dimensions, hence the IndexError. To solve your question in a vectorized manner, I would suggest a more efficient way to get the sum-reduction, namely matrix multiplication using np.tensordot, with help from broadcasting as you had already set up in your attempts.

Thus, one solution would be -

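from __future__ import division  # only needed on Python 2

import numpy as np

a = b = np.arange(9).reshape(3, 3)
i = np.arange(3)
mask = a < i[:, None, None] + 3  # shape (3, 3, 3): one 2D mask per threshold
# Masked sum per threshold via tensordot, divided by the count of True values
c = np.tensordot(b, mask, axes=((0, 1), (1, 2))) / mask.sum((1, 2))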
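As a quick sanity check, here is a minimal sketch that compares the tensordot result against the original for loop, using the same a, b and i as in the question. The names c_loop, c_vec, sums, counts and n_index_arrays are only illustrative and do not appear in the original post.

```python
import numpy as np

a = b = np.arange(9).reshape(3, 3)
i = np.arange(3)

# Reference result from the original for loop.
c_loop = np.zeros(3)
for x in range(3):
    c_loop[x] = np.average(b[np.where(a < x + 3)])

# One 2D boolean mask per threshold, stacked into shape (3, 3, 3).
mask = a < i[:, None, None] + 3

# np.where on the 3D mask yields three index arrays, one more than b has axes,
# which is why b[np.where(mask)] raises IndexError.
n_index_arrays = len(np.where(mask))

# Contract both axes of b against the last two axes of mask:
# this gives one masked sum per threshold.
sums = np.tensordot(b, mask, axes=((0, 1), (1, 2)))
counts = mask.sum(axis=(1, 2))
c_vec = sums / counts

print(n_index_arrays, b.ndim)
print(np.allclose(c_vec, c_loop))
```

The division works elementwise because both sums and counts have shape (3,), one entry per threshold.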

Related post to understand tensordot.

Possible performance improvements

Convert the mask to float dtype before feeding it to np.tensordot, as BLAS-based matrix multiplication is faster with floats.
Use np.count_nonzero instead of np.sum for counting booleans, i.e. use np.count_nonzero(mask, axis=(1,2)) to replace the mask.sum((1,2)) part.
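The two suggestions above could be combined roughly as follows. This is only a sketch: converting b to float as well as the mask is an assumption on my part, the names sums, counts and c_fast are illustrative, and the axis argument of np.count_nonzero needs NumPy 1.12 or later. Whether it is actually faster depends on array size, so time it on your own data.

```python
import numpy as np

a = b = np.arange(9).reshape(3, 3)
i = np.arange(3)
mask = a < i[:, None, None] + 3

# Float operands, so the underlying matrix multiplication can go through BLAS.
# Casting b too is an assumption; the suggestion only mentions the mask.
sums = np.tensordot(b.astype(float), mask.astype(float), axes=((0, 1), (1, 2)))

# Count the True entries of each 2D slice instead of summing booleans.
counts = np.count_nonzero(mask, axis=(1, 2))

c_fast = sums / counts
print(c_fast)
```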

edited Oct 13 '17 at 18:37

answered Oct 13 '17 at 18:26

Divakar


The sum or average is just an example; actually I just want to get ' b[np.where(a – kinder chen Oct 13 '17 at 19:51
@kinderchan If I understand correctly, you can use : np.broadcast_to(b, mask.shape)[mask]? – Divakar Oct 13 '17 at 20:08
I tried it and it doesn't work. I also tried b[mask,i[:,None]], which failed too. – kinder chen Oct 13 '17 at 22:27
@kinderchan You need to explain your "doesn't work" part better. I would suggest editing your question and telling us what you expect to get with something like : `b[np.where(a – Divakar Oct 13 '17 at 22:35
I re-edited the question; the problem is `np.where(a – kinder chen Oct 16 '17 at 22:42
@kinderchan Let me repeat - Please edit the question and tell us the **output that you expect with : `b[np.where(a – Divakar Oct 17 '17 at 04:48
I changed my question again. As for your way, I checked `np.sum(b*mask,(1,2))/mask.sum((1,2))` is faster than `np.tensordot(b,mask,axes=((0,1),(1,2)))/mask.sum((1,2))`, and converting bool to float also makes the algorithm faster. I really appreciate your help. – kinder chen Oct 18 '17 at 01:31
@kinderchan Nah, you can't get your expected output with `np.where`. As for `tensordot` being slower, that's not possible, not at least with decent sized arrays or you are timing it wrongly. – Divakar Oct 18 '17 at 03:41
