How do I handle weird indexing behavior with referencing coordinates of numpy array?

Question

As part of a larger project, I am generating coordinate lists of varying sizes, and I found some odd behavior when using these coordinate lists as array indices. The lists are generated by the program, so I do not know in advance how long they will be. See below for an example:

t = np.zeros((5,5))
coord = [[2,3], [1,2]]
t[coord] = 30
print(t)

Output:

[[ 0.  0.  0.  0.  0.]
[ 0.  0.  0.  0.  0.]
[ 0. 30.  0.  0.  0.]
[ 0.  0. 30.  0.  0.]
[ 0.  0.  0.  0.  0.]]

But then if the list has only one point:

t = np.zeros((5,5))
coord = [[2,3]]
t[coord] = 30
print(t)

Output:

[[ 0.  0.  0.  0.  0.]
[ 0.  0.  0.  0.  0.]
[30. 30. 30. 30. 30.]
[30. 30. 30. 30. 30.]
[ 0.  0.  0.  0.  0.]]

Then if I convert the list to a numpy array, it breaks down even further:

t = np.zeros((5,5))
coord = np.array([[2,3], [1,2]])
t[coord] = 30
print(t)

Output:

[[ 0.  0.  0.  0.  0.]
[30. 30. 30. 30. 30.]
[30. 30. 30. 30. 30.]
[30. 30. 30. 30. 30.]
[ 0.  0.  0.  0.  0.]]

How do I handle this so I always get the first output even if there is only one element and it is a numpy array?

Thanks!

EDIT:

What currently happens in my code is that a program returns a numpy array of points:

array([[ 9,  5,  0],
       [ 4,  2,  2],
       [11,  4,  2],
       [ 5,  7,  2],
       [11, 12,  2],
       [12,  9,  0],
       [ 5,  4,  7],
       [ 3,  2,  1],
       ...

Then I want to use this array to set those coordinate points in the larger 14 * 14 * 9 matrix, e.g. `big_matrix[coord] = 0`.

EDIT2: based on comment from @hpaulj

Here is an example of the full scale issue:

coord = np.array([[ 4,  7,  0],
       [ 9,  6,  1],
       [ 8,  2,  0],
       [ 8,  7,  6],
       [ 3, 10,  4],
       [ 6,  4,  3],
       [10, 10,  3],
       [ 3,  2,  1]], dtype='int32')
matrix[coord]

returns:

array([[[[0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         ...,
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.],
         [0., 0., 0., ..., 0., 0., 0.]],

You may want to consider using [Numpy slice notation](https://docs.scipy.org/doc/numpy/reference/generated/numpy.s_.html) which should give you what you're looking for. — not link, Apr 08 '19 at 20:47
Possible duplicate of [Indexing a numpy array with a list of tuples](https://stackoverflow.com/questions/28491230/indexing-a-numpy-array-with-a-list-of-tuples) — Mark, Apr 08 '19 at 20:48
I tried to explain what is going on in your various cases, but it occurred to me that I'm not sure what you want. You may need to demonstrate the desired result, whether it means assigning values to specific points, or specific rows. — hpaulj, Apr 08 '19 at 22:19
Your examples were 2x2, which in a 2d array could be interpreted either way. The new example with a (8,3) array for a 3d case only works one way, as a set of 8 points. — hpaulj, Apr 08 '19 at 23:25
The above case is a simplified version. In the full scale project I saw the "selecting row" behavior when the list was of length 1. To prevent this behavior, trying to run `tuple(coord)` fixes the single point case but then performs the "selecting row" behavior on the multi point case. `numpy` is not interpreting it as a set of points. — Collin Cunningham, Apr 08 '19 at 23:34

hpaulj · Accepted Answer · 2019-04-08T23:36:26.917

Indexed assignment can obscure some details, which I think are clearer with the getitem equivalent.

In [88]: arr = np.arange(25).reshape(5,5)                                       
In [89]: arr                                                                    
Out[89]: 
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24]])


In [90]: coord = [[2,3],[1,2]]                                                  
In [91]: arr[coord]                                                             
FutureWarning: Using a non-tuple sequence for multidimensional indexing 
is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the 
future this will be interpreted as an array index, `arr[np.array(seq)]`, 
which will result either in an error or a different result.

Out[91]: array([11, 17])

Correct indexing for a pair of points, applying [2,3] for 1st axis, [1,2] to 2nd:

In [92]: coord = ([2,3],[1,2])                                                  
In [93]: arr[coord]                                                             
Out[93]: array([11, 17])
In [94]: arr[[2,3], [1,2]]                                                      
Out[94]: array([11, 17])

Historically numpy was a bit sloppy, and interpreted a list of lists like a tuple of lists (under certain circumstances). Newer versions are trying to remove this inconsistency.

In [95]: coord = [[2,3]]                                                        
In [96]: arr[coord]                                                             
FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.

Out[96]: 
array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])

In [97]: coord = ([2,3],)          # clearer - pick 2 rows, e.g. arr[[2,3],:]                                              
In [98]: arr[coord]                                                             
Out[98]: 
array([[10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])
In [99]: arr[2,3]                 # pick one point
Out[99]: 13
In [100]: coord = (2,3)                                                         
In [101]: arr[coord]                                                            
Out[101]: 13

With an array, there is none of this confusion between lists and tuples:

In [102]: coord = np.array([[2,3], [1,2]])                                      
In [103]: arr[coord]                                                            
Out[103]: 
array([[[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]],

       [[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]])

This picks a (2,2) block of rows. Your t[coord]=30 obscured this pattern, since the row selection contained duplicates (row 2 appears twice) and assignment is buffered. (For unbuffered assignment, test np.add.at(t,coord,30).)
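To make the buffered versus unbuffered difference visible, here is a minimal sketch using the question's 5x5 array of zeros. It adds 30 instead of assigning it, so the duplicated row 2 shows up; the names `buffered` and `unbuffered` are only for illustration.

```python
import numpy as np

coord = np.array([[2, 3], [1, 2]])   # used as an array index: selects rows 2, 3, 1, 2

# Buffered: the selection is copied, incremented once, then written back,
# so the duplicated row 2 is only incremented a single time.
buffered = np.zeros((5, 5))
buffered[coord] += 30

# Unbuffered: np.add.at applies the addition once per index occurrence,
# so row 2 (selected twice) accumulates 60.
unbuffered = np.zeros((5, 5))
np.add.at(unbuffered, coord, 30)

print(buffered[:, 0])     # [ 0. 30. 30. 30.  0.]
print(unbuffered[:, 0])   # [ 0. 30. 60. 30.  0.]
```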

If we explicitly tell it that coord applies to the 1st dimension, we get the same array style of indexing:

In [111]: coord = [[2,3],[1,2]]                                                 
In [112]: arr[coord,:]                                                          
Out[112]: 
array([[[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]],

       [[ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14]]])

Note the difference in shape if I use this last [coord,] with the 1 element list:

In [117]: coord = [[2,3]]                                                       
In [118]: arr[coord,]                                                           
Out[118]: 
array([[[10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19]]])
In [119]: _.shape                                                               
Out[119]: (1, 2, 5)

So make coord a tuple rather than a list, if you want each element to apply to a different dimension. Or use an array if you want it applied to just one dimension, or be explicit with the [coord,:] like notation.
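The three options can be put side by side in a short sketch. The variable names (`pts`, `rows`, and so on) are illustrative only, and `arr` is the same 5x5 `arange` array used above.

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)

pts = ([2, 3], [1, 2])               # tuple: one index list per dimension
rows = np.array([[2, 3], [1, 2]])    # array: applies to the first dimension only

by_axis = arr[pts]         # elements (2,1) and (3,2)
by_rows = arr[rows]        # a (2,2) block of whole rows
explicit = arr[rows, :]    # same selection, with the row intent spelled out

print(by_axis)             # [11 17]
print(by_rows.shape)       # (2, 2, 5)
print(np.array_equal(by_rows, explicit))   # True
```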

If you take this array, transpose it, and split that into a tuple, you get index arrays for the 2 dimensions:

In [120]: coord = np.array([[2,3],[1,2]])                                       
In [121]: coord                                                                 
Out[121]: 
array([[2, 3],
       [1, 2]])
In [123]: tuple(coord.T)                                                        
Out[123]: (array([2, 1]), array([3, 2]))
In [124]: arr[tuple(coord.T)]                                                   
Out[124]: array([13,  7])

and with 4 points:

In [125]: coord = np.array([[2,3],[1,2],[0,0],[3,4]])                           
In [126]: arr[tuple(coord.T)]                                                   
Out[126]: array([13,  7,  0, 19])
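Applied to the assignment in the question, the transpose-and-tuple approach might look like the sketch below. `set_points` is a hypothetical helper name, not a NumPy function, and it assumes each row of `coord` is one point (as in the (n, 3) array from the question's edit), which is not the same interpretation as the question's first output. `np.atleast_2d` is there so a single bare point such as `[2, 3]` is handled the same way as a list of points.

```python
import numpy as np

def set_points(target, coord, value):
    """Hypothetical helper: assign `value` at each point listed in `coord`.

    Assumes `coord` holds one point per row, shape (n_points, target.ndim).
    """
    coord = np.atleast_2d(np.asarray(coord))   # (n_points, ndim), even for one point
    target[tuple(coord.T)] = value             # one index array per dimension
    return target

# Two points: (2,3) and (1,2)
t = set_points(np.zeros((5, 5)), [[2, 3], [1, 2]], 30)

# A single point no longer selects whole rows
single = set_points(np.zeros((5, 5)), [[2, 3]], 30)

# 3d case, using the first three points from the question's edit
big = np.ones((14, 14, 9))
pts3d = np.array([[9, 5, 0], [4, 2, 2], [11, 4, 2]])
set_points(big, pts3d, 0)

print(t[2, 3], t[1, 2], t.sum())   # 30.0 30.0 60.0
print(single.sum())                # 30.0
print(big[9, 5, 0], big.sum())     # 0.0 1761.0
```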

I don't know if this will help or not, but np.where is often used to select points in an array:

The condition - multiples of 4:

In [135]: arr%4==0                                                              
Out[135]: 
array([[ True, False, False, False,  True],
       [False, False, False,  True, False],
       [False, False,  True, False, False],
       [False,  True, False, False, False],
       [ True, False, False, False,  True]])

Indices of these points - a tuple with an array for each dimension. That can be used directly as an index:

In [136]: np.where(arr%4==0)                                                    
Out[136]: (array([0, 0, 1, 2, 3, 4, 4]), array([0, 4, 3, 2, 1, 0, 4]))
In [137]: arr[_]                                                                
Out[137]: array([ 0,  4,  8, 12, 16, 20, 24])

argwhere applies np.transpose to that tuple, making a (n,2) array:

In [138]: np.argwhere(arr%4==0)                                                 
Out[138]: 
array([[0, 0],
       [0, 4],
       [1, 3],
       [2, 2],
       [3, 1],
       [4, 0],
       [4, 4]])

Those are the coordinates of the individual elements, but they can't be used directly as indices, except iteratively:

In [144]: [arr[i,j] for i,j in np.argwhere(arr%4==0)]                           
Out[144]: [0, 4, 8, 12, 16, 20, 24]

I think you are generating coordinates in this argwhere style, but you really need them in the where style - as a tuple of arrays.
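A small check of that relationship, as a sketch: transposing the argwhere-style array and splitting it into a tuple should reproduce what np.where returns for the same condition. The variable names are illustrative.

```python
import numpy as np

arr = np.arange(25).reshape(5, 5)
mask = arr % 4 == 0

as_rows = np.argwhere(mask)    # (n, 2) array, one point per row
as_tuple = np.where(mask)      # tuple of 2 arrays, one per dimension

# Convert the argwhere style into the where style
converted = tuple(as_rows.T)

same = all(np.array_equal(a, b) for a, b in zip(converted, as_tuple))
values = arr[converted]

print(same)      # True
print(values)    # [ 0  4  8 12 16 20 24]
```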

Thank you so much for this in depth answer. Based on your answer I have been playing around and I am a bit confused what is happening when, I do the following: `coord = tuple(np.array([[11, 6], [ 8, 5]]))`. Is it buffering row selections because the tuple has numpy array elements? — Collin Cunningham, Apr 08 '19 at 22:27
What exactly are you trying to do with that last case? Assign values to rows 11,6,8 and 5, or to points (11,6) and (8,5), or to points (11,8) and (6,5)? Or something else? — hpaulj, Apr 08 '19 at 22:35
I want to assign values to the individual points at these "x,y" coordinates. — Collin Cunningham, Apr 08 '19 at 23:06
