I know this has been answered many times and I went through every SO question on this topic, but none of them seemed to tackle my problem.

This code raises an exception:

import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

sindex = np.array([0, 3, 4])
eindex = np.array([2, 5, 6])

r = a[sindex: eindex]  # raises the TypeError below

TypeError: only integer scalar arrays can be converted to a scalar index

I have one array of start indexes and another of end indexes, and I simply want to extract whatever lies between each pair. Note that the difference between sindex and eindex is constant, for example 2, so eindex is always sindex + 2.

So the expected result should be:

[1, 2, 4, 5, 5, 6]

Is there a way to do this without a for loop?

Melron

3 Answers

For a constant interval difference, we can set up sliding windows and simply index them with the array of starting indices. Thus, we can use broadcasting_app or strided_app from this post -

d = 2  # interval difference

out = broadcasting_app(a, L = d, S = 1)[sindex].ravel()

out = strided_app(a, L = d, S = 1)[sindex].ravel()

Or use scikit-image's built-in view_as_windows -

from skimage.util.shape import view_as_windows

out = view_as_windows(a,d)[sindex].ravel()

To set d, we can use -

d = eindex[0] - sindex[0]
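
The helpers broadcasting_app and strided_app are not defined in this answer, since they come from the linked post. The following is only a minimal sketch of what such helpers might look like, assuming a 1-D input, window length L and step S; the bodies are a reconstruction, not necessarily the code from that post.

```python
import numpy as np

def broadcasting_app(a, L, S):
    """Windows of length L taken every S elements, built by broadcasting (a copy)."""
    nrows = (a.size - L) // S + 1
    # Row i holds the indices S*i, S*i + 1, ..., S*i + L - 1.
    return a[S * np.arange(nrows)[:, None] + np.arange(L)]

def strided_app(a, L, S):
    """The same windows as a strided view; assumes a is 1-D."""
    nrows = (a.size - L) // S + 1
    n = a.strides[0]
    # writeable=False guards against accidental writes through the overlapping view.
    return np.lib.stride_tricks.as_strided(
        a, shape=(nrows, L), strides=(S * n, n), writeable=False
    )

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
sindex = np.array([0, 3, 4])
d = 2  # interval difference

out_b = broadcasting_app(a, L=d, S=1)[sindex].ravel()
out_s = strided_app(a, L=d, S=1)[sindex].ravel()
print(out_b)  # [1 2 4 5 5 6]
print(out_s)  # [1 2 4 5 5 6]
```

Both versions produce one row per possible start position, so indexing the rows with sindex picks out exactly the wanted windows.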
Divakar

You can't tell compiled numpy to take multiple slices directly. The alternatives to joining multiple slices involve some sort of advanced indexing.
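
As a small illustration of that point (a sketch using the arrays from the question): a slice needs scalar bounds, so one pair of scalars works, while passing whole arrays as bounds raises the TypeError.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
sindex = np.array([0, 3, 4])
eindex = np.array([2, 5, 6])

# A single slice with scalar bounds is fine.
first = a[sindex[0]:eindex[0]]
print(first)  # [1 2]

# Array-valued bounds cannot be converted to scalar indices.
try:
    a[sindex:eindex]
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)
```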

In [509]: a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
     ...: sindex = np.array([0, 3, 4])
     ...: eindex = np.array([2, 5, 6])

The most obvious loop:

In [511]: np.hstack([a[i:j] for i,j in zip(sindex, eindex)])                         
Out[511]: array([1, 2, 4, 5, 5, 6])

A variation that uses the loop to construct indices first:

In [516]: a[np.hstack([np.arange(i,j) for i,j in zip(sindex, eindex)])]              
Out[516]: array([1, 2, 4, 5, 5, 6])

Since all the slices have the same size, we can generate a single arange and offset it by each start in sindex via broadcasting:

In [521]: a[np.arange(eindex[0]-sindex[0]) + sindex[:,None]]                           
Out[521]: 
array([[1, 2],
       [4, 5],
       [5, 6]])

and then ravel. This is a more direct expression of @Divakar's broadcasting_app.
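
Spelled out as a standalone script, the broadcasting approach might look like the sketch below. The names d, offsets and idx are introduced here for illustration, and the check that every slice has the same length is an added assumption made explicit.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
sindex = np.array([0, 3, 4])
eindex = np.array([2, 5, 6])

d = eindex[0] - sindex[0]           # common slice length, 2 here
assert np.all(eindex - sindex == d)  # the trick only works for equal-length slices

offsets = np.arange(d)              # [0 1]
idx = sindex[:, None] + offsets     # shape (3, 2): one row of indices per slice
r = a[idx].ravel()
print(r)  # [1 2 4 5 5 6]
```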

With this small example, timings are similar.

In [532]: timeit np.hstack([a[i:j] for i,j in zip(sindex, eindex)])                  
13.4 µs ± 257 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [533]: timeit a[np.hstack([np.arange(i,j) for i,j in zip(sindex, eindex)])]       
21.2 µs ± 362 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [534]: timeit a[np.arange(eindex[0]-sindex[0])+sindex[:,None]].ravel()            
10.1 µs ± 48.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [535]: timeit strided_app(a, L=2, S=1)[sindex].ravel()                            
21.8 µs ± 207 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

strided_app and view_as_windows use striding tricks to view the array as an array of size d windows, and use sindex to select a subset of them.
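
For readers without scikit-image, NumPy itself ships a comparable windowing function, numpy.lib.stride_tricks.sliding_window_view (available from NumPy 1.20). It is not what the answers above used; the sketch below just shows the same idea with it.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
sindex = np.array([0, 3, 4])
d = 2  # window size

# Read-only view of shape (8, 2): row i is a[i:i + d].
windows = sliding_window_view(a, d)

# Pick the windows that start at sindex, then flatten.
r = windows[sindex].ravel()
print(r)  # [1 2 4 5 5 6]
```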

In larger cases, relative timings may vary with the size of the slices versus the number of slices.
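
One way to check this on your own data is a small timing harness such as the sketch below. The array size, the two (number of slices, slice length) cases, and the function names loop_hstack and broadcast_index are arbitrary choices for illustration; no particular outcome is implied.

```python
import timeit
import numpy as np

def loop_hstack(a, sindex, d):
    # Python-level loop over slices, joined at the end.
    return np.hstack([a[i:i + d] for i in sindex])

def broadcast_index(a, sindex, d):
    # One advanced-indexing call with a broadcast index array.
    return a[sindex[:, None] + np.arange(d)].ravel()

rng = np.random.default_rng(0)
a = np.arange(100_000)
results = {}

# Few long slices versus many short slices.
for n_slices, d in [(10, 1000), (1000, 10)]:
    sindex = rng.integers(0, a.size - d, size=n_slices)
    # Both methods must agree before their timings are compared.
    assert np.array_equal(loop_hstack(a, sindex, d), broadcast_index(a, sindex, d))
    for f in (loop_hstack, broadcast_index):
        t = timeit.timeit(lambda: f(a, sindex, d), number=50)
        results[(f.__name__, n_slices, d)] = t
        print(f"{f.__name__:16s} n_slices={n_slices:5d} d={d:5d}: {t:.4f} s for 50 runs")
```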

hpaulj

You can just use sindex. Refer to the following image:

[image not available]

bigbounty
  • Sorry, in reality the difference between sindex and eindex isn't only 1. That was just for simplicity but I have edited the question. – Melron May 05 '19 at 12:55