How to get starting and ending indices of sub-arrays of consecutive integers from an array in Python?

Question

I'm new to programming. I have a numpy array like this (the first column shows the indices):

rows = np.array([5,6,7,8,14,15,16,31])

0 5
1 6
2 7
3 8
4 14
5 15
6 16
7 31

I need to get the starting and ending indices of the sub-arrays of consecutive integers, such as 0 and 3, 4 and 6, etc. I tried to do it like this:

start = np.array([])
end = np.array([])
c = 0
while c < len(rows):
   for i in range(c, len(rows)):
      if rows[i]-rows[i+1] > 1:
        np.append(start, c)
        np.append(end, i)
        c = i+1

It doesn't work. Any suggestions?

You can do this nicely with `itertools` and the third-party package `more_itertools`. See [Detecting consecutive integers in a list](https://stackoverflow.com/questions/2361945/detecting-consecutive-integers-in-a-list), [Identify groups of continuous numbers in a list](https://stackoverflow.com/questions/2154249/identify-groups-of-continuous-numbers-in-a-list). It's a trivial tweak to return the indices instead of the values. You also probably want to filter the output to only return sequences of length >= 2 — smci, Feb 16 '19 at 22:00
Also there is the useful `np.diff(rows)` which in your case gives you `array([1,1,1,6,1,1,15])`, so you can do `np.diff(rows) == 1`, then feed that into an iterator or while-loop. — smci, Feb 16 '19 at 22:04
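
The `np.diff` idea from the comment above can be taken one step further without any explicit loop. The following is only a sketch, assuming `rows` is a non-empty one-dimensional integer array; the names `breaks`, `start` and `end` are chosen here for illustration.

```python
import numpy as np

rows = np.array([5, 6, 7, 8, 14, 15, 16, 31])

# Positions i where rows[i + 1] - rows[i] != 1, i.e. a run ends at index i.
breaks = np.flatnonzero(np.diff(rows) != 1)

# A new run starts right after each break; the first run starts at index 0.
start = np.concatenate(([0], breaks + 1))
# Each break closes a run; the last run ends at the final index.
end = np.concatenate((breaks, [len(rows) - 1]))

print(start)  # [0 4 7]
print(end)    # [3 6 7]
```

For an empty `rows` this sketch would return a start of 0 and an end of -1, so an empty input would need a separate check.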

score 1 · Accepted Answer · answered Feb 16 '19 at 22:43

Here's a one-line solution using itertools:

import itertools
list(itertools.filterfalse(lambda i: (i > 0) and (rows[i] - rows[i-1] == 1), range(len(rows))))

[0, 4, 7]

How does this work?

- We apply itertools.filterfalse() over the sequence of indices range(len(rows)), i.e. 0..(len(rows)-1).
- filterfalse() gives the values where our chosen predicate function is false, i.e. we want to see the indices where values are not consecutive. Hence we give it the function lambda i: (rows[i]-rows[i-1] == 1).
- We just need to tweak that so that it also evaluates to False at (i==0), hence we add the gating term: (i>0) and ...
- Finally, we wrap all this in list(...) to convert the iterator back into a list.
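
The steps listed above may be easier to follow with the lambda unpacked into a named function. This is a sketch of the same approach, not a different algorithm; the names `continues_run` and `run_starts` are made up for illustration.

```python
import itertools

import numpy as np

rows = np.array([5, 6, 7, 8, 14, 15, 16, 31])


def continues_run(i):
    """Return True when rows[i] directly follows rows[i - 1]."""
    # Index 0 has no predecessor, so it must count as a run start.
    return i > 0 and rows[i] - rows[i - 1] == 1


# filterfalse keeps the indices for which the predicate is False,
# which are exactly the indices where a new run begins.
run_starts = list(itertools.filterfalse(continues_run, range(len(rows))))
print(run_starts)  # [0, 4, 7]
```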

score 0 · Answer 2 · answered Feb 16 '19 at 20:20

To get first element of an array use: a[0], to get last element a[-1], where a is an array.

Krzysztof Papciak

but I don't need to get only the first and last element – Rehim Alizadeh Feb 16 '19 at 20:26

Rory Daulton · Answer 3 · 2019-02-16T20:46:02.513

Here is one way. Note that I used Python lists to accumulate the desired indices and converted them to numpy arrays only at the end. I did this because numpy arrays are not designed to have new members added; they work best with fixed sizes. There are more Pythonic ways to do this, but I tried to stay within the knowledge that you showed in your question. One strange result of this code is that if rows is an empty array, start becomes array([0]) and end becomes array([-1]). My code works as expected for non-empty arrays.

import numpy as np

rows = np.array([5, 6, 7, 8, 14, 15, 16, 31])

startlist = [0]
endlist = []
for ndx in range(1, len(rows)):
    if rows[ndx] != rows[ndx - 1] + 1:
        startlist.append(ndx)
        endlist.append(ndx - 1)
endlist.append(len(rows) - 1)
start = np.array(startlist)
end = np.array(endlist)

The result of that is

start
Out[10]: array([0, 4, 7])

end
Out[11]: array([3, 6, 7])
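
If the empty-array case matters, the same loop could be wrapped in a function with an early return. This is a sketch under the assumption that an empty input should give two empty index arrays; the function name `run_bounds` is hypothetical and not part of the answer above.

```python
import numpy as np


def run_bounds(rows):
    """Return (start, end) index arrays for runs of consecutive integers."""
    rows = np.asarray(rows)
    if rows.size == 0:
        # Assumption: no elements means no runs, so return empty arrays.
        empty = np.array([], dtype=int)
        return empty, empty.copy()
    startlist = [0]
    endlist = []
    for ndx in range(1, len(rows)):
        # A gap between neighbours closes one run and opens the next.
        if rows[ndx] != rows[ndx - 1] + 1:
            startlist.append(ndx)
            endlist.append(ndx - 1)
    endlist.append(len(rows) - 1)
    return np.array(startlist), np.array(endlist)


start, end = run_bounds([5, 6, 7, 8, 14, 15, 16, 31])
print(start)  # [0 4 7]
print(end)    # [3 6 7]
```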

@smci: Your solution is indeed much more compact than mine, but it uses multiple ideas that the OP may not know. Your solution also does not create the `end` array that the OP wants. (Yes, I know that it can be created from the `start` array--your solution would be improved if you show how.) — Rory Daulton, Feb 16 '19 at 23:44
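
For reference, one possible way to build the end indices from the start indices produced by the itertools one-liner is sketched below. It assumes `rows` is non-empty, and relies on the fact that each run ends one position before the next run begins.

```python
import itertools

import numpy as np

rows = np.array([5, 6, 7, 8, 14, 15, 16, 31])

start = list(itertools.filterfalse(
    lambda i: i > 0 and rows[i] - rows[i - 1] == 1,
    range(len(rows)),
))

# Every run ends one position before the next run starts;
# the last run ends at the final index of rows.
end = [s - 1 for s in start[1:]] + [len(rows) - 1]

print(start)  # [0, 4, 7]
print(end)    # [3, 6, 7]
```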
