From an array of indices how to create Python ranges?

Question

With numpy's where() function I'm getting the indices of an array by applying a mask. For example, an index list, where (1, 2, 3, 10, 11, 12, 13) where True:

[1, 2, 3, 10, 11, 12, 13]

Is there a way to transform this (or directly get it from a similar function) in separate ranges, for my example:

[ range(1, 4), range(10, 14)  ]

I am assuming you are not interested in simply iterating it? — Serial Lazer, Nov 10 '20 at 11:16
Does this answer your question? [Identify groups of continuous numbers in a list](https://stackoverflow.com/questions/2154249/identify-groups-of-continuous-numbers-in-a-list) — Flavio Moraes, Nov 10 '20 at 11:19
@FlavioMoraes It's not quite the same because it doesn't discuss a variant of identical problem on `numpy`. — mathfux, Nov 10 '20 at 11:46
I believe any algo will encounter ambiguity. Eg, with `[1, 3, 15, 5, 6]`, one possible answer could be `[range(1,4,2), range(15,16,1), range(5,7,1)]`, and another possible answer could be `[range(1,4,2), range(15,4,-10), range(6,7,1)]`. Are you ok with either of these answers? — fountainhead, Nov 10 '20 at 11:55
With `[1, 3, 15, 20, 21]`, a "greedy" algo will always (unambiguously) produce an answer of `[range(1,4,2), range(15,21,5), and range(21,22,1)]`. After identifying `range(1,4,2)`, when it sees `15`, it would "greedily" treat `15, 20` to be part of one new range (rather than let go of `20` to become a part of the next range). Would such a greedy approach be ok? — fountainhead, Nov 10 '20 at 12:08

score 2 · Accepted Answer · answered Nov 10 '20 at 12:24

2

Assuming all the ranges include unit differences only, it's worth to start with a search of indexes where to split:

split_idx = np.flatnonzero(np.diff(ls, prepend = ls[0], append=ls[-1])!=1)

Note that it is improved slightly in order to insert starting and ending indexes as well. A further work could be done like so:

bounding_points = np.transpose([split_idx[:-1], split_idx[1:]])
out = [range(ls[n], ls[m-1]+1) for n, m in bounding_points.tolist()]

Sample run:

>>> ls = [1, 2, 3, 10, 11, 12, 13, 16, 18, 19]
>>> split_idx
array([ 0,  3,  7,  8, 10], dtype=int32)
>>> bounding_points
array([[ 0,  3],
       [ 3,  7],
       [ 7,  8],
       [ 8, 10]], dtype=int32)
>>> out
[range(1, 4), range(10, 14), range(16, 17), range(18, 20)]

answered Nov 10 '20 at 12:24

mathfux

5,759
1
14
34

I believe you may be able to get the split points also by passing `n=2` to `np.diff()`, and comparing result of `np.diff()` to `0` rather than `1`. That would probably make the solution work even if the range steps are greater than 1. I would try it out myself, but my numpy is old, and doesn't even accept the `prepend` arg !! – fountainhead Nov 10 '20 at 12:39
@fountainhead `prepend` and `append` arguments are actually good for a demonstration purpose only. They are not efficient. You could do it better using `np.r_[item1, arr1, arr2, ..., arrn, item2]`. And you can't pass `n=2` inside [`np.diff`](https://numpy.org/doc/stable/reference/generated/numpy.diff.html) because it has a different meaning (read the docs). – mathfux Nov 10 '20 at 12:53
Ok, I'll check again. I thought passing `n=2` would produce second order differences (differences of differences). (Hence that suggestion) – fountainhead Nov 10 '20 at 12:54
@fountainhead I like extensible functions in answers, but I think extending to sequences with `step > 1` is adding a bit too much complexity based on the question – Daniel F Nov 10 '20 at 12:58
@DanielF -- I believe right now the assumption of step = 1 is based on OP's *example* rather than explicit statement of fact. Anyways, I'll leave this here. Thanks. – fountainhead Nov 10 '20 at 13:01

score 1 · Answer 2 · answered Nov 10 '20 at 12:12

1

reading the link I sent you I got this answer:

from operator import itemgetter
from itertools import groupby

data = [1, 2, 3, 10, 11, 12, 13]
ranges = []
for key, group in groupby(enumerate(data), lambda x: x[0] - x[1]):
    group = list(map(itemgetter(1), group))
    if len(group) > 1:
        ranges.append(range(group[0], group[-1]+1))
    else:
        ranges.append(group[0])

print(ranges)

the output was

[range(1, 4), range(10, 14)]

answered Nov 10 '20 at 12:12

Flavio Moraes

1,331
1
5
16

Will it identify ranges with step > 1 ? – fountainhead Nov 10 '20 at 12:18
no, because it is based on grouping consecutive numbers – Flavio Moraes Nov 10 '20 at 12:39

From an array of indices how to create Python ranges?

2 Answers2