Find gaps in list of range values

Question

I found numerous similar questions for other programming languages (Ruby, C++, JS, etc.) but not for Python. Since Python has e.g. itertools, I wonder whether we can do the same more elegantly in Python.

Let's say we have a "complete range", [1,100] and then a subset of ranges within/matching the "complete range":

[10,50]
[90,100]

How can we extract the uncovered positions, in this case [1,9], [51,89]?

This is a toy example; in my real dataset I have ranges up to thousands.

@deadshot I would do a for loop, iterating over `1:max`, but that's totally infeasible for larger ranges – CodeNoob, Sep 09 '20 at 14:42
@deadshot I'm also thinking about `itertools.chain.from_iterable` but I'm not familiar with that implementation at all – CodeNoob, Sep 09 '20 at 14:49
Do the ranges include overlapping ranges, like `[10, 50], [30, 70], [90, 100]`? – deadshot, Sep 09 '20 at 14:58
`[[1, a[0][0] - 1]] + [[a[x - 1][1] + 1, a[x][0] - 1] for x in range(1, len(a))]` – deadshot, Sep 09 '20 at 15:06
@deadshot Thanks for contributing to duplicate questions. Thanks for giving the opportunity to a bored high rep user, instead of learning python, to be spoonfed the answer, with no rewards at all to the people who spoonfeed him, such as yourself. – solid.py, Sep 10 '20 at 08:07

score 6 · Accepted Answer · answered Sep 09 '20 at 15:13

Here is a neat solution using itertools.chain. I've assumed the input ranges don't overlap; if they do, they need to be simplified first using a union-of-ranges algorithm.

from itertools import chain

def range_gaps(a, b, ranges):
    ranges = sorted(ranges)
    flat = chain((a-1,), chain.from_iterable(ranges), (b+1,))
    return [[x+1, y-1] for x, y in zip(flat, flat) if x+1 < y]

Taking range_gaps(1, 100, [[10, 50], [90, 100]]) as an example:

First sort the ranges in case they aren't already in order. If they are guaranteed to be in order, this step is not needed.
Then flat is an iterable which will give the sequence 0, 10, 50, 90, 100, 101.
Since flat is lazily evaluated and is consumed by iterating over it, zip(flat, flat) gives a sequence of pairs like (0, 10), (50, 90), (100, 101).
The ranges required are then like (1, 9), (51, 89) and the case of (100, 101) should give an empty range so it is discarded.
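
The non-overlap assumption stated at the top of this answer could be lifted by merging the input first. Below is a minimal sketch of such a union-of-ranges step, assuming inclusive integer ranges given as [start, end] pairs. The name `merge_ranges` is a hypothetical helper and not part of the original answer.

```python
def merge_ranges(ranges):
    """Merge overlapping or adjacent inclusive [start, end] ranges.

    Hypothetical helper, not part of the original answer.
    """
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous range: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Starts after a real gap: open a new range.
            merged.append([start, end])
    return merged


print(merge_ranges([[10, 50], [30, 70], [90, 100]]))
# [[10, 70], [90, 100]]
```

The merged list can then be passed to range_gaps in place of the raw input, e.g. range_gaps(1, 100, merge_ranges(ranges)).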

Liju · Answer 2 · 2020-09-09T16:41:00.650

Assuming the list contains only integers, and the sub-ranges are in increasing order and do not overlap, you can use the code below.

This code takes the sub-ranges one by one and compares each with the complete range and with the sub-range before it to find the missing ranges.

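start, end = 1, 100
chunks = [[7, 15], [25, 31], [74, 83]]  # must be sorted and non-overlapping

# Candidate gaps: before the first chunk, between neighbours, after the last chunk.
gaps = [[start, chunks[0][0] - 1]]
gaps += [[chunks[i - 1][1] + 1, chunks[i][0] - 1] for i in range(1, len(chunks))]
gaps += [[chunks[-1][1] + 1, end]]
print([g for g in gaps if g[0] <= g[1]])  # drop empty gaps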

Input

[1,100]
[[7,15],[25,31],[74,83]]

Output

[[1, 6], [16, 24], [32, 73], [84, 100]]

If the sub-ranges are not guaranteed to be in increasing order, you can add the line below to sort chunks first.

chunks.sort(key=lambda x: x[0])
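
The same idea (walk the sub-ranges in order and compare each one with what has been covered so far) can also be written as an explicit loop. This is only a sketch, assuming inclusive integer ranges; `iter_gaps` and `cursor` are hypothetical names. Unlike the code above, it also tolerates unsorted or overlapping sub-ranges.

```python
def iter_gaps(start, end, chunks):
    """Yield uncovered inclusive [lo, hi] gaps within [start, end].

    Hypothetical helper; accepts unsorted and overlapping chunks.
    """
    cursor = start  # first position not yet known to be covered
    for lo, hi in sorted(chunks):
        if lo > cursor:
            # Everything from cursor up to just before this chunk is a gap.
            yield [cursor, min(lo - 1, end)]
        cursor = max(cursor, hi + 1)
        if cursor > end:
            return  # the rest of the complete range is covered
    if cursor <= end:
        # Tail of the complete range after the last chunk.
        yield [cursor, end]


print(list(iter_gaps(1, 100, [[25, 31], [7, 15], [74, 83]])))
# [[1, 6], [16, 24], [32, 73], [84, 100]]
```

Because it is a generator, the gaps are produced one at a time and only the sorted list of sub-ranges is held in memory.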

IoaTzimas · Answer 3 · 2020-09-09T15:08:40.383

This is a generic solution:

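def gap(N, ranges):
    # ranges is a list of inclusive (min, max) pairs:
    # [(min1, max1), (min2, max2), ..., (minn, maxn)]

    # Every position in the complete range 1..N
    original = set(range(1, N + 1))

    for lo, hi in ranges:
        # Remove the positions covered by this sub-range (both ends included)
        original -= set(range(lo, hi + 1))

    return original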
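
The function above returns a set of individual positions rather than ranges. One possible way to collapse such a set back into inclusive [start, end] pairs is sketched below with itertools.groupby; `positions_to_ranges` is a hypothetical helper, not part of the original answer. Note that this still materializes every position, so the memory concern for very large ranges remains.

```python
from itertools import groupby


def positions_to_ranges(positions):
    """Collapse integers into inclusive [start, end] runs of consecutive values.

    Hypothetical helper, not part of the original answer.
    """
    result = []
    # Within a run of consecutive integers, value - index stays constant,
    # so it can serve as the grouping key.
    indexed = enumerate(sorted(positions))
    for _, run in groupby(indexed, key=lambda pair: pair[1] - pair[0]):
        values = [value for _, value in run]
        result.append([values[0], values[-1]])
    return result


# Positions 1..100 minus the inclusive ranges [10, 50] and [90, 100].
uncovered = set(range(1, 101)) - set(range(10, 51)) - set(range(90, 101))
print(positions_to_ranges(uncovered))
# [[1, 9], [51, 89]]
```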

edited Sep 09 '20 at 15:08

answered Sep 09 '20 at 14:44

How does this answer the OP's question? What if you have 100 ranges? – deadshot Sep 09 '20 at 14:44
you just subtract them too – IoaTzimas Sep 09 '20 at 14:47
I doubt this would be feasible for large ranges. What if I encounter `1:10000000`? This would require generating huge lists – CodeNoob Sep 09 '20 at 14:47
The lengths don't matter. You can work with all lengths and all combinations of subranges with this logic, adding some comprehensions if needed – IoaTzimas Sep 09 '20 at 14:49
@archer Subtracting 100 ranges like this is really like writing a book – deadshot Sep 09 '20 at 14:49
I don't understand your problem with the solution. – IoaTzimas Sep 09 '20 at 14:51
Your solution should be generic. As written, it needs to change whenever we add or remove a single range – deadshot Sep 09 '20 at 14:53
@archer This is a toy example. What if I have 1000 ranges? Then I have to write -set(x)-set(x)-set(x)-set(x)..... 1000 times – CodeNoob Sep 09 '20 at 14:53
@archer I keep it simple for illustration purposes – CodeNoob Sep 09 '20 at 15:02
Can you explain what exactly you don't like about the solution I provided? I guess the ranges must come in some format (list of tuples, etc.). My solution works with any kind of ranges and Ns, so why all this negativity, even a downvote??? – IoaTzimas Sep 09 '20 at 15:07
