65

What is the best way in Python to determine what values in two ranges overlap?

For example:

x = range(1,10)
y = range(8,20)

(The answer I am looking for would be the integers 8 and 9.)
        

Given a range, x, what is the best way to iterate through another range, y and output all values that are shared by both ranges?

EDIT:

As a follow-up, I realized that I also need to know if x does or does not overlap y. I am looking for a way to iterate through a list of ranges and and do a number of additional things with range that overlap. Is there a simple True/False statement to accomplish this?

starball
  • 20,030
  • 7
  • 43
  • 238
drbunsen
  • 10,139
  • 21
  • 66
  • 94

11 Answers11

141

If the step is always +1 (which is the default for range) the following should be more efficient than converting each list to a set or iterating over either list:

range(max(x[0], y[0]), min(x[-1], y[-1])+1)
Andrew Clark
  • 202,379
  • 35
  • 273
  • 306
  • 14
    Really, this should be the accepted answer. It is a one line, O(1) intersection, whereas the others are O(n) at best, O(n^2) at worst. – Zoey Hewll May 24 '16 at 07:34
  • Some tests I ran indicate it may be marginally faster to check x[0] <= y[0] <= x[-1] etc etc, sequentially. But hardly significant. This is more elegant. – Neil Jan 08 '19 at 13:05
  • 9
    This solution doesn't give us the correct output when there is no overlap between the two ranges. – Pedram May 09 '19 at 19:47
  • 2
    Sure it does, if there is no overlap the result is an empty range as expected. – Andrew Clark May 13 '19 at 16:45
  • 4
    Why not `range(max(x.start, y.start), min(x.stop, y.stop))`? As a caveat, this solution only works if `x.step == 1 and y.step == 1`. – user2846495 Apr 01 '20 at 16:25
  • 2
    As @Pedram mentioned, this solution will also output a range even if there is no overlap, the range will have start and stop, but the range will in fact be empty. You can check if the range is empty by using `len(range)` -> will be 0 if there is no overlap – Boketto Apr 12 '21 at 13:52
  • What about decimals in the input numbers ?? Range doesn't allow for them. – lb_so Jun 02 '21 at 05:19
  • 1
    Am I going crazy or does this not work as expected? Trying it with `[11,20]` and `[1,10]` should produce no overlap, yet this spits out `range(11,10)`. EDIT: I suppose that's an invalid range, but the output is rather misleading. – turnip Sep 16 '22 at 10:08
82

Try with set intersection:

x = range(1,10)
y = range(8,20)
xs = set(x)
xs.intersection(y)   

Output:

set([8, 9])

Note that intersection accepts any iterable as an argument (y is not required to be converted to a set for the operation). There is an operator equivalent to the intersection method: & but, in this case, it requires both arguments to be sets.

joaquin
  • 82,968
  • 29
  • 138
  • 152
  • 1
    I find that intersection with operator `&` is much more intuitive. – schlamar Jul 25 '11 at 19:31
  • Fantastic, thanks for the help. I think the intersection method will work perfectly. Also, is there a way to just ask Python if any values or x are in y? A simple True/False statement? – drbunsen Jul 25 '11 at 20:12
  • @dr.bunsen look at http://stackoverflow.com/questions/6821329/instersection-or-object-for-object-in-set-if-object-in-other-set . You can also do `bool(x.intersection(y))` (being x, y sets) – joaquin Jul 25 '11 at 20:18
  • @joaquin Thank you very much for the help. I'm going to read the references you suggested and see if I can solve my problem. Thanks again! – drbunsen Jul 25 '11 at 20:30
  • 8
    @dr.bunsen This answer doesn't deserve all these upvotes, nor to be accepted. 1) This "solution" is 2.5 slower than the Andrew's solution (see comparison in my answer) 2) the result isn't sorted, since it is a set; that may be a drawback in some cases – eyquem Jul 25 '11 at 23:44
  • 4
    @eyquem maybe the OP feels more comfortable with the simplicity (you want intersection? then you do `intersection`), readability (no need for comments here) and the generality of this approach (it does not assume step 1 or same step for both ranges and you can use it for other iterables, not only for ranges). Speed is not always the more important factor (although the `itersection` approach is faster than the plaes `&` and much more faster than list comprehension (python 2.6), both still very interesting approaches,imho. Anyway, we are talking of several microseconds to execute the code here). – joaquin Jul 26 '11 at 13:30
  • @joaquin - If he feels more confortable with **intersection**, OK - I admit that **intersection** is more readable - I admit that your solution is more general; but is it necessary to do more than required ? You want step=1, you do step=1 ! - Speed: it depends; I imagine that the real case may have to do with long ranges and speed may be an important criterium. -- Personnaly, I presume that, the starting objects being ranges, we might think that it's quasi certain that it's required that the result is also ordered as a range. You didn't mention this point. – eyquem Jul 26 '11 at 14:16
  • 2
    Thanks for the comments and discussion. I ultimately choose this answer because of readability and the added benefit of duplicate removal with `set()`. – drbunsen Jul 26 '11 at 16:39
  • 5
    You would not want to do that to find intersection between `range(0, 21984762198598125721534)` and `range(21984762198598125721123, 45823914893546328945328456)`. There is an O(1) solution in answer by Andrew Clark, which would be the way to go. – zvone Jul 01 '20 at 10:22
16

You can use sets for that, but be aware that set(list) removes all duplicate entries from the list:

>>> x = range(1,10)
>>> y = range(8,20)
>>> list(set(x) & set(y))
[8, 9]
plaes
  • 31,788
  • 11
  • 91
  • 89
  • 1
    Well, `range`s don't contain duplicates anyway. Also, using `xrange` is possible saves some (potentially much, depending on range sizes) memory during construction. –  Jul 25 '11 at 19:30
15

The answers above seem mostly overly complex. This one liner works perfectly in Python3, takes ranges as inputs and output. It also handles illegal ranges. To get the values just iterate over the result if not None.

# return overlap range for two range objects or None if no ovelap
# does not handle step!=1
def range_intersect(r1, r2):
    return range(max(r1.start,r2.start), min(r1.stop,r2.stop)) or None
Robotbugs
  • 4,307
  • 3
  • 22
  • 30
13

If you looking for the overlap between two real-valued bounded intervals, then this is quite nice:

def overlap(start1, end1, start2, end2):
    """how much does the range (start1, end1) overlap with (start2, end2)"""
    return max(max((end2-start1), 0) - max((end2-end1), 0) - max((start2-start1), 0), 0)

I couldn't find this online anywhere so I came up with this and I'm posting here.

Yetti
  • 311
  • 2
  • 7
11

One option is to just use list comprehension like:

x = range(1,10) 
y = range(8,20) 

z = [i for i in x if i in y]
print z
TimothyAWiseman
  • 14,385
  • 12
  • 40
  • 47
  • 2
    As far as I can see xrange.__contains__ from Python 2.x doesn't have this optimization. This is sloow under python 2.7: rng = xrange(20, 1000000000); 10 in rng – Mikhail Korobov Jun 13 '12 at 10:25
  • 2
    '10 in rng' (False) is slow and '100 in rng' (True) is fast under Python 2.7; both are fast under Python 3.2. – Mikhail Korobov Jun 13 '12 at 10:32
7

This is the answer for the simple range with step=1 case (99% of the time), which can be 2500x faster as shown in the benchmark when comparing long ranges using sets (when you are just interested in knowing if there is overlap):

x = range(1,10) 
y = range(8,20)

def range_overlapping(x, y):
    if x.start == x.stop or y.start == y.stop:
        return False
    return x.start <= y.stop and y.start <= x.stop

>>> range_overlapping(x, y)
True

To find the overlapping values:

def overlap(x, y):
    if not range_overlapping(x, y):
        return set()
    return set(range(max(x.start, y.start), min(x.stop, y.stop)+1))

Visual help:

|  |           |    |
  |  |       |    |

Benchmark:

x = range(1,10)
y = range(8,20)

In [151]: %timeit set(x).intersection(y)
2.74 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [152]: %timeit range_overlapping(x, y)
1.4 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Conclusion: even for small ranges, it is twice as fast.

x = range(1,10000)
y = range(50000, 500000)

In [155]: %timeit set(x).intersection(y)
43.1 ms ± 158 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [156]: %timeit range_overlapping(x, y)
1.75 µs ± 88.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Conclusion: you want to use the range_overlapping function in this case as it is 2500x faster (my personal record in speedup)

PascalVKooten
  • 20,643
  • 17
  • 103
  • 160
  • 1
    I think in `range_overlapping`, it might be sufficient to check `x.start <= y.stop and y.start <= x.stop`. As a matter of fact, I think the two expressions you use (separated by `or`) are equivalent. – Clint Chelak Dec 22 '20 at 04:49
5

For "if x does or does not overlap y" :

for a,b,c,d in ((1,10,10,14),
                (1,10,9,14),
                (1,10,4,14),
                (1,10,4,10),
                (1,10,4,9),
                (1,10,4,7),
                (1,10,1,7),
                (1,10,-3,7),
                (1,10,-3,2),
                (1,10,-3,1),
                (1,10,-11,-5)):
    x = range(a,b)
    y = range(c,d)
    print 'x==',x
    print 'y==',y
    b = not ((x[-1]<y[0]) or (y[-1]<x[0]))
    print '    x %s y' % ("does not overlap","   OVERLAPS  ")[b]
    print

result

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [10, 11, 12, 13]
    x does not overlap y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [9, 10, 11, 12, 13]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [1, 2, 3, 4, 5, 6]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1]
    x    OVERLAPS   y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0]
    x does not overlap y

x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-11, -10, -9, -8, -7, -6]
    x does not overlap y

Edit 1

Speeds comparison:

from time import clock

x = range(-12,15)
y = range(-5,3)
te = clock()
for i in xrange(100000):
    w = set(x).intersection(y)
print '                     set(x).intersection(y)',clock()-te


te = clock()
for i in xrange(100000):
    w = range(max(x[0], y[0]), min(x[-1], y[-1])+1)
print 'range(max(x[0], y[0]), min(x[-1], y[-1])+1)',clock()-te

result

                     set(x).intersection(y) 0.951059981087
range(max(x[0], y[0]), min(x[-1], y[-1])+1) 0.377761978129

The ratio of these execution's times is 2.5

Lesmana
  • 25,663
  • 9
  • 82
  • 87
eyquem
  • 26,771
  • 7
  • 38
  • 46
1

If you want to find the overlap of ranges with arbitrary steps you can use my package https://github.com/avnr/rangeplus which provides a Range() class compatible with Python range(), plus some goodies including intersections:

>>> from rangeplus import Range
>>> Range(1, 100, 3) & Range(2, 100, 4)
Range(10, 100, 12)
>>> Range(200, -200, -7) & range(5, 80, 2)  # can intersect with Python range() too
Range(67, 4, -14)

Range() can also be unbound (when stop is None the Range goes on to +/-infinity):

>>> Range(1, None, 3) & Range(3, None, 4)
Range(7, None, 12)
>>> Range(253, None, -3) & Range(208, 310, 5)
Range(253, 207, -15)

The intersection is computed, not iterated, which makes the efficiency of the implementation independent of the length of the Range().

avnr
  • 614
  • 5
  • 12
1

This solution generates integers that are in the intersection of an arbitrary number of range objects in O(1) memory.

Disclosure: I got this from a user in Python Chat after I tried something else... less elegant.

Solution

def range_intersection(*ranges):
    ranges = set(ranges)  # `range` is hashable so we can easily eliminate duplicates
    if not ranges: return
    
    shortest_range = min(ranges, key=len)  # we will iterate over one, so choose the shortest one
    ranges.remove(shortest_range)          # note: `range` has a length, so we can use `len`
    
    for i in shortest_range:
        if all(i in range_ for range_ in ranges): yield i  # Finally, `range` implements `__contains__`
                                                           # by checking if an iteger satisfies it's simple formula

Testing

OP's problem

x = range(1,10)
y = range(8,20)
list(range_intersection(x, y))

[8, 9]

My example

limit = 10_000
list(range_intersection(
    range(2, limit, 2),
    range(3, limit, 3),
    range(5, limit, 5),
    range(41, limit, 41),
))

[1230, 2460, 3690, 4920, 6150, 7380, 8610, 9840]
piRSquared
  • 285,575
  • 57
  • 475
  • 624
0

Assuming you are working exclusively with ranges, with a step of 1, you can do it quickly with math.

def range_intersect(range_x,range_y):
    if len(range_x) == 0 or len(range_y) == 0:
        return []
    # find the endpoints
    x = (range_x[0], range_x[-1]) # from the first element to the last, inclusive
    y = (range_y[0], range_y[-1])
    # ensure min is before max
    # this can be excluded if the ranges must always be increasing
    x = tuple(sorted(x))
    y = tuple(sorted(y))
    # the range of the intersection is guaranteed to be from the maximum of the min values to the minimum of the max values, inclusive
    z = (max(x[0],y[0]),min(x[1],y[1]))
    if z[0] < z[1]:
        return range(z[0], z[1] + 1) # to make this an inclusive range
    else:
        return [] # no intersection

On a pair of ranges each with over 10^7 elements, this took under a second, independent of how many elements overlapped. I tried with 10^8 or so elements, but my computer froze for a while. I doubt you'd be working with lists that long.

Zoey Hewll
  • 4,788
  • 2
  • 20
  • 33