1

I have a list of tuples, which I am using to mark the lower and upper bounds of ranges. For example:

[(3,10), (4,11), (2,6), (8,11), (9,11)] # 5 separate ranges.

I want to find the ranges where three or more of the input ranges overlap. For instance the tuples listed above would return:

[(4,6), (8,11)]

I tried the method provided by @WolframH in answer to this post

But I don't know what I can do to:

  • Give me more than one output range

  • Set a threshold of at least three range overlaps to qualify an output

Community
  • 1
  • 1
joeqesi
  • 99
  • 8
  • 1
    Shouldn't that second tuple be `(8,11)`? Also, will the ranges always be bounded by integers? – Brionius Aug 28 '14 at 11:44
  • @Brionius yes it should. I'll just edit that now. – joeqesi Aug 28 '14 at 12:26
  • I think it should be (9,11) range(8,11) = [8,9,10] and this is only included in range(4,11) and range(8,11) – dnalow Aug 28 '14 at 12:26
  • As far as I can tell from 8-9 you get overlap from (3,10), (4,11), and (8,11). And then from 9-10 you get overlap from (4,11), (8,11) and (9,11). This covers 8, 9 and 10 giving a range of (8,11). – joeqesi Aug 28 '14 at 12:34
  • Ok then i misinterpreted your question. – dnalow Aug 28 '14 at 12:55

2 Answers2

1

You first have to find all combinations of ranges. Then you can transform them to sets and calculate the intersection:

import itertools


limits = [(3,10), (4,11), (2,6), (8,11), (9,11)]
ranges = [range(*lim) for lim in  limits]

results = []
for comb in itertools.combinations(ranges,3):
    intersection = set(comb[0]).intersection(comb[1])
    intersection = intersection.intersection(comb[2])
    if intersection and intersection not in results and\
       not any(map(intersection.issubset, results)):
        results = filter(lambda res: not intersection.issuperset(res),results)
        results.append(intersection)


result_limits =  [(res[0], res[-1]+1) for res in map(list,results)]

It should give you all 3-wise intersections

dnalow
  • 974
  • 4
  • 14
  • This comes close - it gives me `[set([4, 5]), set([8, 9]), set([9]), set([9, 10])]`. There needs to be some work to combine overlaps, and there's a fence vs. fenceposts problem. – Brionius Aug 28 '14 at 12:13
  • I see your point about the 3-wise intersections, but really it should be `[set([4, 6]), set([8, 9]), set([9]), set([9, 11])]` You're missing the end integer of the ranges. – Brionius Aug 28 '14 at 12:18
  • 1
    ok, one would have to add the following line: `result = [(res[0], res[-1]+1) for res in map(list,results)]` – dnalow Aug 28 '14 at 12:22
  • @dnalow this seems to work nicely but when I try to combine the overlaps with your list comprehension I get [(4, 6), (8, 10), (9, 10), (9, 11)] as the output. Is there any way to combine these ranges further? – joeqesi Aug 28 '14 at 12:36
  • OK i will add 2 lines to avoid the subsets – dnalow Aug 28 '14 at 12:41
0

You can, of course, solve this by brute-force checking all the combinations if you want. If you need this algorithm to scale, though, you can do it in (pseudo) nlogn. You can technically come up with a degenerate worst-case that's O(n**2), but whatchagonnado.

Basically, you sort the ranges, then for a given range you look to its immediate left to see that the bounds overlap, and if so you then look right to mark overlapping intervals as results. Pseudocode (which is actually almost valid python, look at that):

ranges.sort()

for left_range, current_range, right_range in sliding_window(ranges, 3):
  if left_range.right < current_range.left:
    continue
  while right_range.left < min(left_range.right, current_range.right):
      results.append(overlap(left_range, right_range))
      right_range = right_range.next
  #Before moving on to the next node, extend the current_range's right bound
  #to be the longer of (left_range.right, current_range.right)
  #This makes sense if you think about it.
  current_range.right = max(left_range.right, current_range.right)

merge_overlapping(results)

(you also need to merge some possibly-overlapping ranges at the end, this is another nlogn operation - though n will usually be much smaller there. I won't discuss the code for that, but it's similar in approach to the above, involving a sort-then-merge. See here for an example.)

Community
  • 1
  • 1
roippi
  • 25,533
  • 4
  • 48
  • 73