First, sort the intervals within each passage to make sure they are in order of start time, and merge any overlapping intervals within each passage.
passage1 = (566,579),(573,583.33)
passage2 = (574,579.21),(614,620)
passage3 = (566,579),(570,590)
passage4 = (572,575),(577,620)
def merge_overlapping_intervals(intervals):
intervals = iter(sorted(intervals))
start, end = next(intervals)
for next_start, next_end in intervals:
if next_start <= end <= next_end:
end = next_end
else:
yield start, end
start = next_start
end = next_end
yield start, end
print(list(merge_overlapping_intervals(passage1))) # 566, 583.33
Then, find all overlaps between intervals in passage 1 with intervals in passage 2:
def interval_overlaps(intervals1, intervals2):
""" Yield all overlaps of intervals in sequence intervals1 with intervals in sequence intervals2 """
for start1, end1 in merge_overlapping_intervals(intervals1):
for start2, end2 in merge_overlapping_intervals(intervals2):
if end1 > start2 and end2 > start1:
yield max(start1, start2), min(end1, end2)
print(list(interval_overlaps(passage1, passage2))) # [(574, 579.21)]
print(list(interval_overlaps(passage3, passage4))) # [(572, 575), (577, 590)]
Equivalently, you could identify all overlaps first, and then sort them and merge any overlapping overlaps.
Because of the nested loop, the time-efficiency of this is O(nm) where n and m are the number of segments in each passage. A more time-efficient algorithm for identifying overlaps should be possible, given that the passages are sorted, but would require a bit more thought to write, and I imagine may not be needed.