Algorithm to total the combined length of segments

Question

I am looking into this problem:

Let's assume we have N segments on the X-axis with different start and end points. This is described with the next data structure:

[(start1, end1),..., (startN, endN)]

Now we want to calculate the whole size of such segments, but overlaps should be not be double counted.

Example

input: [(0,3),(1,2), (6,7)]
output: 4

Because segment (1,2) overlaps with segment (0,3), the distance between 0 and 3 is just 3, and then between 6 and 7 it is 1. So 3+1 = 4.

Question

Is there an algorithm to make this calculation for large input sizes?

This question is better suited for the computer sceince site: https://cs.stackexchange.com/help/on-topic — FlyingTeller, Feb 03 '22 at 12:05
Do you know a-priori the max x coordinate? How large is large (1000s, millions,...)? What's the definition of nested. Is `(3,5)` nested in the two lines `(2,4), (4,6)` — FlyingTeller, Feb 03 '22 at 12:06
This question should not have been closed (at least not for lacking debugging details). Not all on-topic questions are about a bug. — trincot, Feb 03 '22 at 12:42
Related question: [Union of intervals](https://stackoverflow.com/questions/1034802/union-of-intervals) — Stef, Feb 03 '22 at 13:54
A solution using sympy: `from sympy import Interval, Union`; `def combined_length(lst): return Union(*starmap(Interval, lst)).measure`; `print(combined_length([(0, 3), (1, 2), (6, 7)]))` — Stef, Feb 03 '22 at 13:57
Also related: [Union of multiple ranges](https://stackoverflow.com/a/34279227/3080723) — Stef, Feb 03 '22 at 14:05

score 2 · Accepted Answer · answered Feb 03 '22 at 12:14

You can proceed as follows:

Extract both coordinates of each pair and mark whether they end a segment (flag) or not (when they start one). Sort these new pairs (x, end). Then process them. When the coordinate starts a segment, push the coordinate on a stack. When it ends a segment, pop the corresponding start coordinate from the stack. If the stack is empty add the difference between end and start coordinate to the total:

lst = [(0,3),(1,2), (6,7)]

stack = []
total = 0
for x, end in sorted((x, i) for pair in lst for i, x in enumerate(pair)):
    if end:
        start = stack.pop()
        if not len(stack):
            total += x - start
    else:
        stack.append(x)

print(total)  # 4

Algorithm to total the combined length of segments

Example

Question

1 Answers1

Linked