4

I have a sorted list of overlapping intervals, intervals are never contained in each other, e.g.,

[(7, 11), (9, 14), (12,  17)]

The constraint for the output is to keep every element as close as possible to its origin (the middle of the interval), preserve the order of the input, and remove all overlap. Only an approximate solution is necessary. The expected result for the example input would be:

[(5,9), (9, 14), (14, 19)]

I'm only aware of solutions that go about this in some simulation style: shift each element by some value in a free direction and iterate until all overlap has been removed.

Is there an existing algorithm to solve this?

pmr
  • 58,701
  • 10
  • 113
  • 156
  • averages is what you want, i mean, if you want it as close as possible to the original – galchen Nov 20 '11 at 23:34
  • Is the midpoint of each new interval supposed to equal the corresponding midpoint of the old one? – Kerrek SB Nov 20 '11 at 23:36
  • Can one interval be contained inside another? Is it necessary to preserve the order of the intervals in the list? – Per Nov 20 '11 at 23:45
  • @KerrekSB No, the new midpoints can be chosen freely but should be as close to the old one as possible. To paraphrase: `The average of all midpoints differences between old and new should be as small as possible.` The size of each average should remain the same. – pmr Nov 20 '11 at 23:45
  • @Per No, intervals can not be contained in each other and the order should be preserved. – pmr Nov 20 '11 at 23:46
  • Just curious.. What is the application of this? – Venkatesh Kumar Nov 21 '11 at 00:32
  • Does each consecutive pair of intervals have intersection of positive measure? If so, the algorithm is very simple: keep (a/the) center interval (center = equal or equal plus one numbers of intervals to the left and to the right) where it is, slide the intervals left of center to the left and the intervals right of center to the right. – Per Nov 21 '11 at 03:38

2 Answers2

2

find the overall average:

in our example:

(7 + 11 + 9 + 14 + 12 + 17)/6 = 11.667

find the total length:

(11-7) + (14-9) + (17-12) = 4 + 5 + 5 = 14;

find the new min/max;

14/2 = 7
11.667 - 7 = 4.667
11.667 + 7 = 18.667

you can round 'em

4.667 ~ 5
18.667 ~ 19

start from the min, creating the sections by the intervals

(5, (11-7)+5) = (5,9)
(9, (14-9)+9) = (9,14)
(14, (17-12)+14) = (14,19)

NOTE:

this method will not keep the elements as equal as possible to the originals, but will keep them as close as possible to the original considering their relative values (preserving the center)

EDIT:

if you want to keep the averages of all intervals as close as possible to the original, you can implement a mathematical solution.

our problem's input is:

a1=(a1,1, a1,2) , ... , an=(an,1,an,2)

we will define:

ai1 = a1,2-a1,1 // define the intervals

b1 = (d, d+ai1)

bn = (d + sum(ai1..ain-1), d + sum(ai1..ain) )

bi1 = b1,2-b1,1 // define the intervals

we need to find a 'd' such as:

s = sum( abs((a1,1+a1,2)/2 - (b1,1+b1,2)/2) )

min(s) is what we want

in our example:

a1 = (7,11), ai1 = 4, Aavg1 = 9

a2 = (9,14), ai2 = 5, Aavg2 = 11.5

a3 = (12,7), ai3 = 5, Aavg3 = 14.5

b1 = (d, d+4) Bavg1 = d+2

b2 = (d+4, d+9) Bavg2 = d+6.5

b3 = (d+9, d+14) Bavg3 = d+11.5

s = abs(9-(d+2)) + abs(11.5-(d+6.5)) + abs(14.5-(d+11.5)) = abs(7-d) + abs(5-d) + abs(3-d)

now calculcate the derivative to find min/max OR iterate over d to get a result. in our case you will need to iterate from 3 to 7

that should do the trick

Community
  • 1
  • 1
galchen
  • 5,252
  • 3
  • 29
  • 43
0

Given that the solution must be order-preserving, we can formulate this problem as a linear program. Let [ai, bi] be the ith interval. Let variables xi be the left shift of the ith interval and yi be the right shift of the ith interval.

minimize sumi (xi + yi)
subject to
(*) for all i: bi - xi + yi ≤ ai+1 - xi+1 + yi+1
for all i: xi, yi ≥ 0

Rewrite constraint (*) by introducing a variable zi.

for all i: xi - yi - xi+1 + yi+1 - zi = 0
for all i: zi ≥ bi - ai+1

Now the problem is reduced to computing a minimum-cost circulation, which can be done in poly-time. I have a feeling, however, that there's a more direct solution to this problem.

The graph looks something like

        (*)
    ---- | ----
   /    z|     \
  /     i|      \
 /   xi  |  xi+1 \
|/ <---- v <---- \|
...     (*)     ...
   ---->   ---->
    yi     yi+1
Per
  • 2,594
  • 12
  • 18