How to intersect two arrays of ranges?

Question

Let a range be an array of two integers: the start and the end (e.g. [40, 42]).

Given two sorted arrays of ranges, I want to find the optimal way to calculate their intersection (which will result in another array of ranges):

A = [[1, 3], [7, 9], [12, 18]]
B = [[2, 3], [4,5], [6,8], [13, 14], [16, 17]]

Intersection:

[[2, 3], [7, 8], [13, 14], [16, 17]]

What is the optimal algorithm for this?

The naive way would be to check each one with all the other ones, but that's obviously not optimal.

I found a similar question asking for the same thing in VBA: Intersection of two arrays of ranges

are all the ranges sorted and distinct? Could the array 1 be `[[4, 6], [1,2]]`? or `[[1, 5], [4,6]]`? — Liora Haydont, Mar 23 '18 at 16:06
Assuming arrays are sorted, this *seems* fairly trivial. Am I missing something? — Amit, Mar 23 '18 at 16:07
@PraveenKumar: the input arrays are just that: input, a given to the program. The intersection is also just that: the intersection of the sets of ranges, values that appear in both `A` and `B`'s ranges. — Prune, Mar 23 '18 at 16:07
@LioraHaydont No that can't be. They are normalized already. First would be `[[1, 2], [4, 6]]` and the second would be just `[[1, 6]]`. — Ionică Bizău, Mar 23 '18 at 16:08
Where are you unclear on the implementation? You cited another question that gives you the solution. The problem is solved in many postings you can find with a browser search. — Prune, Mar 23 '18 at 16:09
@Prune To me, the solution in VBA seems to be quite specific to the language: for instance, I don't know what `array1(lbound1)` is etc (I'm not a VB developer...). — Ionică Bizău, Mar 23 '18 at 16:10
I'm not sure why this is voted to be closed as too broad while the other question which is poorly formatted, but still very similar, is left open. And I don't know why this is *too broad* at all... — Ionică Bizău, Mar 23 '18 at 16:20
Duplicate of https://stackoverflow.com/questions/3269434/whats-the-most-efficient-way-to-test-two-integer-ranges-for-overlap — Salman A, Mar 23 '18 at 16:47
I don't believe that question answers it entirely. I spent some time puzzling with your problem. [Here's my solution](https://jsfiddle.net/fhaqeqkL/34/) :) — Exfridos, Mar 23 '18 at 17:06
Well, I would write you a nice answer if this wasn't on hold... but google "Sweep line algorithm" for a good approach to lots of problems like this one. — Matt Timmermans, Mar 23 '18 at 17:16

Ted Hopp · Accepted Answer · 2018-03-23T20:02:09.250

Since the input arrays are sorted, this should be fairly straightforward to work out. I'm assuming that the ranges in any one input array don't intersect one another (otherwise, "which are sorted" would be ambiguous). Consider one range from each array (defined by "current range" indexes a and b). There are several cases (each case other than "full overlap" has a mirror image where A and B are reversed):

No intersection:

A[a]: |------|
B[b]:          |---|

Because the arrays are sorted, A[a] cannot intersect B[b] or anything after it in B, so it can be skipped (increment a).

Partial overlap (B[b] extends beyond A[a]):

A[a]: |-------|
B[b]:      |-------|

In this case, add the intersection to the output and then increment a because A[a] cannot intersect anything else in B.

Containment (possibly with coinciding ends):

A[a]: |------|
B[b]:   |--|

Again add the intersection to the output and this time increment b. Note that a further slight optimization is that if A[a] and B[b] end at the same value, then you can increment a as well, since A[a] also cannot intersect anything else in B. (The case of coinciding ends could have been lumped into the partial overlap case. This case could then have been called "strict containment".)

Full overlap:

A[a]: |------|
B[b]: |------|

Add the intersection to the output and increment both a and b (neither range can intersect anything else in the other array).

Continue iterating the above until either a or b runs off the end of the corresponding array and you're done.

It should be ~~trivial~~ straightforward to translate the above to code.

EDIT: To back up that last sentence (okay, it wasn't trivial), here's my version of the above in code. It's a little tedious because of all the cases, but each branch is quite straightforward.

const A = [[1, 3], [7, 9], [12, 18]];
const B = [[2, 3], [4, 5], [6, 8], [13, 14], [16, 17]];

const merged = [];

var i_a = 0,
    i_b = 0;

while (i_a < A.length && i_b < B.length) {
  const a = A[i_a];
  const b = B[i_b];

  if (a[0] < b[0]) {
    // a leads b
    if (a[1] >= b[1]) {
      // b contained in a
      merged.push([b[0], b[1]]);
      i_b++;
      if (a[1] === b[1]) {
        // a and b end together
        i_a++;
      }
    } else if (a[1] >= b[0]) {
      // overlap
      merged.push([b[0], a[1]]);
      i_a++;
    } else {
      // no overlap
      i_a++;
    }
  } else if (a[0] === b[0]) {
    // a and b start together
    if (a[1] > b[1]) {
      // b contained in a
      merged.push([a[0], b[1]]);
      i_b++;
    } else if (a[1] === b[1]) {
      // full overlap
      merged.push([a[0], a[1]]);
      i_a++;
      i_b++;
    } else /* a[1] < b[1] */ {
      // a contained in b
      merged.push([a[0], a[1]]);
      i_a++;
    }
  } else /* a[0] > b[0] */ {
    // b leads a
    if (b[1] >= a[1]) {
      // containment: a in b
      merged.push([a[0], a[1]]);
      i_a++;
      if (b[1] === a[1]) {
        // a and b end together
        i_b++;
      }
    } else if (b[1] >= a[0]) {
      // overlap
      merged.push([a[0], b[1]]);
      i_b++;
    } else {
      // no overlap
      i_b++;
    }
  }
}
console.log(JSON.stringify(merged));

You asked for an optimal algorithm. I believe mine is very close to optimal. It runs in time linear in the total number of ranges in the two arrays, since each iteration completes the processing of at least one range (and sometimes two). It requires constant memory plus the memory required to build the result.

I should note that unlike the answer by CertainPerformance (the only other answer posted here at the time I'm writing this) my code works for any kind of numeric range data, not just integers. (You might want to replace === with == in the above if you're mixing numbers and string representations of numbers). The algorithm by CertainPerformance flattens the ranges into arrays of consecutive integers that span the ranges. If that total number of integers is n, then his algorithm runs in O(n²) time and O(n) space. (So, for instance, if one of the ranges were [1, 50000], that would require memory for 50,000 numbers and time proportional to the square of that.)
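
To make the size of n concrete, here is a minimal sketch that counts how many integers a set of inclusive ranges expands to when flattened. The helper name `countSpannedIntegers` is hypothetical and not part of either answer; it assumes integer endpoints with start <= end.

```javascript
// Hypothetical helper: counts how many integers a list of inclusive
// [start, end] ranges would expand to if flattened element by element.
function countSpannedIntegers(ranges) {
  return ranges.reduce((total, [start, end]) => total + (end - start + 1), 0);
}

const sampleA = [[1, 3], [7, 9], [12, 18]];
console.log(countSpannedIntegers(sampleA));      // 3 + 3 + 7 = 13
console.log(countSpannedIntegers([[1, 50000]])); // 50000
```

The two-pointer loop above never materializes these integers, so its cost depends only on the number of ranges (three for `sampleA`, one for `[[1, 50000]]`).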

Salman A · Answer 2 · 2018-03-24T14:05:44.943

The idea suggested by @Ted Hopp could be implemented in fewer lines of code as follows:

var A = [[1, 3], [7, 9], [12, 18]];
var B = [[2, 3], [4, 5], [6, 8], [13, 14], [16, 17]];

var result = [];
var ai = 0, alength = A.length, ax, ay;
var bi = 0, blength = B.length, bx, by;
while (ai < alength && bi < blength) {
  ax = A[ai][0];
  ay = A[ai][1];
  bx = B[bi][0];
  by = B[bi][1];
  if (ay < bx) {
    // a ends before b
    ai++;
  } else if (by < ax) {
    // b ends before a
    bi++;
  } else {
    // a overlaps b
    result.push([ax > bx ? ax : bx, ay < by ? ay : by]);
    // the smaller range is considered processed
    if (ay < by) {
      ai++;
    } else {
      bi++;
    }
  }
}
console.log(result);
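
For reuse, the same loop could be wrapped in a function. This is only a sketch: the name `intersectRanges` is hypothetical, `Math.max`/`Math.min` stand in for the ternaries, and it assumes each input is sorted with no overlapping ranges inside a single array.

```javascript
// Hypothetical wrapper around the two-pointer loop. Assumes each input is
// sorted and that ranges within one input do not overlap each other.
function intersectRanges(A, B) {
  const result = [];
  let ai = 0;
  let bi = 0;
  while (ai < A.length && bi < B.length) {
    const [ax, ay] = A[ai];
    const [bx, by] = B[bi];
    if (ay < bx) {
      // a ends before b starts
      ai++;
    } else if (by < ax) {
      // b ends before a starts
      bi++;
    } else {
      // overlap: latest start to earliest end
      result.push([Math.max(ax, bx), Math.min(ay, by)]);
      // advance past whichever range ends first (b on a tie)
      if (ay < by) {
        ai++;
      } else {
        bi++;
      }
    }
  }
  return result;
}

console.log(JSON.stringify(intersectRanges(
  [[1, 3], [7, 9], [12, 18]],
  [[2, 3], [4, 5], [6, 8], [13, 14], [16, 17]]
)));
// [[2,3],[7,8],[13,14],[16,17]]
```

Because it only compares endpoints, the same function should also work for non-integer ranges, e.g. `intersectRanges([[0.5, 2.25]], [[1.75, 3]])` gives `[[1.75, 2.25]]`.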

Below is a randomized test that checks the result against a brute-force intersection of two arrays of 1000 ranges each:

var A = [];
var B = [];
var R = [];
(function(rangeArray1, rangeArray2, bruteForceResult) {
  // create random, non-overlapping, sorted ranges
  var i, n, x, y;
  for (i = 0, n = 0; i < 1000; i++) {
    x = n += Math.floor(Math.random() * 100) + 1;
    y = n += Math.floor(Math.random() * 100);
    rangeArray1.push([x, y]);
  }
  for (i = 0, n = 0; i < 1000; i++) {
    x = n += Math.floor(Math.random() * 100) + 1;
    y = n += Math.floor(Math.random() * 100);
    rangeArray2.push([x, y]);
  }
  // calculate intersections using brute force
  rangeArray1.forEach(function(a) {
    rangeArray2.forEach(function(b) {
      if (b[1] >= a[0] && a[1] >= b[0]) {
        bruteForceResult.push([Math.max(a[0], b[0]), Math.min(a[1], b[1])]);
      }
    });
  });
})(A, B, R);

var result = [];
var ai = 0, alength = A.length, ax, ay;
var bi = 0, blength = B.length, bx, by;
while (ai < alength && bi < blength) {
  ax = A[ai][0];
  ay = A[ai][1];
  bx = B[bi][0];
  by = B[bi][1];
  if (ay < bx) {
    // a ends before b
    ai++;
  } else if (by < ax) {
    // b ends before a
    bi++;
  } else {
    // a overlaps b
    result.push([ax > bx ? ax : bx, ay < by ? ay : by]);
    // the smaller range is considered processed
    if (ay < by) {
      ai++;
    } else {
      bi++;
    }
  }
}
console.log(JSON.stringify(R) === JSON.stringify(result) ? "test passed" : "test failed");

I like this solution. It shows what a little creativity can do to simplify a bookkeeping problem. — Ted Hopp, Mar 25 '18 at 00:24

CertainPerformance · Answer 3 · 2018-03-23T16:42:07.910

Pretty straightforward, just a decent amount of code to write. Flatten a and b into individual elements instead of ranges, find their intersection, and turn it back into an array of ranges again.

const a = [[1, 3], [7, 9], [12, 18]];
const b = [[2, 3], [4,5], [6,8], [13, 14], [16, 17]];
const rangeToArr = ([start, end]) => Array.from({ length: end - start + 1 }, (_, i) => start + i);
const flat = inputArr => inputArr.reduce((arr, elm) => arr.concat(...elm), []);
const aRange = flat(a.map(rangeToArr));
const bRange = flat(b.map(rangeToArr));
const intersection = aRange.filter(num => bRange.includes(num));
console.log(intersection);


// Have the intersection of elements
// now we have to turn the intersection back into an array of ranges again:
const { partialIntersectionRange, thisRangeStarted, lastNum }
= intersection.reduce(({ partialIntersectionRange, thisRangeStarted, lastNum }, num) => {
  // Initial iteration only: populate with initial values
  if (typeof thisRangeStarted !== 'number') {
    return { partialIntersectionRange, thisRangeStarted: num, lastNum: num };
  }
  // If this element is a continuation of the range from the last element
  // then just increment lastNum:
  if (lastNum + 1 === num) {
    return { partialIntersectionRange, thisRangeStarted, lastNum: num };
  }
  // This element is not a continuation of the previous range
  // so make a range out of [thisRangeStarted, lastNum] and push it to the range array
  // (in case thisRangeStarted === lastNum, only push a single value)
  if (thisRangeStarted !== lastNum) partialIntersectionRange.push([thisRangeStarted, lastNum]);
  else partialIntersectionRange.push([thisRangeStarted]);
  return { partialIntersectionRange, thisRangeStarted: num, lastNum: num };
}, { partialIntersectionRange: [] });
if (thisRangeStarted !== lastNum) partialIntersectionRange.push([thisRangeStarted, lastNum]);
else partialIntersectionRange.push([thisRangeStarted]);

console.log(JSON.stringify(partialIntersectionRange));

The difficulty isn't the intersection logic, but getting it formatted the desired way.
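
As a possible variation on the flattening step (not part of the original answer), looking values up in a `Set` instead of calling `includes` avoids the quadratic scan. It is a sketch only: `flattenRanges` is a hypothetical helper name, and the approach still allocates one entry per integer spanned, so it remains unsuitable for very wide ranges.

```javascript
const a = [[1, 3], [7, 9], [12, 18]];
const b = [[2, 3], [4, 5], [6, 8], [13, 14], [16, 17]];

// Expand one inclusive integer range into its individual elements.
const rangeToArr = ([start, end]) =>
  Array.from({ length: end - start + 1 }, (_, i) => start + i);

// Hypothetical helper: flatten a list of ranges into one array of integers.
const flattenRanges = ranges => ranges.flatMap(rangeToArr);

// Set membership checks replace the linear scan done by includes().
const bValues = new Set(flattenRanges(b));
const intersection = flattenRanges(a).filter(num => bValues.has(num));
console.log(JSON.stringify(intersection));
// [2,3,7,8,13,14,16,17]
```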

Could you add some comments to say what your code is doing? It's hard to read a block like this — Liora Haydont, Mar 23 '18 at 16:35
This is going to be pretty inefficient if the ranges contain huge numbers (e.g., `[5000030, 1470000250]`). Also, because it relies on `includes`, this seems like an O(n^2) algorithm, where n is the total number of integers spanned by all the ranges, not the number of ranges. — Ted Hopp, Mar 23 '18 at 16:39
