Since the input arrays are sorted, this should be fairly straightforward to work out. I'm assuming that the ranges in any one input array don't intersect one another (otherwise, "which are sorted" would be ambiguous). Consider one range from each array (defined by "current range" indexes a
and b
). There are several cases (each case other than "full overlap" has a mirror image where A
and B
are reversed):
No intersection:
A[a]: |------|
B[b]: |---|
Because the arrays are sorted, A[a]
cannot intersect anything in B
, so it can be skipped (increment a
).
Partial overlap (B[b]
extends beyond A[a]
):
A[a]: |-------|
B[b]: |-------|
In this case, add the intersection to the output and then increment a
because A[a]
cannot intersect anything else in B
.
Containment (possibly with coinciding ends):
A[a]: |------|
B[b]: |--|
Again add the intersection to the output and this time increment b
. Note that a further slight optimization is that if A[a]
and B[b]
end at the same value, then you can increment b
as well, since B[b]
also cannot intersect anything else in A
. (The case of coinciding ends could have been lumped into the partial overlap case. This case could then have been called "strict containment".)
Full overlap:
A[a]: |------|
B[b]: |------|
Add the intersection to the output and increment both a
and b
(neither range can intersect anything else in the other array).
Continue iterating the above until either a
or b
runs off the end of the corresponding array and you're done.
It should be trivial straightforward to translate the above to code.
EDIT: To back up that last sentence (okay, it wasn't trivial), here's my version of the above in code. It's a little tedious because of all the cases, but each branch is quite straightforward.
const A = [[1, 3], [7, 9], [12, 18]];
const B = [[2, 3], [4, 5], [6, 8], [13, 14], [16, 17]];
const merged = [];
var i_a = 0,
i_b = 0;
while (i_a < A.length && i_b < B.length) {
const a = A[i_a];
const b = B[i_b];
if (a[0] < b[0]) {
// a leads b
if (a[1] >= b[1]) {
// b contained in a
merged.push([b[0], b[1]]);
i_b++;
if (a[1] === b[1]) {
// a and b end together
i_a++;
}
} else if (a[1] >= b[0]) {
// overlap
merged.push([b[0], a[1]]);
i_a++;
} else {
// no overlap
i_a++;
}
} else if (a[0] === b[0]) {
// a and b start together
if (a[1] > b[1]) {
// b contained in a
merged.push([a[0], b[1]]);
i_b++;
} else if (a[1] === b[1]) {
// full overlap
merged.push([a[0], a[1]]);
i_a++;
i_b++;
} else /* a[1] < b[1] */ {
// a contained in b
merged.push([a[0], a[1]]);
i_a++;
}
} else /* a[0] > b[0] */ {
// b leads a
if (b[1] >= a[1]) {
// containment: a in b
merged.push([a[0], b[1]]);
i_a++;
if (b[1] === a[1]) {
// a and b end together
i_b++;
}
} else if (b[1] >= a[0]) {
// overlap
merged.push([a[0], b[1]]);
i_b++
} else {
// no overlap
i_b++;
}
}
}
console.log(JSON.stringify(merged));
You asked for an optimal algorithm. I believe mine is very close to optimal. It runs in linear time with the number of ranges in the two arrays, since each iteration completes the processing of at least one range (and sometimes two). It requires constant memory plus the memory required to build the result.
I should note that unlike the answer by CertainPerformance (the only other answer posted here at the time I'm writing this) my code works for any kind of numeric range data, not just integers. (You might want to replace ===
with ==
in the above if you're mixing numbers and string representations of numbers). The algorithm by CertainPerformance flattens the ranges into arrays of consecutive integers that span the ranges. If that total number of integers is n, then his algorithm runs in O(n2) time and O(n) space. (So, for instance, if one of the ranges were [1, 50000], that would require memory for 50,000 numbers and time proportional to the square of that.)