Sort the data array, then for each interval find the first and last indices in data that fall in that range (both using binary search). The number of elements in the interval is then easily computed as lastIdx - firstIdx (or with a +1, depending on whether lastIdx is inclusive or not).
This is done in O(m log m + n log m), where m is the number of elements in data and n is the number of intervals.
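A minimal sketch of this approach in Python, using bisect for the two binary searches (the function name count_in_intervals is just illustrative):

```python
import bisect

def count_in_intervals(data, intervals):
    """For each [lo, hi] interval (inclusive), count the elements of data inside it."""
    data = sorted(data)                        # O(m log m)
    counts = []
    for lo, hi in intervals:                   # n intervals, O(log m) each
        first = bisect.bisect_left(data, lo)   # first index with data[first] >= lo
        last = bisect.bisect_right(data, hi)   # first index with data[last] > hi (exclusive)
        counts.append(last - first)            # lastIdx is exclusive here, so no +1
    return counts

print(count_in_intervals([5, 1, 3, 8, 2], [(1, 3), (4, 9)]))  # [3, 2]
```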
Bonus: If data changes constantly, you can use an order statistics tree with the same approach, since such a tree lets you find the rank (index) of each element efficiently and supports modifying the data. A sketch of this dynamic variant follows.
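Python has no built-in order statistics tree, so this sketch uses the third-party sortedcontainers library as a stand-in: its SortedList supports logarithmic insertion, deletion, and rank queries, which is all the approach needs.

```python
from sortedcontainers import SortedList  # pip install sortedcontainers

class DynamicIntervalCounter:
    def __init__(self, data):
        self.tree = SortedList(data)     # plays the role of the order statistics tree

    def add(self, x):
        self.tree.add(x)                 # O(log m) insertion

    def remove(self, x):
        self.tree.remove(x)              # O(log m) deletion

    def count(self, lo, hi):
        # Rank queries via binary search, exactly as in the static version.
        return self.tree.bisect_right(hi) - self.tree.bisect_left(lo)

dc = DynamicIntervalCounter([5, 1, 3, 8, 2])
print(dc.count(1, 3))  # 3
dc.remove(2)
print(dc.count(1, 3))  # 2
```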
Bonus 2: Optimality proof

Using comparison-based algorithms, this cannot be done better: if it could, we could also solve the element distinctness problem faster than its known lower bound.
Element Distinctness Problem: given an array a1, a2, ..., an, determine whether there exist indices i ≠ j such that ai = aj. This problem is known to have an Ω(n log n) lower bound for comparison-based algorithms.
Reduction: given an instance a1, ..., an of the element distinctness problem, create data = a1, ..., an and the intervals [a1, a1], [a2, a2], ..., [an, an], and run the algorithm. Each point interval [ai, ai] matches ai itself, so the total number of matches is at least n; if it is more than n, there is a duplicate, otherwise there is none.
The complexity of the above reduction is O(n + f(n)), where n is the number of elements and f(n) is the complexity of this algorithm. Since element distinctness requires Ω(n log n), n + f(n) must be Ω(n log n), so f(n) must be as well, and we can conclude there is no more efficient algorithm.
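A sketch of the reduction, reusing the illustrative count_in_intervals from the first sketch above:

```python
def has_duplicates(a):
    # Each point interval [ai, ai] matches ai itself at least once,
    # so the total exceeds len(a) exactly when some value repeats.
    intervals = [(x, x) for x in a]
    return sum(count_in_intervals(a, intervals)) > len(a)

print(has_duplicates([1, 2, 3]))     # False
print(has_duplicates([1, 2, 2, 3]))  # True
```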