find nth-smallest value across m sorted arrays using idea from 2 sorted arrays

Question

May I ask whether would it be possible? the general approach would be somehow like find n-th value on two sorted array, to ignore the insignificants and try to focus on the rest by adjusting the value of n in recursion

The 2 sorted arrays problem would yield a computation time O(log(|A|)+log(|B|), while the question is similar, I would like to ask if there exist algorithm for m sorted arrays for time O(log(|A1|)+log(|A2|)+...+log(|Am|)),

or some similar variation that is near the time I mentioned above (due to the variable m, we might need some other sorting algorithm for the pivots from those arrays),

or if such algorithm doesn't exist, why?

I just can't find this algorithm from googling

You need to clarify the question. Do you mean nth-smallest, across all the m arrays? Or something else? — smci, Mar 22 '14 at 03:27
@smci: What else would he/she mean? I don't see another sensible interpretation — Niklas B., Mar 22 '14 at 03:46
@Niklas-B: many other possibilities, since he/she went on to to say **'find n-th value on two sorted array'**. So it wasn't even clear whether they were talking about 2 or m arrays, or the intermediate mergesort of 2 from the m arrays, or what. Why not help edit the question for clarity? — smci, Mar 22 '14 at 09:59
I knew I saw a similar question - so I googled for the title: http://stackoverflow.com/questions/8753345/finding-kth-smallest-number-from-n-sorted-arrays — Karoly Horvath, Mar 22 '14 at 10:02

Niklas B. · Accepted Answer · 2014-03-22T03:46:09.667

0

There is a simple randomized algorithm:

Select a pivot randomly from any of the m arrays. Let's call it x
For every array, do a binary search for x to find out how many values < x are in the array. Say we have r_i values < x in array i. We know that x has rank r = sum(i = 1 to m, r_i) in the union of all arrays.
If n <= r, we restrict each array i to the indices 0...(r_i - 1) and recurse. If n > r, we restrict each array to the indices r_i...|A_i | - 1
repeat

The expected recursion depth is O(log(N)) with N being the total number of elements, with a proof similar to that of Quickselect, so the expected running time is something like O(m * log²(N)).

The paper "Generalized Selection and Ranking" by Frederickson and Johnson proposes selection and ranking algorithms for different scenarios, for example an O(m + c * log(k/c)) algorithm to select the k-th element from m equally sized sorted sequences, with c = min{m, k}.

edited Mar 22 '14 at 03:46

answered Mar 22 '14 at 03:30

Niklas B.

92,950
18
194
224

quite impressive, but i was actually expecting some algorithm that is as efficient as shown in this topic: [Find n-th smallest value in two sorted arrays](http://stackoverflow.com/questions/4686823/given-2-sorted-arrays-of-integers-find-the-nth-largest-number-in-sublinear-time) i.e. just extend the case to m arrays, simply brainstorming it should be possible in O(log(|A1|)+log(|A2|)+...+log(|Am|)) but how should the implementation be for variable m? I would add this to my topic, thanks for your work again > – orb Mar 22 '14 at 09:32
@orb have you had a look at http://stackoverflow.com/questions/6182488/median-of-5-sorted-arrays ? – Niklas B. Mar 22 '14 at 13:21

find nth-smallest value across m sorted arrays using idea from 2 sorted arrays

1 Answers1