0

May I ask whether would it be possible? the general approach would be somehow like find n-th value on two sorted array, to ignore the insignificants and try to focus on the rest by adjusting the value of n in recursion

The 2 sorted arrays problem would yield a computation time O(log(|A|)+log(|B|), while the question is similar, I would like to ask if there exist algorithm for m sorted arrays for time O(log(|A1|)+log(|A2|)+...+log(|Am|)),

or some similar variation that is near the time I mentioned above (due to the variable m, we might need some other sorting algorithm for the pivots from those arrays),

or if such algorithm doesn't exist, why?

I just can't find this algorithm from googling

orb
  • 175
  • 2
  • 13
  • You need to clarify the question. Do you mean nth-smallest, across all the m arrays? Or something else? – smci Mar 22 '14 at 03:27
  • @smci: What else would he/she mean? I don't see another sensible interpretation – Niklas B. Mar 22 '14 at 03:46
  • the abovementioned is exactly the case, thanks :p – orb Mar 22 '14 at 08:22
  • @Niklas-B: many other possibilities, since he/she went on to to say **'find n-th value on two sorted array'**. So it wasn't even clear whether they were talking about 2 or m arrays, or the intermediate mergesort of 2 from the m arrays, or what. Why not help edit the question for clarity? – smci Mar 22 '14 at 09:59
  • I knew I saw a similar question - so I googled for the title: http://stackoverflow.com/questions/8753345/finding-kth-smallest-number-from-n-sorted-arrays – Karoly Horvath Mar 22 '14 at 10:02

1 Answers1

0

There is a simple randomized algorithm:

  1. Select a pivot randomly from any of the m arrays. Let's call it x
  2. For every array, do a binary search for x to find out how many values < x are in the array. Say we have ri values < x in array i. We know that x has rank r = sum(i = 1 to m, ri) in the union of all arrays.
  3. If n <= r, we restrict each array i to the indices 0...(ri - 1) and recurse. If n > r, we restrict each array to the indices ri...|Ai | - 1
  4. repeat

The expected recursion depth is O(log(N)) with N being the total number of elements, with a proof similar to that of Quickselect, so the expected running time is something like O(m * log2(N)).

The paper "Generalized Selection and Ranking" by Frederickson and Johnson proposes selection and ranking algorithms for different scenarios, for example an O(m + c * log(k/c)) algorithm to select the k-th element from m equally sized sorted sequences, with c = min{m, k}.

Niklas B.
  • 92,950
  • 18
  • 194
  • 224
  • quite impressive, but i was actually expecting some algorithm that is as efficient as shown in this topic: [Find n-th smallest value in two sorted arrays](http://stackoverflow.com/questions/4686823/given-2-sorted-arrays-of-integers-find-the-nth-largest-number-in-sublinear-time) i.e. just extend the case to m arrays, simply brainstorming it should be possible in O(log(|A1|)+log(|A2|)+...+log(|Am|)) but how should the implementation be for variable m? I would add this to my topic, thanks for your work again > – orb Mar 22 '14 at 09:32
  • @orb have you had a look at http://stackoverflow.com/questions/6182488/median-of-5-sorted-arrays ? – Niklas B. Mar 22 '14 at 13:21