2

What is the optimal algorithm for selecting top n elements from multiple arrays, provided each array is sorted the same way in which the resultant array should be.

Reading elements is very expensive and therefore the number of reads should be an absolute minimum.

Bertram Gilfoyle
  • 9,899
  • 6
  • 42
  • 67
  • Possible duplicate of [What is the most efficient way to ge the top N items of the union of M sorted sets](https://stackoverflow.com/questions/24138641/what-is-the-most-efficient-way-to-ge-the-top-n-items-of-the-union-of-m-sorted-se) – juvian Aug 31 '18 at 17:48

1 Answers1

4

Put tuples (current_element, array_number, current_index=0) into priority queue (for example, based on binary max-heap), ordered by element value

Then remove top of the queue n times.

After removing increment index in corresponding array (if possible), get the next element and insert updated tuple into queue again

MBo
  • 77,366
  • 5
  • 53
  • 86
  • I was just typing up this answer, and you beat me to it. However I'll add that you need to make it a max-heap. In Python you'd do that by putting `current_element` first in the tuple, then following the advice in https://stackoverflow.com/questions/2501457/what-do-i-use-for-a-max-heap-implementation-in-python to get a max-heap. – btilly Aug 31 '18 at 17:52
  • @btilly If you think your answer is relevant, please do not hesitate to post it. – Bertram Gilfoyle Aug 31 '18 at 17:54
  • 1
    @Anees It is the same exact strategy, so this answer makes it redundant. – btilly Aug 31 '18 at 22:13
  • For python: `heapq.merge(*sorted_arrays, key=lambda x: x, reverse=False)[:n]` – BorjaEst Dec 14 '22 at 12:00