0

My problem is,is it possible to merge (unknown number of arrays) into a one array.so i dont know the number of arrays exactly(not like merging two array merging n array etc.) signature:

  > int [] merge(int k)//k is number of arrays to be merged into a one
    > merge them 
     //and so on..
    > 
    > return array;
rena-c
  • 325
  • 2
  • 3
  • 12
  • 1
    Cool task, and what have you tried? – Thomas Jungblut Mar 30 '12 at 09:47
  • 1
    you know the number of arrays. it is k :). Is there any rule while merging?? – UmNyobe Mar 30 '12 at 09:48
  • i am working on sorting array on a distrubited network.they splitted into k partitions,sorted at server side and merged at client side.problem is i dont know # of arrays to be merged. – rena-c Mar 30 '12 at 09:51
  • A k-way merge can be done efficiently via a priority queue (usually a binary heap) - as discussed here: http://stackoverflow.com/questions/5055909/algorithm-for-n-way-merge – Darren Engwirda Mar 30 '12 at 09:54

4 Answers4

4

Though it is possible to iteratively merge arrays, a more efficient way will be a k-way merge, which is done in O(nlogk) where k is the number of arrays and n is the total number of elements, using a priority queue.

Note: It cannot be done better then O(nlogk), because if it was possible [let's say with complexity O(g(n)) where g(n) is as asymptotically weaker then nlogn] - then we can produce the following sorting algorith:

sort(array A):
  split A into n arrays, each of size 1, B1,...,Bn
  A <- special_merge(B1,...Bn)
  return A

It is easy to see that this algorithm is of complexity O(g(n) + n) = O(g(n)), and we get a contraditcion, since we got a sort better then O(nlogn) - which is not possible with compartions based algorithms, since this problem is Omega(nlogn)

amit
  • 175,853
  • 27
  • 231
  • 333
3

If you know how to merge 2 arrays, then you can merge any number of arrays ;)

mrk
  • 3,061
  • 1
  • 29
  • 34
1

Sure, merge array 1 with array 2, merge array 1+2 with array 3, merge array 1+2+3 with array 4, continute until you have no arrays left. All you need is a method to merge 2 arrays, and a method to call this with a list of arrays until the list is empty.

High Performance Mark
  • 77,191
  • 7
  • 105
  • 161
  • That would be O(kn^2), much slower than the k-way merge suggested by amit. But then again - it looks like you need to know k to use k-way merge where as even though this method is slow you don't need to know k to implement it. – Dan Mar 30 '12 at 09:58
  • Hmmm, I think it's O(k*n*log(n)) but efficiency isn't everything: k-way merge is more efficient but I think you have to know k to start. – High Performance Mark Mar 30 '12 at 10:15
  • 1
    @Dan: There is no reason why you need to know `k` at compile time for k-way merge. Note that any algorithm will have to know `k` at run-time [or to count it, which is `O(k)`, so it is not an issue] Also, this implementation is `O(n*k)` and not `O(k*n^2)`, each merge OP is `O(n)`, and there are `k` of those. [`k` is the number of arrays, `n` is the total number of elements in all arrays combined] – amit Mar 30 '12 at 10:25
  • sorry typo I meant O(nk^2), merge(array 1, array 2) is O(n), merge(array1+2, array 3) is O(2n) etc.. so you have n + 2n + ... + (k-1)n. sum them up and you get k squared over 2. I guess what I meant about knowing k is that the k-way merge looks like you need to get all k at the same time and thus at some point you know what k is. Maybe I'm wrong about that? Where as this method doesn't require you to have any knowledge of k ever... – Dan Mar 30 '12 at 11:38
0

Yes, it is possible and for k arrays it's usually done using a priority queue of size k. The queue holds one element from each array (together with a tag for remembering which array the element came from.)

Initially the queue is filled with the minimal element from each array. Then you just iteratively remove the minimal element from the queue, and add next element to the queue, from the array the last minimal element came from.

Assuming the standard heap implementation of the pqueue, this yields O(nlog(k)) complexity.

Rafał Dowgird
  • 43,216
  • 11
  • 77
  • 90